Human Language, Animal Talk, and Speech Tech: Lecture Highlights

Source: YouTube video by The Royal Institution (video ID: 5aSfMh9ittI)

Language serves as a powerful tool for conveying precise meaning, yet a single sentence can carry multiple meanings depending on emphasis and context. It extends beyond spoken words to include writing, signing, and coded signals such as Morse code. As one speaker put it, “We share the content of our minds, our brains, whenever we want to speak or rap to anyone.”

Animal Communication and Cognition

Songbirds

Songbirds such as zebra finches and canaries produce structured sounds with rhythm and pitch. Some can learn over 1,000 different songs, but they do not rearrange song elements to create new meanings.

Parrots

Parrots can learn human words and use them with appropriate timing and emotional context, though they primarily mimic to show off or gain attention. An Amazon parrot may use up to 80 distinct sounds or words.

Dogs

Dogs can associate names with objects; a dog named Gable recognizes more than 150 toy names. Brain‑imaging studies show activation patterns similar to those in humans when dogs process spoken words.

Chimpanzees

Chimpanzees employ 60–70 distinct gestures to communicate intent and sometimes combine gestures into sequences. These gestures lack the complex grammatical structure of human language. Researchers determine meaning by observing when a signaler stops gesturing after the recipient responds, indicating the goal has been achieved.

Brain Mechanisms of Speech

Humans and songbirds share similarities in the genes and brain areas that control vocal learning and production. The left inferior frontal gyrus is critical for planning and controlling human speech. Transcranial magnetic stimulation (TMS) applied to this region can temporarily disrupt speech, illustrating its central role. As a speaker noted, “It’s not perhaps just about the raw processing power of the brain, maybe we need to think about the operating system as well.”

Decoding and Processing

Human Speech Recognition

Continuous speech contains no physical gaps between words, so listeners rely on context and prior knowledge to “hear” the gaps. As one speaker observed, “If you don't understand the words, you don't hear the gaps.”

Computational Speech Processing

Computers break sound streams into small chunks, compare them against a library of speech sounds, and use large probabilistic language databases to guess word boundaries. This approach underlies modern speech‑recognition systems.
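
To make the word‑boundary idea concrete, here is a minimal Python sketch of probabilistic segmentation: a toy unigram lexicon (all words and frequencies invented for illustration) scores candidate splits of a gapless symbol stream, and a Viterbi‑style search keeps the most probable one. Real systems operate on acoustic features and vastly larger language models, but the principle is the same.

```python
import math

# Hypothetical unigram "language database": word -> relative frequency.
# These words and numbers are invented for illustration only.
LEXICON = {
    "the": 0.35, "a": 0.20, "dog": 0.10, "cat": 0.10,
    "do": 0.08, "sat": 0.07, "chased": 0.05, "on": 0.05,
}

def segment(stream: str):
    """Viterbi-style search for the most probable word segmentation."""
    n = len(stream)
    # best[i] = (log-probability, word list) for the prefix stream[:i]
    best = [(-math.inf, [])] * (n + 1)
    best[0] = (0.0, [])
    for i in range(1, n + 1):
        for j in range(max(0, i - 10), i):      # cap word length at 10
            word = stream[j:i]
            prob = LEXICON.get(word)
            if prob and best[j][0] > -math.inf:
                score = best[j][0] + math.log(prob)
                if score > best[i][0]:
                    best[i] = (score, best[j][1] + [word])
    return best[n][1]

# A gapless stream, like continuous speech with no pauses between words:
print(segment("thedogchasedthecat"))  # -> ['the', 'dog', 'chased', 'the', 'cat']
```

Just as the lecture describes for human listeners, the program only “hears” the gaps because it already knows the words: remove a word from the lexicon and the corresponding boundary disappears.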

Role of Intonation and Emotion

Intonation—variations in pitch, speed, and melody—is processed largely by the right hemisphere and adds a secondary layer of meaning that clarifies, emphasizes, or conveys emotion. Modern digital assistants such as Ollie are beginning to integrate both semantic (word) and acoustic (emotional) data, improving human‑computer interaction.
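
As a rough illustration of the acoustic side, the sketch below estimates a pitch contour with plain NumPy autocorrelation and checks whether it rises, the pattern of a question‑like utterance. The sample rate, frame sizes, and pitch range are illustrative assumptions; production assistants use far more robust pitch trackers.

```python
import numpy as np

SR = 16000  # sample rate in Hz (assumed for this example)

def pitch_contour(signal, frame_len=800, hop=400, fmin=75, fmax=400):
    """Estimate fundamental frequency per frame via the autocorrelation peak."""
    contour = []
    for start in range(0, len(signal) - frame_len, hop):
        frame = signal[start:start + frame_len]
        frame = frame - frame.mean()
        # Autocorrelation; index frame_len - 1 is zero lag.
        ac = np.correlate(frame, frame, mode="full")[frame_len - 1:]
        lo, hi = SR // fmax, SR // fmin   # plausible pitch periods, in samples
        lag = lo + int(np.argmax(ac[lo:hi]))
        contour.append(SR / lag)          # period in samples -> frequency in Hz
    return np.array(contour)

# Synthetic "question-like" utterance: a tone whose pitch rises 120 -> 220 Hz.
t = np.linspace(0, 1.0, SR, endpoint=False)
f0 = np.linspace(120, 220, SR)
rising = np.sin(2 * np.pi * np.cumsum(f0) / SR)

contour = pitch_contour(rising)
print("rising intonation?", contour[-1] > contour[0])  # True
```

A rising contour is one of the acoustic cues a system can combine with the recognized words to separate, say, a question from a statement.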

Takeaways

• Human language conveys precise meaning, yet a single sentence can hold multiple meanings depending on emphasis and context; language also extends beyond speech to writing, signing, and coded signals.
• The left inferior frontal gyrus is essential for planning and controlling speech; transcranial magnetic stimulation applied to this region can temporarily disrupt speech, and humans share vocal‑learning genes and brain areas with songbirds.
• Human speech lacks clear gaps between words, so listeners infer word boundaries from context, while computers segment audio and apply probabilistic models to guess words, making speech recognition a challenging problem.
• Intonation, processed mainly by the right hemisphere, adds emotional nuance to words, and digital assistants like Ollie are beginning to combine semantic and acoustic cues to better understand human emotion.

Frequently Asked Questions

How does intonation influence speech comprehension?

Intonation provides a secondary layer that clarifies, emphasizes, or adds emotional context to literal words; the right hemisphere processes pitch, speed, and melody, allowing listeners to infer speaker intent beyond lexical content. This helps differentiate statements, questions, or sarcasm and improves overall understanding.

What brain region is critical for speech planning?

The left inferior frontal gyrus coordinates planning and control of human speech; stimulation of this area can temporarily disrupt spoken output, demonstrating its central role in vocal production. Humans also share vocal‑learning genes and circuits with songbirds.
