An introduction to the biology and neurophysiology of human speech. The target audience is researchers and engineers working on speech recognition technology.
2. References
Audition, the body senses, and the chemical senses.
Physiology of behavior, 6th Ed, 1998, pp. 185-223.
by Carlson, N. R.
Human communication.
Physiology of behavior, 6th Ed, 1998, pp. 477-508.
by Carlson, N. R.
Functional MRI of language: New approaches to understanding the
cortical organization of semantic processing
Annu. Rev. Neurosci., 2002, pp. 151-188.
by Bookheimer, S.
Lateralization of auditory language functions: A dynamic dual pathway model
Brain and Language, 89 (2004) 267–276
by Friederici, A.D. and Alter, K.
3. Outline
● Auditory apparatus
● MFCC
● Lesion study
● Neuroimaging
● Dynamic dual pathway model
● Can we design ASR systems by mimicking
organic systems?
12. MFCC
● Mel Frequency Cepstral Coefficient
– Take the Fourier transform of a windowed frame of
the signal.
– Map the power spectrum obtained above onto the
mel scale, using triangular overlapping windows.
– Take the logarithm of the energy in each mel band.
– Take the Discrete Cosine Transform of the list of
mel log-energies, as if it were a signal.
– The MFCCs are the amplitudes of the resulting
spectrum.
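The steps above can be sketched for a single frame as follows. This is a minimal illustration, not a reference implementation: the function names, the Hamming window, the 2595 * log10(1 + f/700) mel formula, and the defaults of 20 filters and 12 coefficients are assumptions chosen for the example, and the logarithm is taken after the mel filterbank, which is the common ordering. A naive DFT keeps the sketch free of dependencies; a real system would use an FFT library.

```python
import cmath
import math

def hz_to_mel(f):
    # One common mel-scale formula (an assumption; other variants exist).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Naive DFT of a Hamming-windowed frame; power for bins 0..N/2."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        s = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))
        spectrum.append(abs(s) ** 2)
    return spectrum

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular overlapping windows spaced evenly on the mel scale."""
    lo, hi = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mels = [lo + (hi - lo) * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = []
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):      # rising edge of the triangle
            row[k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge, peak of 1 at center
            row[k] = (right - k) / (right - center)
        bank.append(row)
    return bank

def dct2(values):
    """Unnormalized DCT-II of a list of numbers."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i, v in enumerate(values))
            for k in range(n)]

def mfcc(frame, sample_rate, n_filters=20, n_coeffs=12):
    power = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    energies = [sum(w * p for w, p in zip(row, power)) for row in bank]
    # Small offset avoids log(0) for empty bands.
    log_energies = [math.log(e + 1e-10) for e in energies]
    return dct2(log_energies)[:n_coeffs]

# Usage: MFCCs of one 256-sample frame of a 440 Hz tone sampled at 8 kHz.
frame = [math.sin(2 * math.pi * 440.0 * t / 8000.0) for t in range(256)]
coeffs = mfcc(frame, 8000)
```

In a recognizer this would be applied to successive overlapping frames, giving one coefficient vector per frame.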
13. From the ears to the brain
● Ear
– Spectral signals.
– Fourier transform done by neural circuits.
● Brain
– Two pathways in two hemispheres
– Left: semantics and syntax
– Right: prosody
14. Brain Mechanisms for Language
● From lesion study to neuroimaging
● Localization of functions
● Lateralization
● Speech Production and Comprehension
● Prosody
15. Lesion Studies
● Aphasia
– Difficulty in producing or comprehending speech
caused by brain damage.
● Broca's aphasia
– agrammatism
– anomia
● Wernicke's aphasia
– poor speech comprehension
16. Broca's Aphasia
● Agrammatism:
– difficulty in understanding / using grammar
● Anomia:
– difficulty in finding the appropriate word to describe
an object, action, or attribute.
● Apraxia of speech:
– impairment in the ability to program movements of
the tongue, lips, and throat required to produce the
proper sequence of speech sounds.
17. Broca's Aphasia Example
● "Yes ... Monday ... Dad, and Dad ... hospital,
and ... Wednesday, Wednesday, nine o'clock
and ... Thursday, ten o'clock ... doctors, two,
two ... doctors and ... teeth, yah."
19. Wernicke's Aphasia
● Poor speech comprehension
● Fluent but meaningless speech
● Pure word deafness:
– The ability to hear, to speak, and to read and write
without being able to comprehend the meaning of
speech.
20. Wernicke's Aphasia Example
● Examiner: What kind of work have you done?
● Patient: We, the kids, all of us, and I, we were working for a long time
in the ... you know ... it's the kind of space, I mean place rear to the
spedawn ...
● Examiner: Excuse me, but I wanted to know what work you have
been doing.
● Patient: If you had said that, we had said that, poomer, near the
fortunate, porpunate, tamppoo, all around the fourth of martz. Oh, I
get all confused.
25. Semantic Conditions
● Same
– The lawyer questioned the witness.
– The attorney questioned the witness.
● Different
– The man was attacked by the doberman.
– The man was attacked by the pitbull.
26. Syntactic Conditions
● Same
– The policeman arrested the thief.
– The thief was arrested by the policeman.
● Different
– The teacher was outsmarted by the student.
– The teacher outsmarted the student.
27. Summary by Bookheimer, 2002
● The role of the left inferior frontal lobe in semantic
processing and dissociations from other frontal lobe
language functions.
● The organization of categories of objects and
concepts in the temporal lobe.
● The role of the right hemisphere in comprehending
contextual and figurative meaning.
28. Overview by Ahrens, 2007
● Past
– Functional localization (brain damage)
● Present
– Narrower localization + discussion of overlap and
integration (neuro-imaging techniques)
● Future
– Language as a brain function (integrate knowledge
about timing, context, and individual differences)
29. The Three Myths
● Myth 1: Broca’s area deals with syntax/production
– Fact: Semantics and phonology cluster in different areas of
the IFG; syntax seems to be distributed throughout the IFG.
– Fact: IFG is activated during non-language tasks.
● Myth 2: Wernicke’s area deals with
semantics/comprehension
– Fact: There are functional subdivisions for language in
the posterior temporal area.
30. The Three Myths
● Myth 3: The right hemisphere is not used when
processing language
– Fact: The right hemisphere is called upon for many
integrative language processes.
> Figurative Language and Metaphor
> Linguistic Context
> Prosody
32. Dynamic Dual Pathway Model
● Spoken language comprehension requires the
coordination of different subprocesses in time.
● Segmental information:
– phonemes, syntactic elements and lexical-semantic
elements.
● Suprasegmental information:
– accentuation and intonational phrases, i.e., prosody.
33. Localization of Different Subsystems
● Segmental information:
– syntactic and semantic information are primarily
processed in a left hemispheric temporo-frontal
pathway including separate circuits for syntactic and
semantic information
● Suprasegmental information:
– sentence level prosody is processed in a right
hemispheric temporo-frontal pathway.
35. Can we design ASR systems
by imitating the brain?
● An open question
– Is it possible? Is it more effective?
● Complexity
– Basic computation rate of a single neuron: about 60 Hz
– About 10^8 input neurons and 10^10 neurons in the
brain, each with more than 8000 connections
● Training time
– How long does it take a human being to learn to
understand language?
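The complexity figures above can be turned into a rough back-of-envelope estimate. The sketch below only multiplies the numbers quoted on the slide. It assumes, purely for illustration, that each connection contributes one elementary operation per firing cycle; the constant and function names are hypothetical.

```python
# Figures quoted on the slide; treat them as rough orders of magnitude.
NEURON_RATE_HZ = 60              # basic rate of a single neuron
NEURONS_IN_BRAIN = 10 ** 10
CONNECTIONS_PER_NEURON = 8000    # the slide says ">8000", so a lower bound

def total_connections(neurons=NEURONS_IN_BRAIN,
                      per_neuron=CONNECTIONS_PER_NEURON):
    """Lower bound on the number of connections in the brain."""
    return neurons * per_neuron

def connection_updates_per_second(rate_hz=NEURON_RATE_HZ):
    """Assumes one elementary operation per connection per firing cycle."""
    return total_connections() * rate_hz

print(f"connections:        {total_connections():.1e}")
print(f"updates per second: {connection_updates_per_second():.1e}")
```

Under these assumptions the brain has on the order of 8 x 10^13 connections and performs roughly 4.8 x 10^15 connection updates per second, which gives a sense of the scale a brain-imitating ASR system would have to match.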