Performance Analysis of a Bangla
Speech Recognizer Model Using
Hidden Markov Model (HMM)

Submitted by:
Md. Abdullah-al-Mamun
OUTLINE
 What is speech recognition?
 The Structure of ASR
 Speech Database
 Feature Extraction
 Hidden Markov Model
 Forward algorithm
 Backward algorithm
 Viterbi algorithm
 Training & Recognition
 Result
 Conclusions
 References
What is Speech Recognition?
 In computer science, speech recognition is the translation of spoken words into text.
 It is the process of converting an acoustic signal, captured by a microphone, into a set of words.
 Speech recognition is also known as "Automatic Speech Recognition (ASR)" or "Speech to Text (STT)".
Model of Bangla Speech Recognition

Fig. 1: Simple model of Bangla Speech Recognition
The Structure of an ASR System

Figure 1: Functional scheme of an ASR system
[Block diagram: speech samples S → Signal Interface → X → Feature Extraction → Y → Recognition → W*, with Databases feeding Training HMM and the Recognition stage]
Speech Database
- A speech database is a collection of recorded speech accessible on a computer and supported with the necessary transcriptions.
- The databases collect the observations required for parameter estimation.
- In this ASR system, about 1200 keywords were used.
Classification of Keywords

[Tree diagram: Bangla word → Independent / Dependent; Vowel / Consonant; Modifier Character; Compound Character]
Database Creation Process

[Figure: database creation workflow]
Speech Signal Analysis

Feature Extraction for ASR:
- The aim is to extract voice features that distinguish the different phonemes of a language.

[Figure: a speech signal passed through feature extraction, yielding a sequence of numeric feature vectors]
MFCC Extraction

[Block diagram: speech signal x(n) → Pre-emphasis → x'(n) → Window → xt(n) → DFT → Xt(k) → Mel filter banks → Yt(m) → Log(|·|²) → IDFT → MFCC yt(m)]

MFCC stands for Mel-frequency cepstral coefficients, a representation of the short-term power spectrum of a sound used in audio processing. The MFCCs are the amplitudes of the resulting spectrum.
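As a minimal sketch of this pipeline (not the exact configuration used in this work; the file name, sampling rate, and frame parameters are assumptions), MFCCs can be computed with the librosa library:

```python
# A minimal sketch of MFCC extraction using librosa (assumed parameters;
# not necessarily the configuration used in this work).
import librosa

# Hypothetical input file: one recorded Bangla keyword.
signal, sr = librosa.load("keyword.wav", sr=16000)

# 13 cepstral coefficients per 25 ms frame with a 10 ms hop.
mfcc = librosa.feature.mfcc(
    y=signal, sr=sr, n_mfcc=13,
    n_fft=int(0.025 * sr), hop_length=int(0.010 * sr),
)
print(mfcc.shape)  # (13, number_of_frames)
```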
Explanatory Example

[Figure panels: speech waveform of the phoneme "ae"; the signal after pre-emphasis and Hamming windowing; its power spectrum; the resulting MFCCs]
Feature Vector to P(O|M) via HMM

[Figure: the feature-vector sequence O is fed to an HMM M, which outputs the probability P(O|M)]

For each input word O, the HMM M generates a corresponding probability P(O|M), which can be computed with the model.
HMM Model

An HMM is specified by a five-tuple λ = (S, O, Π, A, B).
Elements of an HMM

1) Set of hidden states: S = {1, 2, … N}
2) Set of observation symbols: O = {o1, o2, … oM}, where M is the number of observation symbols
3) The initial state distribution: π = {πi}, where πi = P(s0 = i), 1 ≤ i ≤ N
4) State transition probability distribution: A = {aij}, where aij = P(st = j | st−1 = i), 1 ≤ i, j ≤ N
5) Observation symbol probability distribution in state j: B = {bj(k)}, where bj(k) = P(Xt = ok | st = j), 1 ≤ j ≤ N, 1 ≤ k ≤ M
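As a concrete illustration of these elements, here is a minimal NumPy sketch of a two-state, three-symbol model; π and A take the values used in the worked example on the later slides, while B is an assumed placeholder:

```python
# A minimal sketch of HMM parameters as NumPy arrays (N = 2 states,
# M = 3 observation symbols). pi and A follow the worked example in the
# later slides; B is assumed for illustration (each row sums to 1).
import numpy as np

pi = np.array([1.0, 0.0])              # pi_i = P(s0 = i)
A = np.array([[0.7, 0.3],              # a_ij = P(s_t = j | s_{t-1} = i)
              [0.5, 0.5]])
B = np.array([[0.6, 0.3, 0.1],         # b_j(k) = P(X_t = o_k | s_t = j)
              [0.1, 0.3, 0.6]])

assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
```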
Three Basic Problems in HMM

 1. The Evaluation Problem – Given a model λ = (A, B, π) and a sequence of observations O = (o1, o2, o3, ... oM), what is the probability P(O|λ), i.e., the probability that the model generates the observations?
 2. The Decoding Problem – Given a model λ = (A, B, π) and a sequence of observations O = (o1, o2, o3, ... oM), what is the most likely state sequence in the model that produces the observations?
 3. The Learning Problem – Given a model λ = (A, B, π) and a set of observations O = (o1, o2, o3, ... oM), how can we adjust the model parameters λ to maximize P(O|λ)?

How to evaluate an HMM? Forward Algorithm
How to decode an HMM? Viterbi Algorithm
How to train an HMM? Baum-Welch Algorithm
Calculate Probability P(O|M)

Trellis:

[Trellis figure: a three-state example with per-state emission probabilities P(up), P(down), P(no-change) and a transition matrix. At each node, all incoming path products (previous value × transition probability × emission probability) are added, e.g. the products 0.35*0.6*0.7, 0.02*0.5*0.7, and 0.09*0.4*0.7 combine into 0.179. Add probabilities!]
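To make the "add probabilities" step concrete, the node value 0.179 is the sum of the three incoming path products (the grouping of transition and emission factors is inferred from the figure):

0.35 · 0.6 · 0.7 + 0.02 · 0.5 · 0.7 + 0.09 · 0.4 · 0.7 = 0.147 + 0.007 + 0.0252 ≈ 0.179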
Forward Calculations – Overview

[Trellis figure: initial state S0 with initial probabilities π1, π2; states S1 and S2 at times 2, 3, and 4; transition probabilities a11 = 0.7, a12 = 0.3, a21 = 0.5, a22 = 0.5; emission probabilities attached to each state]
Forward Calculations (t=2)

[Trellis figure: S0 branching to S1 and S2 at time 2, with a11 = 0.7, a12 = 0.3, a21 = 0.5, a22 = 0.5]

α1(1) = 1
α2(1) = 0
α1(2) = α1(1) b13 a11 + α2(1) b23 a21 = 0.21
α2(2) = α1(1) b13 a12 + α2(1) b23 a22 = 0.09

NOTE: α1(2) + α2(2) is the likelihood of the observation so far.
Forward Calculations (t=3)

[Trellis figure: the same lattice extended to time 3, where α1(3) and α2(3) are computed from α1(2) and α2(2)]
Forward Calculations (t=4)

[Trellis figure: the same lattice extended to time 4, giving α1(4) and α2(4) in the final states S1 and S2]
Forward Calculation of Likelihood Function

                  t=1     t=2     t=3      t=4
α1(t)             1.0     0.21    0.0462   0.021294
α2(t)             0.0     0.09    0.0378   0.010206
L(t) = p(K1…Kt)   1.0     0.3     0.084    0.0315

where α1(1) = π1 = 1 and α2(1) = π2 = 0,
α1(2) = α1(1) b13 a11 + α2(1) b23 a21, α2(2) = α1(1) b13 a12 + α2(1) b23 a22,
α1(3) = α1(2) b12 a11 + α2(2) b22 a21 (and similarly for the later steps),
and L(t) = α1(t) + α2(t).
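A minimal NumPy sketch of the forward recursion follows, using the textbook convention αj(t) = bj(ot) Σi αi(t−1) aij; the slide's worked example attaches emissions to the previous state, so the intermediate values may not match the table, but the likelihood computation is the same idea. The emission matrix B and the observation sequence are assumptions:

```python
# A minimal sketch of the forward algorithm in NumPy (textbook convention:
# alpha_j(t) = b_j(o_t) * sum_i alpha_i(t-1) * a_ij). pi and A are from
# the slides; B and the observation sequence are assumed for illustration.
import numpy as np

pi = np.array([1.0, 0.0])
A = np.array([[0.7, 0.3], [0.5, 0.5]])
B = np.array([[0.6, 0.3, 0.1],
              [0.1, 0.3, 0.6]])

def forward_likelihood(obs):
    """Return P(O | lambda) for a list of observation-symbol indices."""
    alpha = pi * B[:, obs[0]]          # initialization (t = 1)
    for o in obs[1:]:                  # induction (t = 2 .. T)
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()                 # termination: sum over final states

print(forward_likelihood([2, 1, 0]))   # assumed observation sequence
```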
Backward Calculations – Overview

[Trellis figure: the same lattice as in the forward overview; the backward pass starts from the final time step and proceeds right to left]
Backward Calculations (t=3)

[Trellis figure: states S1 and S2 at time 3 with their emission probabilities; β1(3) and β2(3) are computed from β1(4) = β2(4) = 1]
Backward Calculations (t=2)

[Trellis figure: states S1, S2 at times 2, 3, 4 with a11 = 0.7, a12 = 0.3, a21 = 0.5, a22 = 0.5]

β1(4) = 1
β2(4) = 1
β1(3) = 0.6
β2(3) = 0.1
β1(2) = a11 b12 β1(3) + a12 b12 β2(3) = 0.045
β2(2) = a21 b22 β1(3) + a22 b22 β2(3) = 0.245

NOTE: β1(2) + β2(2) is the likelihood of the partial observation/word sequence from t = 2 onward.
Backward Calculations (t=1)

[Trellis figure: the full lattice; β1(1) and β2(1) are computed from β1(2) and β2(2)]
Backward Calculation of Likelihood Function

                  t=1      t=2     t=3   t=4
β1(t)             0.0315   0.045   0.6   1
β2(t)             0.029    0.245   0.1   1
L(t) = p(Kt…KT)   0.0315   0.290   0.7   1

where β1(4) = β2(4) = 1, βi(t) = bi,t (ai1 β1(t+1) + ai2 β2(t+1)),
L(1) = π1 β1(1) + π2 β2(1), and L(t) = β1(t) + β2(t) for t > 1.
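The backward pass can be sketched the same way; again this uses the textbook convention βi(t) = Σj aij bj(ot+1) βj(t+1) with βi(T) = 1, so intermediate values differ from the slide's table, but the terminated value recovers the same P(O|λ) as the forward pass. B and the observation sequence are the same assumptions as above:

```python
# A minimal sketch of the backward recursion in NumPy (textbook convention:
# beta_i(t) = sum_j a_ij * b_j(o_{t+1}) * beta_j(t+1), beta_i(T) = 1).
# Terminating with pi . (b(o_1) * beta(1)) recovers P(O | lambda).
import numpy as np

pi = np.array([1.0, 0.0])
A = np.array([[0.7, 0.3], [0.5, 0.5]])
B = np.array([[0.6, 0.3, 0.1],     # assumed emission matrix
              [0.1, 0.3, 0.6]])

def backward_likelihood(obs):
    beta = np.ones(len(pi))                    # initialization (t = T)
    for o in reversed(obs[1:]):                # induction (t = T-1 .. 1)
        beta = A @ (B[:, o] * beta)
    return np.dot(pi, B[:, obs[0]] * beta)     # termination

print(backward_likelihood([2, 1, 0]))          # same value as the forward pass
```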
Calculate maxS Prob. of State Sequence S

[Trellis figure: the same three-state example as before; at each node only the largest incoming path product is kept, e.g. 0.147 from the products 0.35*0.6*0.7, 0.02*0.5*0.7, and 0.09*0.4*0.7, and the best path is marked. Select the highest probability!]
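For comparison with the forward sum shown earlier, the Viterbi value 0.147 keeps only the largest of the three incoming products instead of adding them (factor grouping again inferred from the figure):

max{0.35 · 0.6 · 0.7, 0.02 · 0.5 · 0.7, 0.09 · 0.4 · 0.7} = max{0.147, 0.007, 0.0252} = 0.147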
Viterbi Algorithm – Overview

[Trellis figure: the same lattice as the forward overview, with π1, π2, a11 = 0.7, a12 = 0.3, a21 = 0.5, a22 = 0.5; the Viterbi pass tracks the best path into each node]
Viterbi Algorithm (Forward Calculations t=2)

[Trellis figure: S0 branching to S1 and S2 at time 2, with π1 = 1, π2 = 0]

δ1(1) = π1 = 1
δ2(1) = π2 = 0
δ1(2) = max{δ1(1) b13 a11, δ2(1) b23 a21} = 0.21
δ2(2) = max{δ1(1) b13 a12, δ2(1) b23 a22} = 0.09
ψ1(2) = 1
ψ2(2) = 1
Viterbi Algorithm (Backtracking t=2)

[Same trellis and quantities as the previous slide; the backpointers ψ1(2) = 1 and ψ2(2) = 1 record that both time-2 states are best reached from state 1]
Viterbi Algorithm (Forward Calculations)

[Trellis figure: the forward (max) pass continued across times 2, 3, and 4]
Viterbi Algorithm (Backtracking)

[Trellis figure: backpointers followed from the later time steps back toward time 2]
Viterbi Algorithm (Forward Calculations t=4)

[Trellis figure: the final (max) step at time 4, ending in states S1 and S2]
Viterbi Algorithm (Backtracking to Obtain Labeling)

[Trellis figure: starting from the most probable final state, the backpointers are followed back to time 1 to recover the most likely state labeling]
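A minimal NumPy sketch of the Viterbi pass (δ) and backtracking (ψ) follows; as before, it uses the textbook emission convention, and B and the observation sequence are assumptions:

```python
# A minimal sketch of the Viterbi algorithm in NumPy: delta holds the best
# path probability ending in each state; psi holds the backpointers used
# to recover the most likely state sequence.
import numpy as np

pi = np.array([1.0, 0.0])
A = np.array([[0.7, 0.3], [0.5, 0.5]])
B = np.array([[0.6, 0.3, 0.1],   # assumed emission matrix
              [0.1, 0.3, 0.6]])

def viterbi(obs):
    T, N = len(obs), len(pi)
    delta = pi * B[:, obs[0]]
    psi = np.zeros((T, N), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] * A          # scores[i, j] = delta_i * a_ij
        psi[t] = scores.argmax(axis=0)       # best predecessor for each j
        delta = scores.max(axis=0) * B[:, obs[t]]
    # Backtracking: start from the best final state and follow psi.
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(psi[t][path[-1]]))
    return path[::-1], delta.max()

print(viterbi([2, 1, 0]))  # most likely state sequence and its probability
```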
Implementing HMM for Speech Modeling
(Training and Recognition)

- Build HMM speech models from the correspondence between the observation sequences Y and the state sequences S. (TRAINING)
- Recognize speech using the stored HMM models Θ and the actual observation Y. (RECOGNITION)

[Block diagram: speech samples S → Feature Extraction → Y; Y → Training HMM → Θ; Y and Θ → Recognition → W*]
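As a minimal sketch of this training/recognition loop (the library, model sizes, and function names are assumptions for illustration, not the toolkit used in this work), per-keyword Gaussian HMMs can be trained on MFCC sequences with the hmmlearn library, and recognition picks the model with the highest P(O|M):

```python
# A minimal sketch of isolated-word training and recognition with hmmlearn
# (an assumption for illustration). Each keyword gets its own Gaussian HMM
# trained on MFCC frames; recognition picks the model whose log-likelihood
# P(O | M) is highest for the input.
import numpy as np
from hmmlearn import hmm

def train_word_model(mfcc_sequences, n_states=5):
    """mfcc_sequences: list of (n_frames, n_mfcc) arrays for one keyword."""
    X = np.vstack(mfcc_sequences)
    lengths = [len(seq) for seq in mfcc_sequences]
    model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag")
    model.fit(X, lengths)        # Baum-Welch re-estimation
    return model

def recognize(mfcc, models):
    """models: dict mapping word -> trained HMM; returns the best word."""
    return max(models, key=lambda w: models[w].score(mfcc))
```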
RECOGNITION Process

 Let S = (s1, s2, …, sT) be the state sequence to be recognized for the input speech.
 Let xt be the feature sample computed at time t, where the feature sequence from time 1 to t is written X = (x1, x2, …, xt).
 The recognized state sequence S* is obtained by:
S* = ArgMaxS P(S, X | Φ)

[Block diagram: xt and the previous states {st−1} drive a dynamic structure St, P(xt, {st} | {st−1}, Φ); a search algorithm over the static structure Φ produces S*]
Result (Speaker Recognition)

Table 1: Speaker recognition result
Result (Isolated SR)

Table 2: Result for isolated speech recognition
Result (Continuous SR)

Table 3: Continuous speech recognition result
Conclusions

 No speech recognizer to date achieves 100% accuracy.
 Avoid poor-quality microphones; consider using a better one.
 One important point is that training the system further provides an even better experience.
Thank You