SlideShare una empresa de Scribd logo
1 de 7
Descargar para leer sin conexión
Computer Science & Engineering: An International Journal (CSEIJ), Vol. 3, No. 4, August 2013
DOI : 10.5121/cseij.2013.3403 21
AN EFFICIENT SPEECH RECOGNITION
SYSTEM
Suma Swamy1
and K.V Ramakrishnan
1
Department of Electronics and Communication Engineering, Research Scholar,
Anna University, Chennai
suma_swamy@yahoo.com, ramradhain@yahoo.com
ABSTRACT
This paper describes the development of an efficient speech recognition system using different
techniques such as Mel Frequency Cepstrum Coefficients (MFCC), Vector Quantization (VQ) and
Hidden Markov Model (HMM).
This paper explains how speaker recognition followed by speech recognition is used to recognize the
speech faster, efficiently and accurately. MFCC is used to extract the characteristics from the input
speech signal with respect to a particular word uttered by a particular speaker. Then HMM is used
on Quantized feature vectors to identify the word by evaluating the maximum log likelihood values
for the spoken word.
KEYWORDS
MFCC, VQ, HMM, log likelihood, DISTMIN.
1. INTRODUCTION
The idea of human machine interaction led to research in Speech recognition. Automatic speech
recognition uses the process and related technology for converting speech signals into a sequence
of words or other linguistic units by means of an algorithm implemented as a computer program.
Speech understanding systems presently are capable of understanding speech input for
vocabularies of thousands of words in operational environments. Speech signal conveys two
important types of information: (a) speech content and (b) The speaker identity. Speech
recognisers aim to extract the lexical information from the speech signal independently of the
speaker by reducing the inter-speaker variability. Speaker recognition is concerned with
extracting the identity of the person. [3]
Speaker identification allows the use of uttered speech to verify the speaker’s identity and control
access to secure services. Speech Recognition offers greater freedom to employ the physically
handicapped in several applications like manufacturing processes, medicine and telephone
network. Figure 1(a) shows the speech recognition system without speaker identification. Figure
1(b) shows how the speaker identification followed by speech recognition improves the
efficiency. With this approach, the database will be divided into smaller divisions (SP1 to SPn)
with respect to different speakers. Hence the speech recognition rate improves for the
corresponding speaker.
Computer Science & Engineering: An International Journal (CSEIJ), Vol. 3, No. 4, August 2013
22
Figure.1 (a) Speech recognition system without speaker identification
Figure.1 (b) Speaker identification followed by speech recognition
This paper focuses on the implementation of speaker identification and enhancement of speech
recognition using Hidden Markov Model (HMM) techniques. [1], [4]
2. HISTORY OF SPEECH RECOGNITION
Speech Recognition research has been ongoing for more than 80 years. Over that period there
have been at least 4 generations of approaches, and a 5th generation is being formulated based on
current research themes. To cover the complete history of speech recognition is beyond the scope
of this paper.
By 2001, computer speech recognition had reached 80% accuracy and no further progress was
reported till 2010. Speech recognition technology development began to edge back into the
Feature
Extraction
(MFCC)
Codebook
Generation
(VQLBG)
Database
Speech Recognition
(HMM) Recognized
Speech
Input
Speech
Feature
Extraction
(MFCC)
Codebook
Generation
(VQLBG)
Feature Matching
(DISTMIN)
Speech Recognition
(HMM)
Recognized
Speech
Identified/
Recognized
Speaker
Input
Speech
SP1
SP2
SPn
Matching
(DCoFeature
Extraction
(MFCC)
ISTMIN)
n
Database
Computer Science & Engineering: An International Journal (CSEIJ), Vol. 3, No. 4, August 2013
23
forefront with one major event: the arrival of the “Google Voice Search app for the iPhone”. In
2010, Google added “personalized recognition” to Voice Search on Android phones, so that the
software could record users’ voice searches and produce a more accurate speech model. The
company also added Voice Search to its Chrome Browser in mid-2011. Like Google’s Voice
Search, Siri relies on cloud-based processing. It draws on its knowledge about the speaker to
generate a contextual reply and responds to voice input. [2]
Parallel processing methods using combinations of HMMs and acoustic- phonetic approaches to
detect and correct linguistic irregularities are used to increase recognition decision reliability and
increase robustness for recognition of speech in noisy environment.
3. PROPOSED MODEL
The structure of proposed system consists of two modules as shown in figure 1(b).
• Speaker Identification
• Speech Recognition
3.1 Speaker Identification
Feature extraction is a process that extracts data from the voice signal that is unique for each
speaker. Mel Frequency Cepstral Coefficient (MFCC) technique is often used to create the
fingerprint of the sound files. The MFCC are based on the known variation of the human ear’s
critical bandwidth frequencies with filters spaced linearly at low frequencies and logarithmically
at high frequencies used to capture the important characteristics of speech. [6], [7], [8]
These extracted features are Vector quantized using Vector Quantization algorithm. Vector
Quantization (VQ) is used for feature extraction in both the training and testing phases. It is an
extremely efficient representation of spectral information in the speech signal by mapping the
vectors from large vector space to a finite number of regions in the space called clusters. [6], [8]
After feature extraction, feature matching involves the actual procedure to identify the unknown
speaker by comparing extracted features with the database using the DISTMIN algorithm.
3.2 Speech Recognition System
Hidden Markov Processes are the statistical models in which one tries to characterize the
statistical properties of the signal with the underlying assumption that a signal can be
characterized as a random parametric signal of which the parameters can be estimated in a precise
and well-defined manner. In order to implement an isolated word recognition system using
HMM, the following steps must be taken
(1) For each uttered word, a Markov model must be built using parameters that optimize the
observations of the word.
(2) Maximum likelihood model is calculated for the uttered word. [5], [9], [10], [11]
4. IMPLEMENTATION
The major modules used are as follows:
MFCC (Mel-scaled Frequency Cepstral Coefficients)
Computer Science & Engineering: An International Journal (CSEIJ), Vol. 3, No. 4, August 2013
24
• Mel-spaced Filter Bank
VQ (Vector Quantization)
HMM (Hidden Markov Model)
• Discrete-HMM Observation matrix
• Forward-Backward Algorithm
Figure. 2 new speakers sound is added to database by entering name and recording time
Figure 2 shows the screenshot of how new speaker is added into the database. Figure 3 shows the
screenshot of how the speech is recognized.
Figure. 3 Screen shot shows the demonstration of speech recognition system
Computer Science & Engineering: An International Journal (CSEIJ), Vol. 3, No. 4, August 2013
25
5. EXPERIMENTATION RESULTS
In the speaker identification phase, four different (2 male and 2 female) speakers are asked to
speak the same word ten times from the given list of words. The speakers are then asked to utter
the same words in a random order and the recognition results noted. The percentage recognition
of a speaker for these words is given in the table 1 and efficiency chart is shown in figure 5 for
the same. The overall efficiency of speaker identification system is 95%.
In speech recognition phase, the experiment is repeated ten times for each of the above words.
The resulting efficiency percentage and its corresponding efficiency chart are shown in table 2
and figure 6 respectively. The overall efficiency of a speech recognition system obtained is 98%.
Table 1. Speaker identification results
Words Female
Speaker
1
Female
Speaker
2
Male
Speaker
3
Male
Speaker
4
Computer 90% 100% 100% 90%
Read 100% 100% 100% 100%
Mobile 90% 100% 90% 90%
Man 100% 70% 100% 100%
Robo 80% 100% 100% 100%
Average
%
92% 94% 98% 96%
Figure. 4 Efficiency chart for Speaker Identification System
Table 2. Speech Recognition Results
Words Recognition %
Computer 99%
Read 100%
Mobile 96%
Man 100%
Robo 95%
Average % 98%
Computer Science & Engineering: An International Journal (CSEIJ), Vol. 3, No. 4, August 2013
26
Figure. 5. Efficiency Chart for Speech Recognition System
6. CONCLUSION
In the speaker identification phase, MFCC and Distance Minimum techniques have been used.
These two techniques provided more efficient speaker identification system. The speech
recognition phase uses the most efficient HMM Algorithm. It is found that Speaker recognition
module improves the efficiency of speech recognition scores. The coding of all the techniques
mentioned above has been done using MATLAB. It has been found that the combination of
MFCC and Distance Minimum algorithm gives the best performance and also accurate results in
most of the cases with an overall efficiency of 95%. The study also reveals that the HMM
algorithm is able to identify the most commonly used isolated word. As a result of this, speech
recognition system achieves 98% efficiency.
ACKNOWLEDGEMENTS
We acknowledge Visvesvaraya Technological University, Belgaum and Anna University,
Chennai for the encouragement and permission to publish this paper. We would like to thank the
Principal of Sir MVIT, Dr. M.S.Indira for her support. Our special thanks to Prof. Dilip.K.Sen,
HOD of CSE for his valuable suggestions from time to time.
REFERENCES
[1] Ronald M. Baecker, “Readings in human-computer interaction: toward the year 2000”, 1995.
[2.] Melanie Pinola, “Speech Recognition Through the Decades: How We Ended Up With Siri”,
PCWorld.
[3] Ganesh Tiwari, “Text Prompted Remote Speaker Authentication : Joint Speech and Speaker
Recognition/Verification System”.
[4] Dr.Ravi Sankar, Tanmoy Islam, Srikanth Mangayyagari, “Robust Speech/Speaker Recognition
Systems”.
[5] Bassam A.Q.Al-Qatab and Raja.N.Aninon, “Arabic Speech Recognition using Hidden Markov
Model ToolKit (HTK)”, IEEE Information Technology (ITSim), 2010,page 557-562.
[6] Ahsanul Kabir, Sheikh Mohammad Masudul Ahsan,”.Vector Quantization in Text Dependent
Automatic Speaker Recognition using Mel-Frequency Cepstrum Coefficient”, 6th WSEAS
Computer Science & Engineering: An International Journal (CSEIJ), Vol. 3, No. 4, August 2013
27
International Conference on circuits, systems, electronics, control & signal processing, Cairo,Egypt,
dec 29-31, 2007,page 352-355
[7] Lindasalwa Muda, Mumtaj Begam and Elamvazuthi.,”Voice Recognition Algorithms using Mel
Frequency Cepstral Coefficient (MFCC) and DTW Techniques “,Journal of Computing, Volume 2,
Issue 3, March 2010
[8] Mahdi Shaneh and Azizollah Taheri ,”Voice Command Recognition System based on MFCC and VQ
Algorithms”, World Academy of Science, Engineering and Technology Journal , 2009
[9] Remzi Serdar Kurcan, “Isolated word recognition from in-ear microphone data using hidden markov
models (hmm)”, Master’s Thesis, 2006.
[10] Nikolai Shokhirev ,”Hidden Markov Models “, 2010.
[11] L.R. Rabiner, “A tutorial on Hidden Markov Models and selected applications in Speech
Recognition”, Proceedings of the IEEE Journal, Feb 1989, Vol 77, Issue: 2.
[12] Suma Swamy, Manasa S, Mani Sharma, Nithya A.S, Roopa K.S and K.V Ramakrishnan, “An
Improved Speech Recognition System”, LNICST Springer Journal, 2013.
AUTHORS
1. Suma Swamy obtained her B.E (Electronics Engineering) in 1990 from Shivaji
University, Kolhapur, Maharashtra, and M.Tech (Electronics and Communication
Engineering) in 2005 from Visvesvaraya Technological University, Belgaum,
Karnataka. She is working as Associate Professor, Department of CSE, Sir M.
Visvesvaraya Institute of Technology, Bengaluru, India. She is Research Scholar in
the department of ECE, Anna University, Chennai, India. Her areas of interest are
Speech Recognition, Database Management Systems and Design of Algorithms.
2. Dr. K.V Ramakrishnan obtained his M.Sc. (Electronics) from Poona University in
1961 and Ph.D (Electronics) from Toulouse (France) in 1972. He worked as Scientist
in CEERI from 1962-1999 at different places. He was a Consultant for M/s. Servo
Electronics, Delhi in 1999. He was a Director for Research and Development, HOD
(ECE/TE/MCA) at Sir M. Visvesvaraya Institute of Technology, Bengaluru, India
from 1999 -2002. He was a HOD (ECE) at New Horizon College of Engineering,
Bangalore from 2002-03. He was a HOD(CSE/ISE) at Sir M. Visvesvaraya Institute of
Technology, Bengaluru, India from 2003- 2006. He was a Dean and Professor (E CE)
at CMR Institute of Technology Bangalore from 2006-2009. He also officiated as principal during 2007. He
is a Supervisor, Anna University Chennai, India. His area of research is Speech Processing and Embedded
Systems.

Más contenido relacionado

La actualidad más candente

Speaker Identification From Youtube Obtained Data
Speaker Identification From Youtube Obtained DataSpeaker Identification From Youtube Obtained Data
Speaker Identification From Youtube Obtained Datasipij
 
Arabic digits speech recognition and speaker identification in noisy environm...
Arabic digits speech recognition and speaker identification in noisy environm...Arabic digits speech recognition and speaker identification in noisy environm...
Arabic digits speech recognition and speaker identification in noisy environm...TELKOMNIKA JOURNAL
 
PUNJABI SPEECH SYNTHESIS SYSTEM USING HTK
PUNJABI SPEECH SYNTHESIS SYSTEM USING HTKPUNJABI SPEECH SYNTHESIS SYSTEM USING HTK
PUNJABI SPEECH SYNTHESIS SYSTEM USING HTKijistjournal
 
5215ijcseit01
5215ijcseit015215ijcseit01
5215ijcseit01ijcsit
 
An Improved Approach for Word Ambiguity Removal
An Improved Approach for Word Ambiguity RemovalAn Improved Approach for Word Ambiguity Removal
An Improved Approach for Word Ambiguity RemovalWaqas Tariq
 
Intelligent hands free speech based sms
Intelligent hands free speech based smsIntelligent hands free speech based sms
Intelligent hands free speech based smsKamal Spring
 
Speaker Identification & Verification Using MFCC & SVM
Speaker Identification & Verification Using MFCC & SVMSpeaker Identification & Verification Using MFCC & SVM
Speaker Identification & Verification Using MFCC & SVMIRJET Journal
 
Automatic speech emotion and speaker recognition based on hybrid gmm and ffbnn
Automatic speech emotion and speaker recognition based on hybrid gmm and ffbnnAutomatic speech emotion and speaker recognition based on hybrid gmm and ffbnn
Automatic speech emotion and speaker recognition based on hybrid gmm and ffbnnijcsa
 
Sentiment analysis by deep learning approaches
Sentiment analysis by deep learning approachesSentiment analysis by deep learning approaches
Sentiment analysis by deep learning approachesTELKOMNIKA JOURNAL
 
Hindi digits recognition system on speech data collected in different natural...
Hindi digits recognition system on speech data collected in different natural...Hindi digits recognition system on speech data collected in different natural...
Hindi digits recognition system on speech data collected in different natural...csandit
 
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition TechniqueA Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition TechniqueCSCJournals
 
Speech Recognized Automation System Using Speaker Identification through Wire...
Speech Recognized Automation System Using Speaker Identification through Wire...Speech Recognized Automation System Using Speaker Identification through Wire...
Speech Recognized Automation System Using Speaker Identification through Wire...IOSR Journals
 

La actualidad más candente (15)

Speaker Identification From Youtube Obtained Data
Speaker Identification From Youtube Obtained DataSpeaker Identification From Youtube Obtained Data
Speaker Identification From Youtube Obtained Data
 
Arabic digits speech recognition and speaker identification in noisy environm...
Arabic digits speech recognition and speaker identification in noisy environm...Arabic digits speech recognition and speaker identification in noisy environm...
Arabic digits speech recognition and speaker identification in noisy environm...
 
PUNJABI SPEECH SYNTHESIS SYSTEM USING HTK
PUNJABI SPEECH SYNTHESIS SYSTEM USING HTKPUNJABI SPEECH SYNTHESIS SYSTEM USING HTK
PUNJABI SPEECH SYNTHESIS SYSTEM USING HTK
 
Ijetcas14 426
Ijetcas14 426Ijetcas14 426
Ijetcas14 426
 
5215ijcseit01
5215ijcseit015215ijcseit01
5215ijcseit01
 
IRJET- Vocal Code
IRJET- Vocal CodeIRJET- Vocal Code
IRJET- Vocal Code
 
An Improved Approach for Word Ambiguity Removal
An Improved Approach for Word Ambiguity RemovalAn Improved Approach for Word Ambiguity Removal
An Improved Approach for Word Ambiguity Removal
 
Intelligent hands free speech based sms
Intelligent hands free speech based smsIntelligent hands free speech based sms
Intelligent hands free speech based sms
 
Speaker Identification & Verification Using MFCC & SVM
Speaker Identification & Verification Using MFCC & SVMSpeaker Identification & Verification Using MFCC & SVM
Speaker Identification & Verification Using MFCC & SVM
 
40120130406014 2
40120130406014 240120130406014 2
40120130406014 2
 
Automatic speech emotion and speaker recognition based on hybrid gmm and ffbnn
Automatic speech emotion and speaker recognition based on hybrid gmm and ffbnnAutomatic speech emotion and speaker recognition based on hybrid gmm and ffbnn
Automatic speech emotion and speaker recognition based on hybrid gmm and ffbnn
 
Sentiment analysis by deep learning approaches
Sentiment analysis by deep learning approachesSentiment analysis by deep learning approaches
Sentiment analysis by deep learning approaches
 
Hindi digits recognition system on speech data collected in different natural...
Hindi digits recognition system on speech data collected in different natural...Hindi digits recognition system on speech data collected in different natural...
Hindi digits recognition system on speech data collected in different natural...
 
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition TechniqueA Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
A Novel, Robust, Hierarchical, Text-Independent Speaker Recognition Technique
 
Speech Recognized Automation System Using Speaker Identification through Wire...
Speech Recognized Automation System Using Speaker Identification through Wire...Speech Recognized Automation System Using Speaker Identification through Wire...
Speech Recognized Automation System Using Speaker Identification through Wire...
 

Similar a AN EFFICIENT SPEECH RECOGNITION SYSTEM

Comparative Study of Different Techniques in Speaker Recognition: Review
Comparative Study of Different Techniques in Speaker Recognition: ReviewComparative Study of Different Techniques in Speaker Recognition: Review
Comparative Study of Different Techniques in Speaker Recognition: ReviewIJAEMSJORNAL
 
A comparison of different support vector machine kernels for artificial speec...
A comparison of different support vector machine kernels for artificial speec...A comparison of different support vector machine kernels for artificial speec...
A comparison of different support vector machine kernels for artificial speec...TELKOMNIKA JOURNAL
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...AIRCC Publishing Corporation
 
Real Time Speaker Identification System – Design, Implementation and Validation
Real Time Speaker Identification System – Design, Implementation and ValidationReal Time Speaker Identification System – Design, Implementation and Validation
Real Time Speaker Identification System – Design, Implementation and ValidationIDES Editor
 
Speaker identification under noisy conditions using hybrid convolutional neur...
Speaker identification under noisy conditions using hybrid convolutional neur...Speaker identification under noisy conditions using hybrid convolutional neur...
Speaker identification under noisy conditions using hybrid convolutional neur...IAESIJAI
 
Using AI to recognise person
Using AI to recognise personUsing AI to recognise person
Using AI to recognise personSolutionsPortal
 
An overview of speaker recognition by Bhusan Chettri.pdf
An overview of speaker recognition by Bhusan Chettri.pdfAn overview of speaker recognition by Bhusan Chettri.pdf
An overview of speaker recognition by Bhusan Chettri.pdfBhusan Chettri
 
Speech Recognized Automation System Using Speaker Identification through Wire...
Speech Recognized Automation System Using Speaker Identification through Wire...Speech Recognized Automation System Using Speaker Identification through Wire...
Speech Recognized Automation System Using Speaker Identification through Wire...IOSR Journals
 
Automatic Speaker Recognition and AI.pdf
Automatic Speaker Recognition and AI.pdfAutomatic Speaker Recognition and AI.pdf
Automatic Speaker Recognition and AI.pdfBhusan Chettri
 
Enhancing speaker verification accuracy with deep ensemble learning and inclu...
Enhancing speaker verification accuracy with deep ensemble learning and inclu...Enhancing speaker verification accuracy with deep ensemble learning and inclu...
Enhancing speaker verification accuracy with deep ensemble learning and inclu...IJECEIAES
 
A Review On Speech Feature Techniques And Classification Techniques
A Review On Speech Feature Techniques And Classification TechniquesA Review On Speech Feature Techniques And Classification Techniques
A Review On Speech Feature Techniques And Classification TechniquesNicole Heredia
 
A Robust Speaker Identification System
A Robust Speaker Identification SystemA Robust Speaker Identification System
A Robust Speaker Identification Systemijtsrd
 
A novel automatic voice recognition system based on text-independent in a noi...
A novel automatic voice recognition system based on text-independent in a noi...A novel automatic voice recognition system based on text-independent in a noi...
A novel automatic voice recognition system based on text-independent in a noi...IJECEIAES
 
E0502 01 2327
E0502 01 2327E0502 01 2327
E0502 01 2327IJMER
 
Intelligent Arabic letters speech recognition system based on mel frequency c...
Intelligent Arabic letters speech recognition system based on mel frequency c...Intelligent Arabic letters speech recognition system based on mel frequency c...
Intelligent Arabic letters speech recognition system based on mel frequency c...IJECEIAES
 
Effect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal AlignmentsEffect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal Alignmentskevig
 
Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...
Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...
Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...IDES Editor
 
Speech emotion recognition with light gradient boosting decision trees machine
Speech emotion recognition with light gradient boosting decision trees machineSpeech emotion recognition with light gradient boosting decision trees machine
Speech emotion recognition with light gradient boosting decision trees machineIJECEIAES
 

Similar a AN EFFICIENT SPEECH RECOGNITION SYSTEM (20)

Comparative Study of Different Techniques in Speaker Recognition: Review
Comparative Study of Different Techniques in Speaker Recognition: ReviewComparative Study of Different Techniques in Speaker Recognition: Review
Comparative Study of Different Techniques in Speaker Recognition: Review
 
A comparison of different support vector machine kernels for artificial speec...
A comparison of different support vector machine kernels for artificial speec...A comparison of different support vector machine kernels for artificial speec...
A comparison of different support vector machine kernels for artificial speec...
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
 
Real Time Speaker Identification System – Design, Implementation and Validation
Real Time Speaker Identification System – Design, Implementation and ValidationReal Time Speaker Identification System – Design, Implementation and Validation
Real Time Speaker Identification System – Design, Implementation and Validation
 
De4201715719
De4201715719De4201715719
De4201715719
 
Speaker identification under noisy conditions using hybrid convolutional neur...
Speaker identification under noisy conditions using hybrid convolutional neur...Speaker identification under noisy conditions using hybrid convolutional neur...
Speaker identification under noisy conditions using hybrid convolutional neur...
 
Using AI to recognise person
Using AI to recognise personUsing AI to recognise person
Using AI to recognise person
 
An overview of speaker recognition by Bhusan Chettri.pdf
An overview of speaker recognition by Bhusan Chettri.pdfAn overview of speaker recognition by Bhusan Chettri.pdf
An overview of speaker recognition by Bhusan Chettri.pdf
 
Speech Recognized Automation System Using Speaker Identification through Wire...
Speech Recognized Automation System Using Speaker Identification through Wire...Speech Recognized Automation System Using Speaker Identification through Wire...
Speech Recognized Automation System Using Speaker Identification through Wire...
 
Automatic Speaker Recognition and AI.pdf
Automatic Speaker Recognition and AI.pdfAutomatic Speaker Recognition and AI.pdf
Automatic Speaker Recognition and AI.pdf
 
Enhancing speaker verification accuracy with deep ensemble learning and inclu...
Enhancing speaker verification accuracy with deep ensemble learning and inclu...Enhancing speaker verification accuracy with deep ensemble learning and inclu...
Enhancing speaker verification accuracy with deep ensemble learning and inclu...
 
A Review On Speech Feature Techniques And Classification Techniques
A Review On Speech Feature Techniques And Classification TechniquesA Review On Speech Feature Techniques And Classification Techniques
A Review On Speech Feature Techniques And Classification Techniques
 
A Robust Speaker Identification System
A Robust Speaker Identification SystemA Robust Speaker Identification System
A Robust Speaker Identification System
 
A novel automatic voice recognition system based on text-independent in a noi...
A novel automatic voice recognition system based on text-independent in a noi...A novel automatic voice recognition system based on text-independent in a noi...
A novel automatic voice recognition system based on text-independent in a noi...
 
E0502 01 2327
E0502 01 2327E0502 01 2327
E0502 01 2327
 
Intelligent Arabic letters speech recognition system based on mel frequency c...
Intelligent Arabic letters speech recognition system based on mel frequency c...Intelligent Arabic letters speech recognition system based on mel frequency c...
Intelligent Arabic letters speech recognition system based on mel frequency c...
 
Effect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal AlignmentsEffect of MFCC Based Features for Speech Signal Alignments
Effect of MFCC Based Features for Speech Signal Alignments
 
Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...
Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...
Effect of Time Derivatives of MFCC Features on HMM Based Speech Recognition S...
 
Speaker Recognition Using Vocal Tract Features
Speaker Recognition Using Vocal Tract FeaturesSpeaker Recognition Using Vocal Tract Features
Speaker Recognition Using Vocal Tract Features
 
Speech emotion recognition with light gradient boosting decision trees machine
Speech emotion recognition with light gradient boosting decision trees machineSpeech emotion recognition with light gradient boosting decision trees machine
Speech emotion recognition with light gradient boosting decision trees machine
 

Último

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 

Último (20)

Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 

AN EFFICIENT SPEECH RECOGNITION SYSTEM

  • 1. Computer Science & Engineering: An International Journal (CSEIJ), Vol. 3, No. 4, August 2013 DOI : 10.5121/cseij.2013.3403 21 AN EFFICIENT SPEECH RECOGNITION SYSTEM Suma Swamy1 and K.V Ramakrishnan 1 Department of Electronics and Communication Engineering, Research Scholar, Anna University, Chennai suma_swamy@yahoo.com, ramradhain@yahoo.com ABSTRACT This paper describes the development of an efficient speech recognition system using different techniques such as Mel Frequency Cepstrum Coefficients (MFCC), Vector Quantization (VQ) and Hidden Markov Model (HMM). This paper explains how speaker recognition followed by speech recognition is used to recognize the speech faster, efficiently and accurately. MFCC is used to extract the characteristics from the input speech signal with respect to a particular word uttered by a particular speaker. Then HMM is used on Quantized feature vectors to identify the word by evaluating the maximum log likelihood values for the spoken word. KEYWORDS MFCC, VQ, HMM, log likelihood, DISTMIN. 1. INTRODUCTION The idea of human machine interaction led to research in Speech recognition. Automatic speech recognition uses the process and related technology for converting speech signals into a sequence of words or other linguistic units by means of an algorithm implemented as a computer program. Speech understanding systems presently are capable of understanding speech input for vocabularies of thousands of words in operational environments. Speech signal conveys two important types of information: (a) speech content and (b) The speaker identity. Speech recognisers aim to extract the lexical information from the speech signal independently of the speaker by reducing the inter-speaker variability. Speaker recognition is concerned with extracting the identity of the person. [3] Speaker identification allows the use of uttered speech to verify the speaker’s identity and control access to secure services. Speech Recognition offers greater freedom to employ the physically handicapped in several applications like manufacturing processes, medicine and telephone network. Figure 1(a) shows the speech recognition system without speaker identification. Figure 1(b) shows how the speaker identification followed by speech recognition improves the efficiency. With this approach, the database will be divided into smaller divisions (SP1 to SPn) with respect to different speakers. Hence the speech recognition rate improves for the corresponding speaker.
  • 2. Computer Science & Engineering: An International Journal (CSEIJ), Vol. 3, No. 4, August 2013 22 Figure.1 (a) Speech recognition system without speaker identification Figure.1 (b) Speaker identification followed by speech recognition This paper focuses on the implementation of speaker identification and enhancement of speech recognition using Hidden Markov Model (HMM) techniques. [1], [4] 2. HISTORY OF SPEECH RECOGNITION Speech Recognition research has been ongoing for more than 80 years. Over that period there have been at least 4 generations of approaches, and a 5th generation is being formulated based on current research themes. To cover the complete history of speech recognition is beyond the scope of this paper. By 2001, computer speech recognition had reached 80% accuracy and no further progress was reported till 2010. Speech recognition technology development began to edge back into the Feature Extraction (MFCC) Codebook Generation (VQLBG) Database Speech Recognition (HMM) Recognized Speech Input Speech Feature Extraction (MFCC) Codebook Generation (VQLBG) Feature Matching (DISTMIN) Speech Recognition (HMM) Recognized Speech Identified/ Recognized Speaker Input Speech SP1 SP2 SPn Matching (DCoFeature Extraction (MFCC) ISTMIN) n Database
  • 3. Computer Science & Engineering: An International Journal (CSEIJ), Vol. 3, No. 4, August 2013 23 forefront with one major event: the arrival of the “Google Voice Search app for the iPhone”. In 2010, Google added “personalized recognition” to Voice Search on Android phones, so that the software could record users’ voice searches and produce a more accurate speech model. The company also added Voice Search to its Chrome Browser in mid-2011. Like Google’s Voice Search, Siri relies on cloud-based processing. It draws on its knowledge about the speaker to generate a contextual reply and responds to voice input. [2] Parallel processing methods using combinations of HMMs and acoustic- phonetic approaches to detect and correct linguistic irregularities are used to increase recognition decision reliability and increase robustness for recognition of speech in noisy environment. 3. PROPOSED MODEL The structure of proposed system consists of two modules as shown in figure 1(b). • Speaker Identification • Speech Recognition 3.1 Speaker Identification Feature extraction is a process that extracts data from the voice signal that is unique for each speaker. Mel Frequency Cepstral Coefficient (MFCC) technique is often used to create the fingerprint of the sound files. The MFCC are based on the known variation of the human ear’s critical bandwidth frequencies with filters spaced linearly at low frequencies and logarithmically at high frequencies used to capture the important characteristics of speech. [6], [7], [8] These extracted features are Vector quantized using Vector Quantization algorithm. Vector Quantization (VQ) is used for feature extraction in both the training and testing phases. It is an extremely efficient representation of spectral information in the speech signal by mapping the vectors from large vector space to a finite number of regions in the space called clusters. [6], [8] After feature extraction, feature matching involves the actual procedure to identify the unknown speaker by comparing extracted features with the database using the DISTMIN algorithm. 3.2 Speech Recognition System Hidden Markov Processes are the statistical models in which one tries to characterize the statistical properties of the signal with the underlying assumption that a signal can be characterized as a random parametric signal of which the parameters can be estimated in a precise and well-defined manner. In order to implement an isolated word recognition system using HMM, the following steps must be taken (1) For each uttered word, a Markov model must be built using parameters that optimize the observations of the word. (2) Maximum likelihood model is calculated for the uttered word. [5], [9], [10], [11] 4. IMPLEMENTATION The major modules used are as follows: MFCC (Mel-scaled Frequency Cepstral Coefficients)
  • 4. Computer Science & Engineering: An International Journal (CSEIJ), Vol. 3, No. 4, August 2013 24 • Mel-spaced Filter Bank VQ (Vector Quantization) HMM (Hidden Markov Model) • Discrete-HMM Observation matrix • Forward-Backward Algorithm Figure. 2 new speakers sound is added to database by entering name and recording time Figure 2 shows the screenshot of how new speaker is added into the database. Figure 3 shows the screenshot of how the speech is recognized. Figure. 3 Screen shot shows the demonstration of speech recognition system
  • 5. Computer Science & Engineering: An International Journal (CSEIJ), Vol. 3, No. 4, August 2013 25 5. EXPERIMENTATION RESULTS In the speaker identification phase, four different (2 male and 2 female) speakers are asked to speak the same word ten times from the given list of words. The speakers are then asked to utter the same words in a random order and the recognition results noted. The percentage recognition of a speaker for these words is given in the table 1 and efficiency chart is shown in figure 5 for the same. The overall efficiency of speaker identification system is 95%. In speech recognition phase, the experiment is repeated ten times for each of the above words. The resulting efficiency percentage and its corresponding efficiency chart are shown in table 2 and figure 6 respectively. The overall efficiency of a speech recognition system obtained is 98%. Table 1. Speaker identification results Words Female Speaker 1 Female Speaker 2 Male Speaker 3 Male Speaker 4 Computer 90% 100% 100% 90% Read 100% 100% 100% 100% Mobile 90% 100% 90% 90% Man 100% 70% 100% 100% Robo 80% 100% 100% 100% Average % 92% 94% 98% 96% Figure. 4 Efficiency chart for Speaker Identification System Table 2. Speech Recognition Results Words Recognition % Computer 99% Read 100% Mobile 96% Man 100% Robo 95% Average % 98%
  • 6. Computer Science & Engineering: An International Journal (CSEIJ), Vol. 3, No. 4, August 2013 26 Figure. 5. Efficiency Chart for Speech Recognition System 6. CONCLUSION In the speaker identification phase, MFCC and Distance Minimum techniques have been used. These two techniques provided more efficient speaker identification system. The speech recognition phase uses the most efficient HMM Algorithm. It is found that Speaker recognition module improves the efficiency of speech recognition scores. The coding of all the techniques mentioned above has been done using MATLAB. It has been found that the combination of MFCC and Distance Minimum algorithm gives the best performance and also accurate results in most of the cases with an overall efficiency of 95%. The study also reveals that the HMM algorithm is able to identify the most commonly used isolated word. As a result of this, speech recognition system achieves 98% efficiency. ACKNOWLEDGEMENTS We acknowledge Visvesvaraya Technological University, Belgaum and Anna University, Chennai for the encouragement and permission to publish this paper. We would like to thank the Principal of Sir MVIT, Dr. M.S.Indira for her support. Our special thanks to Prof. Dilip.K.Sen, HOD of CSE for his valuable suggestions from time to time. REFERENCES [1] Ronald M. Baecker, “Readings in human-computer interaction: toward the year 2000”, 1995. [2.] Melanie Pinola, “Speech Recognition Through the Decades: How We Ended Up With Siri”, PCWorld. [3] Ganesh Tiwari, “Text Prompted Remote Speaker Authentication : Joint Speech and Speaker Recognition/Verification System”. [4] Dr.Ravi Sankar, Tanmoy Islam, Srikanth Mangayyagari, “Robust Speech/Speaker Recognition Systems”. [5] Bassam A.Q.Al-Qatab and Raja.N.Aninon, “Arabic Speech Recognition using Hidden Markov Model ToolKit (HTK)”, IEEE Information Technology (ITSim), 2010,page 557-562. [6] Ahsanul Kabir, Sheikh Mohammad Masudul Ahsan,”.Vector Quantization in Text Dependent Automatic Speaker Recognition using Mel-Frequency Cepstrum Coefficient”, 6th WSEAS
  • 7. Computer Science & Engineering: An International Journal (CSEIJ), Vol. 3, No. 4, August 2013 27 International Conference on circuits, systems, electronics, control & signal processing, Cairo,Egypt, dec 29-31, 2007,page 352-355 [7] Lindasalwa Muda, Mumtaj Begam and Elamvazuthi.,”Voice Recognition Algorithms using Mel Frequency Cepstral Coefficient (MFCC) and DTW Techniques “,Journal of Computing, Volume 2, Issue 3, March 2010 [8] Mahdi Shaneh and Azizollah Taheri ,”Voice Command Recognition System based on MFCC and VQ Algorithms”, World Academy of Science, Engineering and Technology Journal , 2009 [9] Remzi Serdar Kurcan, “Isolated word recognition from in-ear microphone data using hidden markov models (hmm)”, Master’s Thesis, 2006. [10] Nikolai Shokhirev ,”Hidden Markov Models “, 2010. [11] L.R. Rabiner, “A tutorial on Hidden Markov Models and selected applications in Speech Recognition”, Proceedings of the IEEE Journal, Feb 1989, Vol 77, Issue: 2. [12] Suma Swamy, Manasa S, Mani Sharma, Nithya A.S, Roopa K.S and K.V Ramakrishnan, “An Improved Speech Recognition System”, LNICST Springer Journal, 2013. AUTHORS 1. Suma Swamy obtained her B.E (Electronics Engineering) in 1990 from Shivaji University, Kolhapur, Maharashtra, and M.Tech (Electronics and Communication Engineering) in 2005 from Visvesvaraya Technological University, Belgaum, Karnataka. She is working as Associate Professor, Department of CSE, Sir M. Visvesvaraya Institute of Technology, Bengaluru, India. She is Research Scholar in the department of ECE, Anna University, Chennai, India. Her areas of interest are Speech Recognition, Database Management Systems and Design of Algorithms. 2. Dr. K.V Ramakrishnan obtained his M.Sc. (Electronics) from Poona University in 1961 and Ph.D (Electronics) from Toulouse (France) in 1972. He worked as Scientist in CEERI from 1962-1999 at different places. He was a Consultant for M/s. Servo Electronics, Delhi in 1999. He was a Director for Research and Development, HOD (ECE/TE/MCA) at Sir M. Visvesvaraya Institute of Technology, Bengaluru, India from 1999 -2002. He was a HOD (ECE) at New Horizon College of Engineering, Bangalore from 2002-03. He was a HOD(CSE/ISE) at Sir M. Visvesvaraya Institute of Technology, Bengaluru, India from 2003- 2006. He was a Dean and Professor (E CE) at CMR Institute of Technology Bangalore from 2006-2009. He also officiated as principal during 2007. He is a Supervisor, Anna University Chennai, India. His area of research is Speech Processing and Embedded Systems.