International Journal of Research in Computer Science
eISSN 2249-8265 Volume 3 Issue 5 (2013) pp. 13-17
www.ijorcs.org, A Unit of White Globe Publications
doi: 10.7815/ijorcs.35.2013.070
VOICE RECOGNITION SYSTEM USING TEMPLATE
MATCHING
Luqman Gbadamosi
Computer Science Department, Lagos State Polytechnic, Lagos, Nigeria
Email: luqmangbadamosi@yahoo.com
Abstract: It is easy for humans to recognize a familiar
voice, but using computer programs to identify a voice
by comparison with others is a difficult task. The
difficulty stems from the problems encountered when
developing an algorithm to recognize the human voice:
it is practically impossible to say a word the same way
on two different occasions, and computer analysis of
human speech yields different interpretations depending
on the varying speed of delivery. This paper gives a
detailed description of the implementation of an
effective voice recognition algorithm. The algorithm
uses the discrete Fourier transform to compare the
frequency spectra of two voice samples, because the
spectrum remains largely unchanged when the speech is
slightly varied. Chebyshev's inequality is then used to
decide whether the two voices came from the same
person. The algorithm is implemented and tested in
MATLAB.
Keywords: Chebyshev's inequality, discrete Fourier
transform, frequency spectra, voice recognition.
I. INTRODUCTION
Voice recognition, or voice authentication, is an
automated method of identifying the person who is
speaking from the characteristics of their voice
biometrics. Voice is one of many forms of biometrics
used to identify an individual and verify their identity.
Humans can naturally recognize a familiar voice, but
getting a computer to do the same is a more difficult
task, because it is impossible to say a word exactly the
same way on two different occasions. Advances in
computing capability have led to more effective ways
of recognizing the human voice using feature
extraction. Voice recognition is a highly effective
biometric technique that can be used for telephone
banking and for forensic investigation by law
enforcement agencies. [9][10]
A. What is the Human Voice?
The voice consists of the sounds made by human
beings using the vocal folds for talking, singing,
laughing, crying, screaming, and so on. The human
voice is specifically that part of human sound
production in which the vocal folds are the primary
sound source. The mechanism for generating the human
voice can be subdivided into three parts: the lungs, the
vocal folds within the larynx, and the articulators. [11]
Figure 1: The spectrogram of the human voice reveals
its rich harmonic content.
B. What is Voice Recognition?
Voice recognition (sometimes referred to as speaker
recognition) is the identification of the person who is
speaking by extracting features of their voice when a
questioned voice print is compared against a known
voice print. In this technology, sounds, words, or
phrases spoken by humans are converted into electrical
signals, and these signals are transformed into coding
patterns to which meaning has been assigned. There are
two major applications of voice recognition
technologies and methodologies. The first is voice
verification, or authentication, in which the speaker
claims a certain identity and the voice is used to verify
this claim. The second is voice identification, which is
the task of determining an unknown speaker's identity.
Put another way, voice verification is one-to-one
matching, where one speaker's voice is matched against
a single template or voice print, whereas voice
identification is one-to-many matching, where the
speaker's voice is compared against many voice
templates.
Speaker recognition system has two phases:
Enrollment and Verification. During enrollment, the
speaker’s voice is recorded and typically a number of
features are extracted to form a voice print or
template. In the verification phase, a speech sample or
“utterance” is compared against a previously created
voice print. For identification systems, the utterance is
compared against multiple voice prints in order to
determine the best match while verification systems
compare an utterance against a single voice print.
Voice Recognition Systems can also be categorized
into two: text independent and text dependent. [9]
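The one-to-one versus one-to-many distinction above can be sketched in a few lines. The paper's implementation is in MATLAB; the following is an illustrative Python/NumPy sketch, with all function and variable names my own, using the normalized-spectrum distance described later in the paper.

```python
import numpy as np

def match_score(utterance, template):
    """Euclidean distance between two unit-normalized feature vectors."""
    u = utterance / np.linalg.norm(utterance)
    t = template / np.linalg.norm(template)
    return np.linalg.norm(u - t)

def verify(utterance, claimed_template, threshold):
    """One-to-one: accept the identity claim if the score is under the threshold."""
    return match_score(utterance, claimed_template) < threshold

def identify(utterance, templates):
    """One-to-many: return the enrolled speaker whose template is closest."""
    return min(templates, key=lambda name: match_score(utterance, templates[name]))
```

Verification makes a single accept/reject decision against one template; identification ranks every enrolled template and returns the best match.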
Text-Dependent: The text must be the same for
enrollment and verification. Shared-secret passwords
and PINs, or other knowledge-based information, can be
employed to create a multi-factor authentication
scenario.
Text-Independent: Text-independent systems are most
often used for speaker identification, as they require
very little cooperation from the speaker. In this case the
text used during enrollment differs from the text used
during verification. In fact, enrollment may happen
without the user's knowledge, as is the case in many
forensic applications. [9]
C. Voice Recognition Techniques
The most common approaches to voice recognition
can be divided into two classes: Template Matching
and Feature Analysis.
Template Matching: Template matching is the simplest
technique and has the highest accuracy when used
properly, but it also suffers from the most limitations.
As with any approach to voice recognition, the first step
is for the user to speak a word or phrase into a
microphone. The electrical signal from the microphone
is digitized by an analog-to-digital (A/D) converter and
stored in memory. To determine the "meaning" of this
voice input, the computer attempts to match the input
with a digitized voice sample, or template, that has a
known meaning. This technique is a close analogy to
traditional command input from a keyboard: the
program contains the input template and attempts to
match it against the actual input using a simple
conditional test. This type of system is known as
"speaker dependent", and its recognition accuracy can
reach about 98 percent.
Feature Analysis: A more general form of voice
recognition is available through feature analysis, and
this technique usually leads to "speaker-independent"
voice recognition. Instead of trying to find an exact or
near-exact match between the actual voice input and a
previously stored voice template, this method first
processes the voice input using Fourier transforms or
linear predictive coding (LPC), then attempts to find
characteristic similarities between the expected inputs
and the actual digitized voice input. These similarities
will be present for a wide range of speakers, so the
system need not be trained by each new user. The types
of speech differences that the speaker-independent
method can handle, but which pattern matching would
fail to handle, include accents and varying speed of
delivery, pitch, volume, and inflection. Speaker-
independent speech recognition has proven very
difficult, with some of the greatest hurdles being the
variety of accents and inflections used by speakers of
different nationalities. Recognition accuracy for
speaker-independent systems is somewhat lower than
for speaker-dependent systems, usually between 90 and
95 percent. [12]
I have implemented the template matching technique.
This approach has been intensively studied and is also
the backbone of most voice recognition products on the
market.
II. IMPLEMENTATION
A. Design Description
The voice recognition system using the template
matching technique first requires the user to create a
template for comparison by recording 10 samples of
the speaker's voice saying a given phrase; these form
the known voice. Thereafter, the questioned speaker's
voice is recorded and analyzed using the discrete
Fourier transform.
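As a concrete sketch of this enrollment step, assuming each sample is a 1-D array of audio samples (the paper works in MATLAB; this Python/NumPy version and its names are illustrative):

```python
import numpy as np

def build_template(recordings, n_bins=600):
    """Average the unit-normalized low-frequency spectra of several
    recordings of the same phrase to form the known-voice template.
    n_bins=600 follows the paper's 600-dimensional representation."""
    spectra = []
    for x in recordings:
        mag = np.abs(np.fft.rfft(x))[:n_bins]       # magnitude spectrum, low bins only
        spectra.append(mag / np.linalg.norm(mag))   # normalize each to unit length
    return np.mean(spectra, axis=0)                 # element-wise average template
```

Averaging the 10 normalized spectra smooths out the sample-to-sample variation that any single recording would carry.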
Discrete Fourier Transform: Voice recognition in the
time domain would be extremely impractical, for the
reasons explained above. Instead, analysis of the
frequency spectrum of a voice, which remains
predominantly unchanged as the speech is slightly
varied, turns out to be a more viable option. Converting
all recordings into the frequency domain using the
discrete Fourier transform greatly simplifies the
comparison of two recordings. [3][6]
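The claim that the frequency spectrum stays put while the waveform changes can be checked numerically. A hedged illustration, with a pure tone standing in for a voiced sound and NumPy in place of the paper's MATLAB:

```python
import numpy as np

# A small time shift changes the waveform substantially but leaves the
# DFT magnitude spectrum essentially unchanged (exactly, for a circular shift).
t = np.linspace(0, 1, 8000, endpoint=False)
x = np.sin(2 * np.pi * 440 * t)      # 440 Hz tone as a stand-in for a voiced sound
x_shifted = np.roll(x, 37)           # the same sound, slightly delayed

time_domain_diff = np.linalg.norm(x - x_shifted)
spec_diff = np.linalg.norm(np.abs(np.fft.rfft(x)) - np.abs(np.fft.rfft(x_shifted)))
```

Here `time_domain_diff` is large while `spec_diff` is at floating-point level, which is exactly why the comparison is done on spectra rather than waveforms.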
Finding the Norm: Due to the nature of human speech,
all data pertaining to frequencies above 600 Hz can
safely be discarded. Therefore, once a recording is
converted into the frequency domain, it can simply be
regarded as a vector in 600-dimensional Euclidean
space. At this point, two vectors can easily be compared
by normalizing them (giving each length 1) and then
computing the norm of their difference (the difference
between two vectors in R600 is computed component-
wise). Exactly which norm to use, however, is not
immediately clear. After carefully comparing and
contrasting the Taxicab, Euclidean, and Maximum
norms [13], it became clear that the Euclidean norm
most accurately measured the closeness between
different frequency spectra. Once the norm function was
chosen, all that remained was to decide exactly how
small the norm of the difference of two vectors had to
be in order to conclude that both recordings originated
from the same person.
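The three candidate norms can be compared directly. A sketch, again in Python/NumPy rather than the paper's MATLAB, with names of my own:

```python
import numpy as np

def spectral_distance(spec_a, spec_b, which="euclidean"):
    """Distance between two spectra after normalizing each to length 1,
    under each of the three candidate norms discussed in the text."""
    a = spec_a / np.linalg.norm(spec_a)
    b = spec_b / np.linalg.norm(spec_b)
    d = a - b                           # component-wise difference in R^600
    if which == "taxicab":
        return np.sum(np.abs(d))        # L1 norm
    if which == "euclidean":
        return np.sqrt(np.sum(d * d))   # L2 norm (the one the paper adopts)
    if which == "maximum":
        return np.max(np.abs(d))        # L-infinity norm
    raise ValueError(f"unknown norm: {which}")
```

For any vector the maximum norm is at most the Euclidean norm, which is at most the taxicab norm, so the choice of norm also fixes the scale of the acceptance threshold.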
Chebyshev's Inequality: Chebyshev's inequality says
that at least 1 - 1/K^2 of the data in a sample must fall
within K standard deviations of the mean, where K is
any real number greater than one. To illustrate the
inequality, consider a few values of K:
- For K = 2, 1 - 1/K^2 = 1 - 1/4 = 3/4 = 75%. So
  Chebyshev's inequality says that at least 75% of the
  data values of any distribution must be within two
  standard deviations of the mean.
- For K = 3, 1 - 1/K^2 = 1 - 1/9 = 8/9 ≈ 89%. So at
  least 89% of the data values of any distribution must
  be within three standard deviations of the mean.
- For K = 4, 1 - 1/K^2 = 1 - 1/16 = 15/16 = 93.75%.
  So at least 93.75% of the data values of any
  distribution must be within four standard deviations
  of the mean. [13]
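The bound is distribution-free, which is easy to confirm numerically. A small check on a deliberately skewed sample (illustrative, not from the paper):

```python
import numpy as np

# Chebyshev's inequality holds for any distribution: at least 1 - 1/K^2 of
# the sample lies within K standard deviations of the sample mean.
rng = np.random.default_rng(0)
data = rng.exponential(size=10_000)   # deliberately skewed, non-normal data
mu, sigma = data.mean(), data.std()

fractions = {K: np.mean(np.abs(data - mu) <= K * sigma) for K in (2, 3, 4)}
```

For this skewed sample the observed fractions comfortably exceed the 75%, 89%, and 93.75% floors, as the inequality guarantees.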
Template Matching: The analysis above shows, in
particular, that Chebyshev's inequality guarantees that
at least 3/4 of all measurements from the same
population fall within two standard deviations of the
mean. Hence, in response to the problem posed at the
end of the previous paragraph, the following solution
can be formulated: by requiring that the norm of the
difference fall within 2 standard deviations of the
average voice, I have ensured that at least 75% of the
time the algorithm will recognize a voice correctly.
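The resulting decision rule can be sketched under the same assumptions as before (unit-normalized 600-bin spectra; Python/NumPy and all names are mine, the paper's code is MATLAB):

```python
import numpy as np

def enroll(spectra):
    """From the enrollment spectra (each unit-normalized), form the average
    template and a 2-standard-deviation acceptance threshold, per the
    Chebyshev argument: at least 75% of genuine utterances should pass."""
    template = np.mean(spectra, axis=0)
    dists = [np.linalg.norm(s - template) for s in spectra]
    return template, np.mean(dists) + 2.0 * np.std(dists)

def same_speaker(questioned_spectrum, template, threshold):
    """Accept iff the questioned spectrum is within the 2-sigma threshold."""
    q = questioned_spectrum / np.linalg.norm(questioned_spectrum)
    return np.linalg.norm(q - template) <= threshold
```

The threshold is calibrated from the spread of the speaker's own enrollment samples, so a genuine utterance is accepted with the guaranteed 75% rate while a very different voice falls far outside it.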
Figure 2: Detailed Design Description
III. RESULTS
By the analysis above, the voice recognition technique
adopted is expected to recognize an enrolled speaker's
voice at least 75% of the time.
Figure 3: Graph showing the normalized frequency
spectrum of the recorded questioned voice sample
Figure 4: Graph showing the normalized frequency
spectrum of the average template voice sample
A. Performance Evaluation Index
A well-accepted index for determining the recognition
rate of a voice recognition system is an endpoint
detection algorithm using zero crossing rates (ZCR)
and variable frame rates (VFR). The technique uses a
clean enrollment of the speech signal: the signal is
recorded for 2 seconds and the test speech is polluted
by additive noise at different decibel levels. The
performance of the four endpoint algorithms is plotted
in the figures below. Three varieties of additive noise
(factory noise, babble noise, and F-16 noise) were used
in testing. Tables 1-3 show the accuracy rates. The
additive noise was applied at SNR levels of 20 dB,
15 dB, 10 dB, 5 dB, and 0 dB. [15]
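Mixing noise at a prescribed SNR, as in the evaluation above, amounts to scaling the noise against the speech power. A hedged sketch (the paper does not give its mixing code; this is the standard construction, in Python/NumPy):

```python
import numpy as np

def add_noise_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech power over noise power equals the
    requested SNR in dB, then mix. Assumes equal-length 1-D arrays."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + scale * noise
```

The same speech signal mixed at 20 dB down to 0 dB yields progressively noisier test utterances for the accuracy tables.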
Figure 2 pipeline: Step 1, Voice Sample Recording;
Step 2, Voice Feature Extraction; Step 3, Discrete
Fourier Transform; Step 4, Euclidean Norm; Step 5,
Template Matching.
Figure 5: Factory Noise
Figure 6: Babble Noise
Figure 7: F-16 Noise
Table 1: Endpoint Detection (Factory Noise)
      Clean  20dB  15dB  10dB  5dB   0dB
VFR   98.0   98.6  96.0  84.3  62.0  26.6

Table 2: Endpoint Detection (Babble Noise)
      Clean  20dB  15dB  10dB  5dB   0dB
VFR   98.7   98.6  97.0  83.3  65.0  30.0

Table 3: Endpoint Detection (F-16 Noise)
      Clean  20dB  15dB  10dB  5dB   0dB
VFR   98.7   98.6  97.0  82.0  69.6  14.0
The experimental results above were derived from
speech data collected from a speaker using the
different voice recognition algorithms. Clean speech
was achieved when the effects of background noise
and channel distortion were minimized.
The experimental results from the comparative
analysis of the different algorithms at different noise
levels reveal that inaccurate endpoint detection causes
misclassification more than any other source of error.
The accuracy of endpoint detection is much higher for
algorithms that integrate both the time domain and the
frequency domain. This supports the conclusion that a
voice recognition system using template matching
remains among the best approaches for recognizing an
unknown voice.
IV. CONCLUSION
This research implementation is an effort to
understand how voice recognition is used as one of the
best forms of biometrics for recognizing the identity of
a human being. It briefly describes all the stages, from
voice recording and voice feature extraction through
the discrete Fourier transform to template matching,
which generates a good matching score. Various
standard techniques are used at the intermediate stages
of processing.
Low verification rates arise from the difficulty of
developing an algorithm to recognize the human voice,
since different data are obtained for voice samples
recorded on different occasions. New techniques and
highly effective algorithms have been discovered that
give better results.
A further major challenge is the technique's inability
to recognize a word or phrase other than the one stored
in the database during enrollment. The technique
adopted recognizes the human voice only 70% of the
time. It is highly recommended that future research
work focus on achieving up to a 95% recognition rate
and on recognizing different words and phrases.
V. REFERENCES
[1] Kinnunen, Tomi; Li, Haizhou. "An overview of text-independent speaker recognition: From features to supervectors". Speech Communication 52 (1): 12-40. doi: 10.1016/j.specom.2009.08.009
[2] Homayoon Beigi, "Speaker Recognition", Biometrics / Book 1, Jucheng Yang (ed.), Intech Open Access Publisher, 2011, pp. 3-28, ISBN 978-953-307-618-8. doi: 10.1007/978-0-387-77592-0
[3] Duhamel, P. and M. Vetterli, "Fast Fourier Transforms: A Tutorial Review and a State of the Art", Signal Processing, Vol. 19, April 1990, pp. 259-299. doi: 10.1016/0165-1684(90)90158-U
[4] Oppenheim, A. V. and R. W. Schafer, Discrete-Time Signal Processing, Prentice-Hall, 1989, p. 611.
[5] Oppenheim, A. V. and R. W. Schafer, Discrete-Time Signal Processing, Prentice-Hall, 1989, p. 619.
[6] Rader, C. M., "Discrete Fourier Transforms when the Number of Data Samples Is Prime", Proceedings of the IEEE, Vol. 56, June 1968, pp. 1107-1108. doi: 10.1109/PROC.1968.6477
[7] Oppenheim, A. V. and R. W. Schafer, Discrete-Time Signal Processing, Englewood Cliffs, NJ: Prentice-Hall, 1989, pp. 311-312.
[8] ITU-T Recommendation G.711, "Pulse Code Modulation (PCM) of Voice Frequencies", General Aspects of Digital Transmission Systems; Terminal Equipments, International Telecommunication Union (ITU), 1993.
[9] Beigi, Homayoon (2011). "Fundamentals of Speaker Recognition". [Online]. Available: http://www.wikipedia.org/wiki/speaker_recognition
[10] Course project (Fall 2009), "Voice Recognition Using MATLAB", California State University Northridge. [Online]. Available: http://www.cnx.org/content/m33347/1.3/module_export?format=zip
[11] "Article on Human Voice". [Online]. Available: http://www.wikipedia.org/wiki/Human voice
[12] "Techniques of Voice Recognition System". [Online]. Available: http://www.hitl.washington.edu/scllw/EVE/I.D.2.d.VoiceRecognition.htm
[13] "Probability Tutorials on Chebyshevs-Inequality". [Online]. Available: http://www.statistics.about.com/od/probHelpandTutorials/a/Chebyshevs-Inequality.htm
[14] Sangram Bana, Dr. Davinder Kaur, "Fingerprint Recognition System using Image Segmentation", International Journal of Advanced Engineering Sciences and Technologies, Vol. No. 5, Issue No. 1, pp. 012-023.
[15] Kapil Sharma, H.P. Sinha & R.K. Aggarwal, "Comparative study of speech Recognition System using various feature extraction techniques", International Journal of Information Technology and Knowledge Management, Vol. 3, No. 2, pp. 695-698.
How to cite
Luqman Gbadamosi, "Voice Recognition System using Template Matching". International Journal of Research in Computer Science, 3 (5): pp. 13-17, September 2013. doi: 10.7815/ijorcs.35.2013.070

Congestion Prediction and Adaptive Rate Adjustment Technique for Wireless Sen...Congestion Prediction and Adaptive Rate Adjustment Technique for Wireless Sen...
Congestion Prediction and Adaptive Rate Adjustment Technique for Wireless Sen...IJORCS
 
A Study of Routing Techniques in Intermittently Connected MANETs
A Study of Routing Techniques in Intermittently Connected MANETsA Study of Routing Techniques in Intermittently Connected MANETs
A Study of Routing Techniques in Intermittently Connected MANETsIJORCS
 
Improving the Efficiency of Spectral Subtraction Method by Combining it with ...
Improving the Efficiency of Spectral Subtraction Method by Combining it with ...Improving the Efficiency of Spectral Subtraction Method by Combining it with ...
Improving the Efficiency of Spectral Subtraction Method by Combining it with ...IJORCS
 
An Adaptive Load Sharing Algorithm for Heterogeneous Distributed System
An Adaptive Load Sharing Algorithm for Heterogeneous Distributed SystemAn Adaptive Load Sharing Algorithm for Heterogeneous Distributed System
An Adaptive Load Sharing Algorithm for Heterogeneous Distributed SystemIJORCS
 
The Design of Cognitive Social Simulation Framework using Statistical Methodo...
The Design of Cognitive Social Simulation Framework using Statistical Methodo...The Design of Cognitive Social Simulation Framework using Statistical Methodo...
The Design of Cognitive Social Simulation Framework using Statistical Methodo...IJORCS
 
An Enhanced Framework for Improving Spatio-Temporal Queries for Global Positi...
An Enhanced Framework for Improving Spatio-Temporal Queries for Global Positi...An Enhanced Framework for Improving Spatio-Temporal Queries for Global Positi...
An Enhanced Framework for Improving Spatio-Temporal Queries for Global Positi...IJORCS
 
A PSO-Based Subtractive Data Clustering Algorithm
A PSO-Based Subtractive Data Clustering AlgorithmA PSO-Based Subtractive Data Clustering Algorithm
A PSO-Based Subtractive Data Clustering AlgorithmIJORCS
 
Call for papers, IJORCS, Volume 3 - Issue 3
Call for papers, IJORCS, Volume 3 - Issue 3Call for papers, IJORCS, Volume 3 - Issue 3
Call for papers, IJORCS, Volume 3 - Issue 3IJORCS
 
Dynamic Map and Diffserv Based AR Selection for Handoff in HMIPv6 Networks
Dynamic Map and Diffserv Based AR Selection for Handoff in HMIPv6 Networks Dynamic Map and Diffserv Based AR Selection for Handoff in HMIPv6 Networks
Dynamic Map and Diffserv Based AR Selection for Handoff in HMIPv6 Networks IJORCS
 
From Physical to Virtual Wireless Sensor Networks using Cloud Computing
From Physical to Virtual Wireless Sensor Networks using Cloud Computing From Physical to Virtual Wireless Sensor Networks using Cloud Computing
From Physical to Virtual Wireless Sensor Networks using Cloud Computing IJORCS
 
Prediction of Atmospheric Pressure at Ground Level using Artificial Neural Ne...
Prediction of Atmospheric Pressure at Ground Level using Artificial Neural Ne...Prediction of Atmospheric Pressure at Ground Level using Artificial Neural Ne...
Prediction of Atmospheric Pressure at Ground Level using Artificial Neural Ne...IJORCS
 
Ant Colony with Colored Pheromones Routing for Multi Objectives Quality of Se...
Ant Colony with Colored Pheromones Routing for Multi Objectives Quality of Se...Ant Colony with Colored Pheromones Routing for Multi Objectives Quality of Se...
Ant Colony with Colored Pheromones Routing for Multi Objectives Quality of Se...IJORCS
 
Design a New Image Encryption using Fuzzy Integral Permutation with Coupled C...
Design a New Image Encryption using Fuzzy Integral Permutation with Coupled C...Design a New Image Encryption using Fuzzy Integral Permutation with Coupled C...
Design a New Image Encryption using Fuzzy Integral Permutation with Coupled C...IJORCS
 
Can “Feature” be used to Model the Changing Access Control Policies?
Can “Feature” be used to Model the Changing Access Control Policies? Can “Feature” be used to Model the Changing Access Control Policies?
Can “Feature” be used to Model the Changing Access Control Policies? IJORCS
 

Más de IJORCS (20)

Enhancement of DES Algorithm with Multi State Logic
Enhancement of DES Algorithm with Multi State LogicEnhancement of DES Algorithm with Multi State Logic
Enhancement of DES Algorithm with Multi State Logic
 
Hybrid Simulated Annealing and Nelder-Mead Algorithm for Solving Large-Scale ...
Hybrid Simulated Annealing and Nelder-Mead Algorithm for Solving Large-Scale ...Hybrid Simulated Annealing and Nelder-Mead Algorithm for Solving Large-Scale ...
Hybrid Simulated Annealing and Nelder-Mead Algorithm for Solving Large-Scale ...
 
CFP. IJORCS, Volume 4 - Issue2
CFP. IJORCS, Volume 4 - Issue2CFP. IJORCS, Volume 4 - Issue2
CFP. IJORCS, Volume 4 - Issue2
 
Call for Papers - IJORCS - Vol 4, Issue 1
Call for Papers - IJORCS - Vol 4, Issue 1Call for Papers - IJORCS - Vol 4, Issue 1
Call for Papers - IJORCS - Vol 4, Issue 1
 
Channel Aware Mac Protocol for Maximizing Throughput and Fairness
Channel Aware Mac Protocol for Maximizing Throughput and FairnessChannel Aware Mac Protocol for Maximizing Throughput and Fairness
Channel Aware Mac Protocol for Maximizing Throughput and Fairness
 
A Review and Analysis on Mobile Application Development Processes using Agile...
A Review and Analysis on Mobile Application Development Processes using Agile...A Review and Analysis on Mobile Application Development Processes using Agile...
A Review and Analysis on Mobile Application Development Processes using Agile...
 
Congestion Prediction and Adaptive Rate Adjustment Technique for Wireless Sen...
Congestion Prediction and Adaptive Rate Adjustment Technique for Wireless Sen...Congestion Prediction and Adaptive Rate Adjustment Technique for Wireless Sen...
Congestion Prediction and Adaptive Rate Adjustment Technique for Wireless Sen...
 
A Study of Routing Techniques in Intermittently Connected MANETs
A Study of Routing Techniques in Intermittently Connected MANETsA Study of Routing Techniques in Intermittently Connected MANETs
A Study of Routing Techniques in Intermittently Connected MANETs
 
Improving the Efficiency of Spectral Subtraction Method by Combining it with ...
Improving the Efficiency of Spectral Subtraction Method by Combining it with ...Improving the Efficiency of Spectral Subtraction Method by Combining it with ...
Improving the Efficiency of Spectral Subtraction Method by Combining it with ...
 
An Adaptive Load Sharing Algorithm for Heterogeneous Distributed System
An Adaptive Load Sharing Algorithm for Heterogeneous Distributed SystemAn Adaptive Load Sharing Algorithm for Heterogeneous Distributed System
An Adaptive Load Sharing Algorithm for Heterogeneous Distributed System
 
The Design of Cognitive Social Simulation Framework using Statistical Methodo...
The Design of Cognitive Social Simulation Framework using Statistical Methodo...The Design of Cognitive Social Simulation Framework using Statistical Methodo...
The Design of Cognitive Social Simulation Framework using Statistical Methodo...
 
An Enhanced Framework for Improving Spatio-Temporal Queries for Global Positi...
An Enhanced Framework for Improving Spatio-Temporal Queries for Global Positi...An Enhanced Framework for Improving Spatio-Temporal Queries for Global Positi...
An Enhanced Framework for Improving Spatio-Temporal Queries for Global Positi...
 
A PSO-Based Subtractive Data Clustering Algorithm
A PSO-Based Subtractive Data Clustering AlgorithmA PSO-Based Subtractive Data Clustering Algorithm
A PSO-Based Subtractive Data Clustering Algorithm
 
Call for papers, IJORCS, Volume 3 - Issue 3
Call for papers, IJORCS, Volume 3 - Issue 3Call for papers, IJORCS, Volume 3 - Issue 3
Call for papers, IJORCS, Volume 3 - Issue 3
 
Dynamic Map and Diffserv Based AR Selection for Handoff in HMIPv6 Networks
Dynamic Map and Diffserv Based AR Selection for Handoff in HMIPv6 Networks Dynamic Map and Diffserv Based AR Selection for Handoff in HMIPv6 Networks
Dynamic Map and Diffserv Based AR Selection for Handoff in HMIPv6 Networks
 
From Physical to Virtual Wireless Sensor Networks using Cloud Computing
From Physical to Virtual Wireless Sensor Networks using Cloud Computing From Physical to Virtual Wireless Sensor Networks using Cloud Computing
From Physical to Virtual Wireless Sensor Networks using Cloud Computing
 
Prediction of Atmospheric Pressure at Ground Level using Artificial Neural Ne...
Prediction of Atmospheric Pressure at Ground Level using Artificial Neural Ne...Prediction of Atmospheric Pressure at Ground Level using Artificial Neural Ne...
Prediction of Atmospheric Pressure at Ground Level using Artificial Neural Ne...
 
Ant Colony with Colored Pheromones Routing for Multi Objectives Quality of Se...
Ant Colony with Colored Pheromones Routing for Multi Objectives Quality of Se...Ant Colony with Colored Pheromones Routing for Multi Objectives Quality of Se...
Ant Colony with Colored Pheromones Routing for Multi Objectives Quality of Se...
 
Design a New Image Encryption using Fuzzy Integral Permutation with Coupled C...
Design a New Image Encryption using Fuzzy Integral Permutation with Coupled C...Design a New Image Encryption using Fuzzy Integral Permutation with Coupled C...
Design a New Image Encryption using Fuzzy Integral Permutation with Coupled C...
 
Can “Feature” be used to Model the Changing Access Control Policies?
Can “Feature” be used to Model the Changing Access Control Policies? Can “Feature” be used to Model the Changing Access Control Policies?
Can “Feature” be used to Model the Changing Access Control Policies?
 

Último

Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 

Último (20)

Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 

Voice Recognition System using Template Matching

voice using feature extraction.
A voice recognition system is one of the most effective biometric techniques and can be used for telephone banking and for forensic investigation by law enforcement agencies. [9][10]

A. What is Human Voice?

The voice is made up of sounds produced by human beings using the vocal folds for talking, singing, laughing, crying, screaming, etc. The human voice is specifically that part of human sound production in which the vocal folds are the primary sound source. The mechanism for generating the human voice can be subdivided into three parts: the lungs, the vocal folds within the larynx, and the articulators. [11]

Figure 1: The spectrogram of the human voice reveals its rich harmonic content.

B. What is Voice Recognition?

Voice recognition (sometimes referred to as speaker recognition) is the identification of the person who is speaking by extracting features of their voice when a questioned voice print is compared against a known voice print. In this technology, sounds, words, or phrases spoken by humans are converted into electrical signals, and these signals are transformed into coding patterns to which meaning has been assigned. There are two major applications of voice recognition technologies and methodologies. The first is voice verification, or authentication, in which the speaker claims a certain identity and the voice is used to verify this claim. The second is voice identification, the task of determining an unknown speaker's identity. Put another way, voice verification is one-to-one matching, where one speaker's voice is matched against one template or voice print, whereas voice identification is one-to-many matching, where the speaker's voice is compared against many voice templates. A speaker recognition system has two phases: enrollment and verification. During enrollment, the speaker's voice is recorded and a number of features are typically extracted to form a voice print or template.
In the verification phase, a speech sample or "utterance" is compared against a previously created voice print. Identification systems compare the utterance against multiple voice prints to determine the best match, while verification systems compare the utterance against a single voice print. Voice recognition systems can also be categorized as text dependent or text independent. [9]

Text-Dependent: The text must be the same for enrollment and verification. Shared-secret passwords, PINs, or other knowledge-based information can be employed to create a multi-factor authentication scenario.

Text-Independent: Text-independent systems are most often used for speaker identification, as they require very little cooperation by the speaker. In this case the text used during enrollment differs from the text used during verification. In fact, enrollment may happen without the user's knowledge, as is the case in many forensic applications. [9]

C. Voice Recognition Techniques

The most common approaches to voice recognition fall into two classes: template matching and feature analysis.

Template Matching: Template matching is the simplest technique and has the highest accuracy when used properly, but it also suffers from the most limitations. As with any approach to voice recognition, the first step is for the user to speak a word or phrase into a microphone. The electrical signal from the microphone is digitized by an analog-to-digital (A/D) converter and stored in memory. To determine the "meaning" of this voice input, the computer attempts to match the input with a digitized voice sample, or template, that has a known meaning. This technique is a close analogy to traditional command input from a keyboard: the program contains the input template and attempts to match it with the actual input using a simple conditional statement. This type of system is known as "speaker dependent", and its recognition accuracy can reach about 98 percent.
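The one-to-one (verification) versus one-to-many (identification) distinction described above can be made concrete with a short sketch. This is an illustrative Python fragment, not anything from the paper; the `distance` function, the template store, and the threshold are all hypothetical placeholders:

```python
def verify(questioned, claimed_template, distance, threshold):
    """Verification: a one-to-one match of the questioned voice
    against the template of the claimed identity."""
    return distance(questioned, claimed_template) <= threshold

def identify(questioned, templates, distance):
    """Identification: a one-to-many match; return the label of the
    closest template in the store (a dict of label -> template)."""
    return min(templates, key=lambda label: distance(questioned, templates[label]))
```

In a real system `distance` would compare frequency spectra as described later in the paper; any numeric distance works for illustration, e.g. `identify(3.2, {"alice": 1.0, "bob": 5.0}, lambda a, b: abs(a - b))` returns `"bob"`.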
Feature Analysis: A more general form of voice recognition is available through feature analysis, a technique that usually leads to "speaker-independent" voice recognition. Instead of trying to find an exact or near-exact match between the actual voice input and a previously stored voice template, this method first processes the voice input using Fourier transforms or linear predictive coding (LPC), then attempts to find characteristic similarities between the expected inputs and the actual digitized voice input. These similarities are present for a wide range of speakers, so the system need not be trained by each new user. The types of speech differences that the speaker-independent method can handle, but which pattern matching would fail on, include accents and variations in speed of delivery, pitch, volume, and inflection. Speaker-independent speech recognition has proven to be very difficult, with some of the greatest hurdles being the variety of accents and inflections used by speakers of different nationalities. Recognition accuracy for speaker-independent systems is somewhat lower than for speaker-dependent systems, usually between 90 and 95 percent. [12]

I have implemented the template matching technique. This approach has been studied intensively and is also the backbone of most voice recognition products on the market.

II. IMPLEMENTATION

A. Design Description

The voice recognition system using the template matching technique first requires the user to create a template for matching: 10 samples of the speaker's voice saying a phrase are recorded to form the known voice. Thereafter, the questioned speaker's voice is recorded and analyzed using the discrete Fourier transform.

Discrete Fourier Transform: Voice recognition in the time domain would be extremely impractical, for the reasons explained above.
Instead, an analysis of the frequency spectrum of a voice, which remains predominantly unchanged as speech is slightly varied, turns out to be a more viable option. Converting all the recordings into the frequency domain using the discrete Fourier transform greatly simplifies the process of comparing two recordings. [3][6]

Finding the Norm: Due to the nature of human speech, all the data pertaining to frequencies above 600 Hz can be safely discarded. Therefore, once a recording is converted into the frequency domain, it can simply be regarded as a vector in 600-dimensional Euclidean space. At this point, a comparison between two vectors can easily be carried out by normalizing the vectors (giving them length 1) and then computing the norm of their difference (the difference between two vectors in R600 is, of course, computed component-wise). Unfortunately, exactly which norm to use is not immediately clear. After carefully comparing and contrasting the Taxicab, Euclidean, and Maximum norms [13], it became clear that the Euclidean norm most accurately measures the closeness of different frequency spectra. Once the norm function was chosen, all that remained was to decide exactly how small the norm of the difference of two vectors had to be in order to conclude that both recordings originated from the same person.
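As a rough sketch of the comparison just described (a Python/NumPy illustration under stated assumptions, not the paper's MATLAB code; the function names and the sampling-rate parameter `fs` are mine), a recording is converted to its magnitude spectrum with the discrete Fourier transform, bins above 600 Hz are discarded, the result is normalized to unit length, and closeness is measured with the Euclidean norm of the difference:

```python
import numpy as np

def spectrum(signal, fs, fmax=600.0):
    """Normalized magnitude spectrum of a recording,
    keeping only the frequency bins at or below fmax Hz."""
    mags = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    vec = mags[freqs <= fmax]
    return vec / np.linalg.norm(vec)  # unit length, so recordings are comparable

def distance(sig_a, sig_b, fs):
    """Euclidean norm of the difference of two normalized spectra."""
    a, b = spectrum(sig_a, fs), spectrum(sig_b, fs)
    n = min(len(a), len(b))  # guard against recordings of unequal length
    return np.linalg.norm(a[:n] - b[:n])
```

Two utterances at the same pitch yield a small distance, while a different pitch pushes the distance toward sqrt(2), the gap between orthogonal unit vectors.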
  • 3. Voice Recognition System using Template Matching 15 www.ijorcs.org Chebyshev's Inequality: Chebyshev’s inequality says that at least 1-1/K2 of data from a sample must fall within K standard deviations from the mean, where K is any positive real number greater than one. To illustrate the inequality, we will look at it for a few values of K: − For K = 2 we have 1 – 1/K2 = 1 - 1/4 = 3/4 = 75%. So Chebyshev’s inequality says that at least 75% of the data values of any distribution must be within two standard deviations of the mean. − For K = 3 we have 1 – 1/K2 = 1 - 1/9 = 8/9 = 89%. So Chebyshev’s inequality says that at least 89% of the data values of any distribution must be within three standard deviations of the mean. − For K = 4 we have 1 – 1/K2 = 1 - 1/16 = 15/16 = 93.75%. So Chebyshev’s inequality says that at least 93.75% of the data values of any distribution must be within four standard deviations of the mean.[13] Template Matching: The above analysis has revealed that Chebyshe v's Inequality states that in particular, at least 3/4 of all measurements from the same population fall within two standard deviations of the mean. Hence, in response to the problem posed at the end of the previous paragraph, the following solution can be formulated: By requiring that the norm of the difference fall within 2 standard deviations of the normal average voice, I have ensured that at least 75% of the time, the algorithm would recognize a voice correctly. Figure 2: Detail Design Description III. RESULTS The performance rating of the voice recognition technique adopted would recognize the speaker’s voice 75% of the time of enrollment. Figure3:Graph showing normalized frequency spectra of recorded questioned voice sample Figure4:Graph showing normalized frequency spectra ofaverage templatevoice sample. A. 
A. Performance Evaluation Index

The indices widely accepted for determining the recognition rate of a voice recognition system are endpoint detection algorithms using zero crossing rates (ZCR) and variable frame rates (VFR). The technique involves a clean enrollment of the speech signal. The signal is recorded for 2 seconds, and the testing speech is polluted with additive noise at different decibel levels. The performance of the endpoint algorithms is plotted in the figures below. Three varieties of additive noise (factory noise, babble noise, and F-16 noise) have been used for testing. Tables 1-3 show the accuracy rates. The additive noise has been applied at levels of 20 dB, 15 dB, 10 dB, 5 dB and 0 dB SNR. [15]

The processing pipeline shown in Figure 2 consists of:
STEP 1: Voice Sample Recording
STEP 2: Voice Feature Extraction
STEP 3: Discrete Fourier Transform
STEP 4: Euclidean Norm
STEP 5: Template Matching
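For concreteness, a minimal sketch of ZCR-based endpoint detection is given below (Python/NumPy; the frame length and thresholds are illustrative assumptions, not values taken from [15]):

```python
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of successive sample pairs in the frame whose signs differ."""
    signs = np.sign(frame)
    return np.mean(signs[:-1] != signs[1:])

def detect_endpoints(signal, frame_len=160, zcr_thresh=0.1, energy_thresh=0.01):
    """Flag each frame as speech when its short-time energy exceeds a
    threshold, or when energy is moderate but the ZCR is high (which
    catches low-energy unvoiced consonants). Returns one boolean per frame."""
    flags = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        energy = np.mean(frame ** 2)
        flags.append(energy > energy_thresh or
                     (energy > energy_thresh / 10 and
                      zero_crossing_rate(frame) > zcr_thresh))
    return np.array(flags)
```

The speech endpoints are then taken as the first and last frames flagged True; additive noise raises the energy floor of "silent" frames, which is why endpoint accuracy degrades as SNR drops in the tables below.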
Figure 5: Factory Noise
Figure 6: Babble Noise
Figure 7: F-16 Noise

Table 1: Endpoint Detection (Factory Noise)
        Clean   20dB   15dB   10dB   5dB    0dB
VFR     98.0    98.6   96.0   84.3   62.0   26.6

Table 2: Endpoint Detection (Babble Noise)
        Clean   20dB   15dB   10dB   5dB    0dB
VFR     98.7    98.6   97.0   83.3   65.0   30.0

Table 3: Endpoint Detection (F-16 Noise)
        Clean   20dB   15dB   10dB   5dB    0dB
VFR     98.7    98.6   97.0   82.0   69.6   14.0

The experimental results above were derived from speech data collected from speakers using the different voice recognition algorithms. Clean speech was achieved when the effects of background noise and channel distortion were minimized. The comparative analysis of the different algorithms at various noise levels revealed that inaccurate endpoint detection causes misclassification more than any other source of error. The accuracy of endpoint detection is much higher for algorithms that integrate both the time domain and the frequency domain. This strongly supports the claim that a voice recognition system using template matching remains among the best approaches for recognizing an unknown voice.

IV. CONCLUSION

The research work implemented here is an effort to understand how voice recognition serves as one of the best forms of biometrics for recognizing the identity of a human being. It briefly describes all the stages, from voice recording, voice feature extraction, and the discrete Fourier transform through to template matching, which together generate a good matching score. Various standard techniques are used at the intermediate stages of processing. A low verification rate arises from the difficulty of developing an algorithm to recognize the human voice, since different data are obtained for voice samples recorded on different occasions. New techniques and highly effective algorithms have since been developed that give better results.
Also, a major challenge is the technique's inability to recognize a word or phrase other than the one stored in the database during enrollment. The technique adopted recognizes the human voice only 70% of the time. It is highly recommended that future research focus on achieving up to a 95% recognition rate and on recognizing different words and phrases.

V. REFERENCES
[1] Kinnunen, Tomi; Li, Haizhou. "An overview of text-independent speaker recognition: From features to supervectors". Speech Communication 52 (1): 12-40. doi: 10.1016/j.specom.2009.08.009
[2] Homayoon Beigi, "Speaker Recognition", Biometrics / Book 1, Jucheng Yang (ed.), Intech Open Access Publisher, 2011, pp. 3-28, ISBN 978-953-307-618-8. doi: 10.1007/978-0-387-77592-0
[3] Duhamel, P. and M. Vetterli, "Fast Fourier Transforms: A Tutorial Review and a State of the Art," Signal Processing, Vol. 19, April 1990, pp. 259-299. doi: 10.1016/0165-1684(90)90158-U
[4] Oppenheim, A. V. and R. W. Schafer, Discrete-Time Signal Processing, Prentice-Hall, 1989, p. 611.
[5] Oppenheim, A. V. and R. W. Schafer, Discrete-Time Signal Processing, Prentice-Hall, 1989, p. 619.
[6] Rader, C. M., "Discrete Fourier Transforms when the Number of Data Samples Is Prime," Proceedings of the IEEE, Vol. 56, June 1968, pp. 1107-1108. doi: 10.1109/PROC.1968.6477
[7] Oppenheim, A. V. and R. W. Schafer, Discrete-Time Signal Processing, Englewood Cliffs, NJ: Prentice-Hall, 1989, pp. 311-312.
[8] ITU-T Recommendation G.711, "Pulse Code Modulation (PCM) of Voice Frequencies," General Aspects of Digital Transmission Systems; Terminal Equipments, International Telecommunication Union (ITU), 1993.
[9] Beigi, Homayoon (2011). "Fundamentals of Speaker Recognition". [Online]. Available: http://www.wikipedia.org/wiki/speaker_recognition
[10] Course project (Fall 2009), "Voice Recognition Using MATLAB", California State University Northridge. [Online]. Available: http://www.cnx.org/content/m33347/1.3/module_export?format=zip
[11] "Article on Human Voice" [Online]. Available: http://www.wikipedia.org/wiki/Human_voice
[12] "Techniques of Voice Recognition System" [Online]. Available: http://www.hitl.washington.edu/scllw/EVE/I.D.2.d.VoiceRecognition.htm
[13] "Probability Tutorials on Chebyshev's Inequality" [Online]. Available: http://www.statistics.about.com/od/probHelpandTutorials/a/Chebyshevs-Inequality.htm
[14] Sangram Bana, Dr. Davinder Kaur, "Fingerprint Recognition System using Image Segmentation", International Journal of Advanced Engineering Sciences and Technologies, Vol. 5, No. 1, pp. 012-023.
[15] Kapil Sharma, H. P. Sinha & R. K. Aggarwal, "Comparative study of speech recognition systems using various feature extraction techniques", International Journal of Information Technology and Knowledge Management, Vol. 3, No. 2, pp.
695-698.

How to cite: Luqman Gbadamosi, "Voice Recognition System using Template Matching". International Journal of Research in Computer Science, 3(5): pp. 13-17, September 2013. doi: 10.7815/ijorcs.35.2013.070