Speaker recognition system by abhishek mahajan

SHREEJEE INSTITUTE OF
TECHNOLOGY AND MANAGEMENT
Speaker Recognition
• Guided by: Mr. Prakash Singh Panwar
• By: Rajpal Singh Chouhan
• EC BRANCH 1ST YEAR

What is Speaker Recognition?
Speaker Recognition is the process of automatically
recognizing who is speaking on the basis of individual
information included in speech signals.
Speaker Recognition = Speaker Identification + Speaker Verification

Speaker Identification
• System task: answer the question "Whose voice is this?"

Speaker Verification
• Synonyms: authentication, detection.
• User claims an identity.
• System task: accept or reject the identity claim.
• Example: "Is this Ahmad's voice?"

Model of Speaker Recognizer
[Figure 1: Simple model of a speaker recognizer. Labels: "Hello, Mr. John"; "You are permitted to access".]

The Structure of Speaker Recognizer
• Figure 2: Functional scheme of an ASR system.
[Figure 2 block labels: Feature Extraction, Feature Vector, Training Mode, Speaker Modeling, Recognition, Classification, Decision Logic, Speaker #ID (Speaker_1).]

Speech Signal Analysis
Feature Extraction
• The aim is to extract the voice features that distinguish the different phonemes of a language.

MFCC extraction
• Processing chain:
Speech signal x(n) → Pre-emphasis → x'(n) → Window → x_t(n) → DFT → X_t(k) → Mel filter banks → Y_t(m) → Log(|·|²) → IDFT → y_t^(m)(k) → MFCC
MFCC stands for Mel-frequency cepstral coefficients, a representation of the
short-term power spectrum of a sound used in audio processing.
The MFCCs are the amplitudes of the resulting spectrum (the output of the final IDFT step).
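The processing chain above can be sketched in NumPy as follows. This is a minimal illustration, not the presenters' implementation: the frame length, hop size, FFT size, number of filters, number of coefficients and the pre-emphasis factor are all assumed values, and the final IDFT step is replaced by a type-II DCT, which is a common substitute because the log mel energies are real-valued.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            fbank[m - 1, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, sample_rate=8000, frame_len=200, hop=80,
         n_fft=256, n_filters=20, n_ceps=12, alpha=0.97):
    """Return an array of shape (n_frames, n_ceps). All defaults are assumptions."""
    signal = np.asarray(signal, dtype=float)
    # Pre-emphasis: x'(n) = x(n) - alpha * x(n - 1)
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    # Framing and Hamming window: x_t(n)
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # DFT and power spectrum: |X_t(k)|^2
    power = np.abs(np.fft.rfft(frames, n_fft, axis=1)) ** 2
    # Mel filter bank energies Y_t(m), then logarithm (small offset avoids log(0))
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(energies + 1e-10)
    # Type-II DCT of the log energies; coefficient 0 is skipped
    m = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(1, n_ceps + 1)[:, None] * (m + 0.5) / n_filters)
    return log_energies @ basis.T
```

Each row of the returned array is one feature vector for one short frame of speech. With the assumed defaults, one second of 8 kHz audio gives 98 frames of 12 coefficients each.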

[Figure, four panels: speech waveform of the phoneme "ae"; after pre-emphasis and Hamming windowing; power spectrum; MFCC.]

Speech Signal to Feature Vector
[Figure: a speech signal converted to a sequence of numeric feature values.]

Vector Quantization (VQ)
• Aim of VQ: represent large amounts of data by a few prototype vectors.
• Example: identify similar data and group it into clusters.
• Each feature vector is assigned to the closest prototype w, using a similarity or distance measure (e.g. Euclidean distance).
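As a rough illustration of the idea, the sketch below builds a small set of prototype vectors (a codebook) with plain k-means and assigns each feature vector to its nearest prototype by Euclidean distance. The slides do not say which clustering algorithm was used, so k-means, the function names, the number of prototypes and the evenly spaced initialization are all assumptions made for this example.

```python
import numpy as np

def nearest_prototype(vectors, codebook):
    """Return (index of closest prototype, Euclidean distance to it) per vector."""
    dists = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1), dists.min(axis=1)

def train_codebook(vectors, n_prototypes=4, n_iter=20):
    """Plain k-means: alternate nearest-prototype assignment and mean update."""
    vectors = np.asarray(vectors, dtype=float)
    # Assumed initialization: evenly spaced training vectors as starting prototypes
    start = np.linspace(0, len(vectors) - 1, n_prototypes).astype(int)
    codebook = vectors[start].copy()
    for _ in range(n_iter):
        labels, _ = nearest_prototype(vectors, codebook)
        for j in range(n_prototypes):
            members = vectors[labels == j]
            if len(members) > 0:           # keep the old prototype if a cluster is empty
                codebook[j] = members.mean(axis=0)
    return codebook

# Toy data: two well-separated groups of 2-D "feature vectors"
data = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                 [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
codebook = train_codebook(data, n_prototypes=2)
labels, distances = nearest_prototype(data, codebook)
```

Here eight vectors are summarized by two prototypes, one per cluster, which is the data reduction that VQ aims for.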

Database Creation Process
[Figure: database creation. Speech from Speaker #1, Speaker #2 and Speaker #3 is stored in the database. Labels: "Hello, Speaker #1"; "Hello, Speaker #2".]

Speaker Identification
[Figure: an unknown speaker (Speaker #?) is compared with database entries #1, #2 and #3 and is identified as Speaker #1.]
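One common way to make this decision with VQ codebooks is to score the unknown utterance against every speaker in the database and pick the speaker with the lowest average distance to the nearest prototype. The slides do not spell out the decision rule, so the sketch below is an assumption; the function names and the toy codebooks are hypothetical.

```python
import numpy as np

def average_distortion(features, codebook):
    """Mean Euclidean distance from each feature vector to its nearest prototype."""
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return dists.min(axis=1).mean()

def identify_speaker(features, database):
    """Return the best-matching speaker name and the score of every speaker."""
    scores = {name: average_distortion(features, cb) for name, cb in database.items()}
    best = min(scores, key=scores.get)
    return best, scores

# Hypothetical database: one small codebook per enrolled speaker
database = {
    "Speaker_1": np.array([[0.0, 0.0], [1.0, 1.0]]),
    "Speaker_2": np.array([[10.0, 10.0], [11.0, 11.0]]),
    "Speaker_3": np.array([[-10.0, 5.0], [-9.0, 6.0]]),
}
unknown = np.array([[0.1, 0.0], [0.9, 1.1]])
speaker, scores = identify_speaker(unknown, database)
```

In this toy case the unknown vectors lie close to the prototypes of Speaker_1, so that speaker is returned.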

Speaker Verification
[Figure: a claim to be Speaker #1 is checked against database entries #1, #2 and #3; result: Accept.]
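The accept or reject decision can be sketched as a threshold test on how well the utterance matches the claimed speaker's model. The slides do not give the rule or any threshold, so the distance measure, the function names and the threshold value below are purely illustrative assumptions.

```python
import numpy as np

def average_distortion(features, codebook):
    """Mean Euclidean distance from each feature vector to its nearest prototype."""
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return dists.min(axis=1).mean()

def verify_speaker(features, claimed_codebook, threshold):
    """Accept the identity claim only if the utterance is close enough to the model."""
    return bool(average_distortion(features, claimed_codebook) <= threshold)

# Hypothetical codebooks and a hypothetical threshold
speaker_1 = np.array([[0.0, 0.0], [1.0, 1.0]])
speaker_2 = np.array([[10.0, 10.0], [11.0, 11.0]])
utterance = np.array([[0.1, 0.0], [0.9, 1.1]])

accepted = verify_speaker(utterance, speaker_1, threshold=0.5)   # claim: Speaker #1
rejected = verify_speaker(utterance, speaker_2, threshold=0.5)   # claim: Speaker #2
```

Unlike identification, only the claimed speaker's model is consulted. In practice the threshold would be tuned to balance false acceptances against false rejections.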

Database Creation Condition
Table 1: Database description.

Parameter             Characteristics
Language              Bangla
No. of speakers       5
Speech type           Sentence reading
Recording condition   Normal room conditions
Audio length          60-90 seconds
Audio type            Stereo
Sample format         16-bit PCM
Sampling frequency    8 kHz
Bit rate              1411 kbps

Speaker Recognition Result
Table 3: Test result for speaker recognition system.

Speaker     No. of inputs   Correct   Incorrect   Accuracy
Speaker_1   5               5         0           100%
Speaker_2   9               8         1           88.89%
Speaker_3   6               6         0           100%
Speaker_3   12              11        1           91.67%
Speaker_4   8               8         0           100%
Speaker_5   10              10        0           100%
Total       50              48        2           96%
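The accuracy column is the number of correct decisions divided by the number of inputs, as a percentage. The short check below recomputes it from the counts in the table, with the rows copied as given; the variable and function names are chosen only for this example.

```python
# (speaker label, number of inputs, number correct), copied from Table 3
results = [
    ("Speaker_1", 5, 5),
    ("Speaker_2", 9, 8),
    ("Speaker_3", 6, 6),
    ("Speaker_3", 12, 11),
    ("Speaker_4", 8, 8),
    ("Speaker_5", 10, 10),
]

def accuracy(correct, total):
    """Percentage of inputs that were recognized correctly."""
    return 100.0 * correct / total

total_inputs = sum(n for _, n, _ in results)
total_correct = sum(c for _, _, c in results)
overall = accuracy(total_correct, total_inputs)

for name, n, c in results:
    print(f"{name}: {c}/{n} correct, {accuracy(c, n):.2f}%")
print(f"Total: {total_correct}/{total_inputs} correct, {overall:.2f}%")
```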

Applications
• Transaction authentication
– Toll fraud prevention
– Telephone credit card purchases
– Telephone brokerage (e.g., stock trading)
• Access control
– Physical facilities
– Computers and data networks
• Information retrieval
– Customer information for call centers
– Audio indexing (speech skimming device)
• Forensics
– Voice sample matching
