SlideShare una empresa de Scribd logo
1 de 6
Descargar para leer sin conexión
IOSR Journal of VLSI and Signal Processing (IOSR-JVSP)
Volume 5, Issue 6, Ver. I (Nov -Dec. 2015), PP 40-45
e-ISSN: 2319 – 4200, p-ISSN No. : 2319 – 4197
www.iosrjournals.org
DOI: 10.9790/4200-05614045 www.iosrjournals.org 40 | Page
Implementation of Marathi Language Speech Databases for
Large Dictionary
Sangramsing Kayte1
, Monica Mundada1
, 2
Dr. Charansing Kayte
1
Department of Computer Science and Information Technology
Dr. Babasaheb Ambedkar Marathwada University, Aurangabad
2
Department of Digital and Cyber Forensic, Aurangabad Maharashtra, India
Abstract: In this research paper, we discuss our efforts in the development of Marathi language speech
databases in Marathi for building large vocabulary. We have collected speech data from about 5 speakers in
these one languages. We discuss the design and methodology of collection of speech databases. We also present
preliminary speech recognition results using the acoustic models created on these databases using Sphinx 2 and
Festvox speech tool kit.
I. Introduction
Most of the Data in digital world is accessible to a few who can read or understand a particular
language. Language technologies can provide solutions in the form of natural interfaces so that digital content
can reach to the masses and facilitate the exchange of information across different people speaking different
languages. These technologies play a crucial role in multi-lingual societies such as India which has about 1652
dialects/native languages. While Hindi written in Devanagari script, is the official language, the other 17
languages recognized by the constitution of India are: 1) Assamese 2) Tamil 3) Malayalam 4) Gujarati 5) Telugu
6) Oriya 7) Urdu 8) Bengali 9) Sanskrit 10) Kashmiri 11) Sindhi 12) Punjabi 13) Konkani 14) Marathi 15)
Manipuri 16) Kannada and 17) Nepali. Seamless integration of speech recognition, machine translation and
speech synthesis systems could facilitate the exchange of information between two people speaking two
different languages. Our overall goal is to develop speech recognition and speech synthesis systems for most of
these languages [1].
In the context of developing large vocabulary continuous speech recognition systems, there has been
effort by IBM to build a large vocabulary continuous speech recognition system in Hindi [2] by bootstrapping
existing English acoustic models. A more recent development was a Hindi recognition system from HP Labs,
India [3] which involved Hindi speech corpus collection and subsequent system building. In this paper, we
discuss the development of speech databases in Telugu, Tamil and Marathi languages for building large
vocabulary speech recognition systems. We also discuss some preliminary speech recognition results obtained
on these databases. We hope that this work will serve as a benchmark and impetus for the development of large
speech corpora and large vocabulary continuous speech recognition systems in other Indian languages
II. Text Corpora for Speech Databases
The first step we followed in creating a speech database for building an Automatic Speech Recognizer
(ASR) is the generation of an optimal set of textual sentences to be recorded from the native speakers of the
language [3]. The selected sentences should be minimal in number to save on manual recording effort and at the
same time have enough occurrences of each type of sounds to capture all types of coarticulation effects in the
chosen language. In this section, the various stages involved in the generation of the optimal text are described.
2.1.Text Corpus Collection
To select a set of phonetically rich sentences, one of the important decisions to be made is the choice of
a huge text corpus source from which the optimal sub-set has to be extracted. The reliability and coverage of the
optimal text and of the language model largely depends on the quality of the text corpus chosen. The corpus
should be unbiased and large enough to convey the entire syntactic behavior of the language. We have collected
the text corpora for the Marathi languages we are working on, from commonly available error free linguistic text
corpora of the languages wherever available in sufficient size. We have mostly used the CIIL Corpus [4] for all
the languages. In case of Marathi where the available linguistic corpus contained a lot of errors, we have
manually corrected some part of the corpora and also collected representative corpus of the language by
crawling content from websites of sakel, Lokamet newspapers of the respective languages. The table below
shows the size of the corpora and some other statistics [5].
Implement of Marathi Language Speech Databases for Large
DOI: 10.9790/4200-05614045 www.iosrjournals.org 41 | Page
Table 1: Analysis of the text Corpora
Language No. of
Sentences
No. of Unique
Words
No. of
Words
Marathi 102521 133273 1037647
2.1.1. Font Converters
To ease the analysis and work with the text corpus, it is required that the text is in a representation that
can be easily processed. Normally the text of a language from any web sources will be available in a specific
font. So, font converters have to be made to convert the corpus into the desired electronic character code. The
font will be reflective of the language and will have as many unique codes for each of the characters in the
language. It is necessary to ensure that there are no mapping and conversion errors.
2.2.Grapheme to phoneme Converters
The Text corpus has to be phonetized so that the distribution of the basic recognition units, the phones,
diphones, syllables, etc in the text corpus can be analyzed. Phonetizers or the converters are the tools that
convert the text corpus into its phonetic equivalent. Phonetizers are also required for generating the phonetic
lexicon, an important component of ASR systems [3]. The lexicon is a representation of each entry of the
vocabulary word set of the system in its phonetic form. The lexicon dictates to the decoder, the phonetic
composition, or the pronunciation of an entry. The process of corpora phonetization or the development of
phonetic lexicons for the western languages is traditionally done by linguists. These lexicons are subject to
constant refinement and modification. But the phonetic nature of Indian scripts reduces the effort to building
mere mapping tables and rules for the lexical representation. These rules and the mapping tables together
comprise the phonetizers or the Grapheme to Phoneme converters. The Telugu and the Marathi were derived
from the HP Labs Hindi [6]. Special ligatures have been used to denote nasalization, homoorganic nasals and
dependent vowels. Two rules were added to address the issues specific to Marathi. A rule was added to nasalize
the consonant that follows an anuswara. Another rule was added to substitute the Hindi chandrabindu with the
consonant n. The Tamil G2P is a one to one mapping table. Certain exceptions arise in the case of the vowels oa
and au vowels where the vowel comes before the consonant in the orthography but the pronunciation has the
vowel sound following the consonant. Such combinations have been identified and have been split accordingly
so as to ensure the one to one correspondence between the pronunciation and the phonetic lexicon [7] [8].
2.3.Optimal Text Selection
As mentioned in the earlier sections, the optimal set of sentences is extracted from the phonetized
corpus using an Optimal Text Selectionsystem which selects the sentences which are phonetically rich. These
sentences are distributed among the intended number of speakers such that good speaker variability is achieved
in the database. Also it is attempted to distribute the optimal text among speakers such that each speaker gets to
speak as many phonetic contexts as possible, The best-known OTS algorithm is the greedy algorithm as applied
to set covering problem [4]. In the greedy approach, a sentence is selected from the corpus if it best satisfies a
desired criterion. In general, the criterion for the optimal text selection would be the maximum number of
diphones or a highest score computed by a scoring function based on the type of the diphones that the sentence
is composed of. A threshold based approach has been implemented for OTS of the Marathi language ASRs. This
algorithm performs as well as the classical greedy algorithm in a significantly lesser time. A threshold is
maintained on the number of new diphones to be covered in a new sentence. A set of sentences is thus selected
in each iteration. The optimal text, hence selected, has a full coverage of the diphones present in the corpus.
These sentences are distributed among the speakers [9]. The table 2 gives the di-phone coverage
statistics of the optimal texts of each language.
Table 2: Di-phone coverage in the Marathilanguages [9]
Language Phones Diphones covered
Marathi 74 1779
III. Speech Data Collection
In this research section, the various steps difficult in building the speech corpus are detailed. Firstly, the
recording media is chosen so as to capture the effects due to channel and microphone variations. For the
databases that were built for the Marathi language ASRs, the speech data was recorded over calculated number
of landline and cellular phones using a multi-channel computer telephony interface card [5] [9].
Implement of Marathi Language Speech Databases for Large
DOI: 10.9790/4200-05614045 www.iosrjournals.org 42 | Page
3.1.Speaker Selection
Speech data is collected from the native speakers of the language who were comfortable in speaking
and reading the language. The speakers were chosen such that all the diversities attributing to the gender, age
and dialect are sufficiently captured. The recording is clean and has minimal background disturbance. Any
mistakes made while recording have been undone by re-recording or by making the corresponding changes in
the transcription set. The various statistics of the data collected for the Indian Language ASRs are given in the
next subsection [10].
3.2.Data Statistics
Speakers from various parts of the respective states (regions) were carefully recorded in order to cover
all possible dialectic variations of the language. Each speaker has recorded 52 sentences of the optimal text.
Table 3 gives the number of speakers recorded in each language and in each of the recording modes landline
and cellphones. To capture different microphonic variations, four different cellphones were used while
recording the speakers. Table 4 gives the age wise distribution of the speakers in the Marathi languages. Table 5
shows the gender wise distribution of the speakers in each of the Marathi languages [9].
Table 3: Number of speakers in the Marathi languages
Language Landline Cellphone Total
Marathi 92 84 176
Table 4: Age wise speaker distribution
Language 20-
25
25-30 30-35 35-40 > 40
Marathi 77 56 26 12 5
Table 5: Gender distribution among speakers
3.3.Transcription Correction
Despite the care taken to record the speech with minimal background noise and mistakes in
pronunciation, some errors have crept in while recording. These errors had to be identified manually by listening
to the speech. If felt unsuitable, some of the utterances have been discarded.In the case of the data collected in
the Marathi languages, the transcriptions were manually edited and ranked based on the goodness of the speech
recorded. The utterances were classified as Good, With Channel distortion, With Background Noiseand
Uselesswhichever is true. The pronunciation mistakes were carefully identified and if possible the
corresponding changes were made in the transcriptions so that the utterance and the transcription correspond to
each other. The idea behind the classification was to make the utmost utilization of the data and to serve as a
corpus for further related research work.
IV. Building ASR systems
Typically, ASR systems comprise three major constituents- the acoustic models, the language models
and the phonetic lexicon. In the subsections that follow, each of these components is discussed.
4.1. Acoustic Model
The first key constituent of the ASR systems is the acoustic model. Acoustic models capture the
characteristics of the basic recognition units [3]. The recognition units can be at the word level, syllable level
and or at the phoneme level.
Many constraints and inadequacies come into picture with the selection of each of these units. For large
vocabulary continuous speech recognition systems (LVCSR), phoneme is the preferred unit. Neural networks
(NN) and Hidden Markov models are the most commonly used for acoustic modeling of ASR systems. We have
chosen the Semi-Continuous Hidden Markov models (SCHMMs) [1] to represent context dependent phones
(trip-hones). At the time of recognition, various words are hypothesized against the speech signal. To compute
the likelihood of a word, the lexicon is referred and the word is broken into its constituent phones. The phone
likelihood is computed from the HMMs. The combined likelihood of all the phones represents the likelihood of
the word in the acoustic model. The tri-phone acoustic models built for the Indian language ASRs are 5 state
SCHMMs with states clustered using decision trees. The models were built using CMU Sphinx II trainer. Two
Language Male Female
Marathi 91 85
Implement of Marathi Language Speech Databases for Large
DOI: 10.9790/4200-05614045 www.iosrjournals.org 43 | Page
different models were trained from Landline and Cellphone data. The training data was carefully selected to
have an even distribution across all the possible variations. For each of the systems, the speech data collected
was divided into trainingand the testsets based upon the demographic details. 70% of the speakers were
included in the training set and the rest 30% was used to test the respective systems.
4.2. Language Model
The language model attempts to convey the behavior of the language. By the means of probabilistic
information, the language model provides the syntax of the language. It aims to predict the occurrence of
specific word sequences possible in the language. From the perspective of the recognition engine, the language
model helps narrow down the search space for a valid combination of words [3]. Language models help guide
and constrain the search among alternative word hypotheses during decoding. Most ASR systems use the
stochastic language models (SLM). Speech recognizers seek the word sequence Wb which is most likely to be
produced from the acoustic evidence A.
P (Wb | A) = maxw P (W|A) = maxw P (A|W) P (W)/P (A)
P (A), the probability of acoustic evidence is normally omitted as it is common for any word sequence.
The recognized word sequence is the word combination W b which maximizes the above equation. LMs assign a
probability estimate P (W) to word sequences W={w 1,w2, wn}. These probabilities can be trained from a
corpus. Since there is only a finite coverage of word sequences, some smoothing has to be done to discount the
frequencies to those of the lower order. SLMs use the N-gram LM where it is assumed that the probability of
occurrence of a word is dependent only on the past N-1 words. The trigram model [N=3], based on the previous
two words is particularly powerful, as most words have a strong dependence on the previous two words and it
can be estimated reasonably well with an attainable corpus. The trigram based language model with back-off
was used for recognition. The LM was created using the CMU statistical LM toolkit [11]. For the baseline
systems developed, the LM was generated from the training and the testing transcripts. The unigrams of the
language model were taken as the lexical entries for the systems.
V. Baseline System Performance
This section presents the performance of the recognition systems. Two systems were built in each
language using the landline and the mobile speech data separately. The tests on these recognition systems were
done using the speech data of the untrained 30% of the speakersdata. Goodutterances of these speakers were
decoded using the Sphinx II decoder. Appropriate tuning was done on the decoder to get the best performance.
The evaluation of the experiment was made according to the recognition accuracy and computed using the word
error rate [WER] metric which aligns a recognized word string against the correct word string and computes the
number of substitutions (S), deletions (D) and Insertions (I) and the number of words in the correct sentence
(N).
W.E.R = 100 * (S+D+I) / N
In the following subsections, the gradual improvement in the performance of the ASRs is shown. The
subsection 5.1 discusses the improvements obtained on iterations of Forced alignment. The subsection 5.2 deals
with the use of an extended phoneme set.
5.1.Forced Alignments
This is a well-known technique to improve the HMM of the Acoustic Model. The technique sounds
quite simple but the performance of the ASR dramatically improves by nearly 10% at times. The method
involves insertion of silence at appropriate places in the training transcription and retraining the models. Using
existing acoustic models, the silence is detected by aligning the speech signal against the trained acoustic
models and the corresponding transcription. The new transcriptions have silence markers at appropriate places,
the intermediate tied states and hence the overall models are improved. Multiple Forced alignment iterations
could be done to further improve the model. The performance eventually converges after a few iterations. The
table 6 shows the improvement in the performance of the recognition systems built from the landline data of
each language. The vocabulary sizes of the various systems built is shown in the table 7.
Table 6: Improvements in the WER of the landline recognition systems through a couple of Forced Alignment
iterations.
Language Initial
Performance
1st
iteration 2nd
iteration
Marathi 27.3 24 23.2
Table 7 Vocabulary size of each ASR (Number of lexical entries)
ASR System Vocabulary
Marathi 21640
Implement of Marathi Language Speech Databases for Large
DOI: 10.9790/4200-05614045 www.iosrjournals.org 44 | Page
5.2.Significance of Extended Phone Set
As mentioned earlier, lexicon defines the pronunciation of each lexical entry. The performance of the
ASR system depends on the phone set chosen. New sub word units can be identified and be introduced into the
phone set. The choice, however, should be such that, the variability of the phones and the complexity of the
HMM is reduced. It is also to be ensured that there are sufficient instances of all the phones. The phonetic split
in the lexicon will change accordingly as does the phone set. The Initial experiments were done using the
standard ISCII phone sets of the Marathi languages [12]. To improve the phone set, we have added the
geminates or the stressed consonants as standard phones. Stressed consonants, though represented as repetition
of a consonant twice in the orthography, do not follow the same pattern in pronunciation [3]. The consonant is
instead stressed, attributing some amount of plosiveness to it. The spectrogram of a consonant uttered twice
would display two similar observations coarticulated due to one another. The spectrogram of a stressed
consonant, on the other hand, would vividly display a blob of energy due to the stress. This clearly indicates that
stressed consonants need to be treated separately and not as variants of the parent consonant for acoustic
modeling. The table 8 shows the improved performances of all systems employing an extended phoneset. It also
shows the size of the lexicon of the systems.
Table 8: Improvements in %WER by including the stressed consonants.
ASR System Initial Improved Vocabulary
Marathi Landline 23.2 20.7 21640
Marathi Mobile 25.0 23.6 18912
VI. Speech to-Speech Application
The speech-speech demo was an attempt to develop an interesting application using the above systems and
integrating it with existing systems. It is a tourist aid application with a 145 word vocabulary and 1000
legitimate sentence templates within the tourist domain. The application converts a spoken source language
query into the target language utterance. A template based language translation module has been simulated to
convert the recognized text query into the target language. The Telugu and the Tamil Text-to-Speech (TTS) [13]
[14] engines were integrated to synthesize and speak out the translated query in the target language.
Figure1: Schematic representation of the Application
As illustrated in the schematic representation, the application has three components, the recognition
module (ASR), the translation module and the synthesis module (TTS). The Speech query of the user is taken by
the ASR and is decoded into a text query of the source language. This goes as an input to the simulated language
translation system. This identifies the template of the source language text query and converts it into the
corresponding template of the target language. The TTS engine converts the output into spoken speech query of
the target language. With the limitation of the available ASR and TTS engines, we could render the translation
within the following source-target language pairs.
(i) Marathi to Hindi
(ii) Hindi to Marathi
The finite size of the legitimate utterances has been exploited, giving a significant bias to the language
model. Given minimal background disturbances, the performance of the application was close to 100%.
Implement of Marathi Language Speech Databases for Large
DOI: 10.9790/4200-05614045 www.iosrjournals.org 45 | Page
VII. Conclusion and Future work
In this research paper, we discussed the optimal design and development of speech databases for
Marathi languages. We hope the simple methodology of database creation presented will serve as catalyst for
the creation of speech databases in all other Indian languages. We also created speech recognizers and presented
preliminary results on these databases. We hope the ASRs created will serve as baseline systems for further
research on improving the accuracies in each of the languages. Our future work is focused in tuning these
models and test them using language models built using a larger corpus.
References
[1]. Sangramsing Kayte, Dr. Bharti Gawali “Marathi Speech Synthesis: A review” International Journal on Recent and Innovation
Trends in Computing and Communication ISSN: 2321-8169 Volume: 3 Issue: 6 3708 – 3711
[2]. Sangramsing Kayte, Monica Mundada, Dr. CharansingKayte "A Review of Unit Selection Speech Synthesis International Journal
of Advanced Research in Computer Science and Software Engineering -Volume 5, Issue 10, October-2015
[3]. Singh, S. P., et al Building Large Vocabulary Speech Recognition Systems for Indian Languages , International Conference on
Natural Language Processing, 1:245-254, 2004.
[4]. http://tdil.mit.gov.in/corpora/ach-corpora.htm#tech Marathi CIIL corpus.
[5]. Sangramsing Kayte, Monica Mundada "Study of Marathi Phones for Synthesis of Marathi Speech from Text" International Journal
of Emerging Research in Management &Technology ISSN: 2278-9359 (Volume-4, Issue-10) October 2015
[6]. Kalika Bali, ParthaPratimTalukdar, , Tools for the development of a Hindi Speech Synthesis System, 5th ISCA Speech Synthesis
Workshop, Pittsburgh, pp.109- 114,2004
[7]. Monica Mundada, Bharti Gawali, Sangramsing Kayte "Recognition and classification of speech and its related fluency disorders"
International Journal of Computer Science and Information Technologies (IJCSIT)
[8]. Monica Mundada, Sangramsing Kayte, Dr. Bharti Gawali "Classification of Fluent and Dysfluent Speech Using KNN Classifier"
International Journal of Advanced Research in Computer Science and Software Engineering Volume 4, Issue 9, September 2014
[9]. Sangramsing Kayte, Monica Mundada, Dr. CharansingKayte "Di-phone-Based Concatenative Speech Synthesis System for Hindi"
International Journal of Advanced Research in Computer Science and Software Engineering -Volume 5, Issue 10, October-2015
[10]. Sangramsing Kayte, Monica Mundada, Santosh Gaikwad, Bharti Gawali "PERFORMANCE EVALUATION OF SPEECH
SYNTHESIS TECHNIQUES FOR ENGLISH LANGUAGE " International Congress on Information and Communication
Technology 9-10 October, 2015
[11]. Rosenfeld Roni, CMU statistical Language Modelling (SLM) Toolkit (Version 2).
[12]. IS13194:1991, Indian Script Code for Information Interchange ISCII-91, Bureau of Indian Standards, April,1999
[13]. Rohit Kumar et al. "Unit Selection Approach for building Indian Language TTS", Interspeech 2004,
[14]. Sangramsing N.kayte “Marathi Isolated-Word Automatic Speech Recognition System based on Vector Quantization (VQ)
approach” 101th Indian Science Congress Jammu University 03th Feb to 07 Feb 2014.

Más contenido relacionado

La actualidad más candente

ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGE
ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGEADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGE
ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGEijnlc
 
ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGE
ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGEADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGE
ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGEkevig
 
Implementation of English-Text to Marathi-Speech (ETMS) Synthesizer
Implementation of English-Text to Marathi-Speech (ETMS) SynthesizerImplementation of English-Text to Marathi-Speech (ETMS) Synthesizer
Implementation of English-Text to Marathi-Speech (ETMS) SynthesizerIOSR Journals
 
Punjabi to Hindi Transliteration System for Proper Nouns Using Hybrid Approach
Punjabi to Hindi Transliteration System for Proper Nouns Using Hybrid ApproachPunjabi to Hindi Transliteration System for Proper Nouns Using Hybrid Approach
Punjabi to Hindi Transliteration System for Proper Nouns Using Hybrid ApproachIJERA Editor
 
HINDI AND MARATHI TO ENGLISH MACHINE TRANSLITERATION USING SVM
HINDI AND MARATHI TO ENGLISH MACHINE TRANSLITERATION USING SVMHINDI AND MARATHI TO ENGLISH MACHINE TRANSLITERATION USING SVM
HINDI AND MARATHI TO ENGLISH MACHINE TRANSLITERATION USING SVMijnlc
 
Design of A Spell Corrector For Hausa Language
Design of A Spell Corrector For Hausa LanguageDesign of A Spell Corrector For Hausa Language
Design of A Spell Corrector For Hausa LanguageWaqas Tariq
 
A Marathi Hidden-Markov Model Based Speech Synthesis System
A Marathi Hidden-Markov Model Based Speech Synthesis SystemA Marathi Hidden-Markov Model Based Speech Synthesis System
A Marathi Hidden-Markov Model Based Speech Synthesis Systemiosrjce
 
Segmentation Words for Speech Synthesis in Persian Language Based On Silence
Segmentation Words for Speech Synthesis in Persian Language Based On SilenceSegmentation Words for Speech Synthesis in Persian Language Based On Silence
Segmentation Words for Speech Synthesis in Persian Language Based On Silencepaperpublications3
 
A Corpus-Based Concatenative Speech Synthesis System for Marathi
A Corpus-Based Concatenative Speech Synthesis System for MarathiA Corpus-Based Concatenative Speech Synthesis System for Marathi
A Corpus-Based Concatenative Speech Synthesis System for Marathiiosrjce
 
An optimized approach to voice translation on mobile phones
An optimized approach to voice translation on mobile phonesAn optimized approach to voice translation on mobile phones
An optimized approach to voice translation on mobile phoneseSAT Publishing House
 
An optimized approach to voice translation on mobile phones
An optimized approach to voice translation on mobile phonesAn optimized approach to voice translation on mobile phones
An optimized approach to voice translation on mobile phoneseSAT Journals
 
FORMANT ANALYSIS OF BANGLA VOWEL FOR AUTOMATIC SPEECH RECOGNITION
FORMANT ANALYSIS OF BANGLA VOWEL FOR AUTOMATIC SPEECH RECOGNITIONFORMANT ANALYSIS OF BANGLA VOWEL FOR AUTOMATIC SPEECH RECOGNITION
FORMANT ANALYSIS OF BANGLA VOWEL FOR AUTOMATIC SPEECH RECOGNITIONsipij
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentIJERD Editor
 
BOOTSTRAPPING METHOD FOR DEVELOPING PART-OF-SPEECH TAGGED CORPUS IN LOW RESOU...
BOOTSTRAPPING METHOD FOR DEVELOPING PART-OF-SPEECH TAGGED CORPUS IN LOW RESOU...BOOTSTRAPPING METHOD FOR DEVELOPING PART-OF-SPEECH TAGGED CORPUS IN LOW RESOU...
BOOTSTRAPPING METHOD FOR DEVELOPING PART-OF-SPEECH TAGGED CORPUS IN LOW RESOU...ijnlc
 

La actualidad más candente (18)

ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGE
ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGEADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGE
ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGE
 
ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGE
ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGEADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGE
ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGE
 
Ey4301913917
Ey4301913917Ey4301913917
Ey4301913917
 
Implementation of English-Text to Marathi-Speech (ETMS) Synthesizer
Implementation of English-Text to Marathi-Speech (ETMS) SynthesizerImplementation of English-Text to Marathi-Speech (ETMS) Synthesizer
Implementation of English-Text to Marathi-Speech (ETMS) Synthesizer
 
Cf32516518
Cf32516518Cf32516518
Cf32516518
 
I1 geetha3 revathi
I1 geetha3 revathiI1 geetha3 revathi
I1 geetha3 revathi
 
Punjabi to Hindi Transliteration System for Proper Nouns Using Hybrid Approach
Punjabi to Hindi Transliteration System for Proper Nouns Using Hybrid ApproachPunjabi to Hindi Transliteration System for Proper Nouns Using Hybrid Approach
Punjabi to Hindi Transliteration System for Proper Nouns Using Hybrid Approach
 
HINDI AND MARATHI TO ENGLISH MACHINE TRANSLITERATION USING SVM
HINDI AND MARATHI TO ENGLISH MACHINE TRANSLITERATION USING SVMHINDI AND MARATHI TO ENGLISH MACHINE TRANSLITERATION USING SVM
HINDI AND MARATHI TO ENGLISH MACHINE TRANSLITERATION USING SVM
 
Design of A Spell Corrector For Hausa Language
Design of A Spell Corrector For Hausa LanguageDesign of A Spell Corrector For Hausa Language
Design of A Spell Corrector For Hausa Language
 
551 466-472
551 466-472551 466-472
551 466-472
 
A Marathi Hidden-Markov Model Based Speech Synthesis System
A Marathi Hidden-Markov Model Based Speech Synthesis SystemA Marathi Hidden-Markov Model Based Speech Synthesis System
A Marathi Hidden-Markov Model Based Speech Synthesis System
 
Segmentation Words for Speech Synthesis in Persian Language Based On Silence
Segmentation Words for Speech Synthesis in Persian Language Based On SilenceSegmentation Words for Speech Synthesis in Persian Language Based On Silence
Segmentation Words for Speech Synthesis in Persian Language Based On Silence
 
A Corpus-Based Concatenative Speech Synthesis System for Marathi
A Corpus-Based Concatenative Speech Synthesis System for MarathiA Corpus-Based Concatenative Speech Synthesis System for Marathi
A Corpus-Based Concatenative Speech Synthesis System for Marathi
 
An optimized approach to voice translation on mobile phones
An optimized approach to voice translation on mobile phonesAn optimized approach to voice translation on mobile phones
An optimized approach to voice translation on mobile phones
 
An optimized approach to voice translation on mobile phones
An optimized approach to voice translation on mobile phonesAn optimized approach to voice translation on mobile phones
An optimized approach to voice translation on mobile phones
 
FORMANT ANALYSIS OF BANGLA VOWEL FOR AUTOMATIC SPEECH RECOGNITION
FORMANT ANALYSIS OF BANGLA VOWEL FOR AUTOMATIC SPEECH RECOGNITIONFORMANT ANALYSIS OF BANGLA VOWEL FOR AUTOMATIC SPEECH RECOGNITION
FORMANT ANALYSIS OF BANGLA VOWEL FOR AUTOMATIC SPEECH RECOGNITION
 
International Journal of Engineering Research and Development
International Journal of Engineering Research and DevelopmentInternational Journal of Engineering Research and Development
International Journal of Engineering Research and Development
 
BOOTSTRAPPING METHOD FOR DEVELOPING PART-OF-SPEECH TAGGED CORPUS IN LOW RESOU...
BOOTSTRAPPING METHOD FOR DEVELOPING PART-OF-SPEECH TAGGED CORPUS IN LOW RESOU...BOOTSTRAPPING METHOD FOR DEVELOPING PART-OF-SPEECH TAGGED CORPUS IN LOW RESOU...
BOOTSTRAPPING METHOD FOR DEVELOPING PART-OF-SPEECH TAGGED CORPUS IN LOW RESOU...
 

Similar a Implementation of Marathi Language Speech Databases for Large Dictionary

Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...iosrjce
 
DEVELOPMENT OF PHONEME DOMINATED DATABASE FOR LIMITED DOMAIN T-T-S IN HINDI
DEVELOPMENT OF PHONEME DOMINATED DATABASE FOR LIMITED DOMAIN T-T-S IN HINDIDEVELOPMENT OF PHONEME DOMINATED DATABASE FOR LIMITED DOMAIN T-T-S IN HINDI
DEVELOPMENT OF PHONEME DOMINATED DATABASE FOR LIMITED DOMAIN T-T-S IN HINDIijaia
 
Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...
Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...
Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...IJERA Editor
 
Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...
Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...
Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...iosrjce
 
Grapheme-To-Phoneme Tools for the Marathi Speech Synthesis
Grapheme-To-Phoneme Tools for the Marathi Speech SynthesisGrapheme-To-Phoneme Tools for the Marathi Speech Synthesis
Grapheme-To-Phoneme Tools for the Marathi Speech SynthesisIJERA Editor
 
A Context-based Numeral Reading Technique for Text to Speech Systems
A Context-based Numeral Reading Technique for Text to Speech Systems A Context-based Numeral Reading Technique for Text to Speech Systems
A Context-based Numeral Reading Technique for Text to Speech Systems IJECEIAES
 
AMHARIC TEXT TO SPEECH SYNTHESIS FOR SYSTEM DEVELOPMENT
AMHARIC TEXT TO SPEECH SYNTHESIS FOR SYSTEM DEVELOPMENTAMHARIC TEXT TO SPEECH SYNTHESIS FOR SYSTEM DEVELOPMENT
AMHARIC TEXT TO SPEECH SYNTHESIS FOR SYSTEM DEVELOPMENTNathan Mathis
 
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...ijma
 
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...ijma
 
Text To Speech Synthesis System For Marathi Language Using Concatenation Tech...
Text To Speech Synthesis System For Marathi Language Using Concatenation Tech...Text To Speech Synthesis System For Marathi Language Using Concatenation Tech...
Text To Speech Synthesis System For Marathi Language Using Concatenation Tech...University of Southern Denmark
 
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...kevig
 
Design and Development of a Malayalam to English Translator- A Transfer Based...
Design and Development of a Malayalam to English Translator- A Transfer Based...Design and Development of a Malayalam to English Translator- A Transfer Based...
Design and Development of a Malayalam to English Translator- A Transfer Based...Waqas Tariq
 
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...ijnlc
 
A decision tree based word sense disambiguation system in manipuri language
A decision tree based word sense disambiguation system in manipuri languageA decision tree based word sense disambiguation system in manipuri language
A decision tree based word sense disambiguation system in manipuri languageacijjournal
 
A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES
A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES
A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES ijnlc
 
INTEGRATION OF PHONOTACTIC FEATURES FOR LANGUAGE IDENTIFICATION ON CODE-SWITC...
INTEGRATION OF PHONOTACTIC FEATURES FOR LANGUAGE IDENTIFICATION ON CODE-SWITC...INTEGRATION OF PHONOTACTIC FEATURES FOR LANGUAGE IDENTIFICATION ON CODE-SWITC...
INTEGRATION OF PHONOTACTIC FEATURES FOR LANGUAGE IDENTIFICATION ON CODE-SWITC...kevig
 

Similar a Implementation of Marathi Language Speech Databases for Large Dictionary (20)

G1803013542
G1803013542G1803013542
G1803013542
 
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
 
DEVELOPMENT OF PHONEME DOMINATED DATABASE FOR LIMITED DOMAIN T-T-S IN HINDI
DEVELOPMENT OF PHONEME DOMINATED DATABASE FOR LIMITED DOMAIN T-T-S IN HINDIDEVELOPMENT OF PHONEME DOMINATED DATABASE FOR LIMITED DOMAIN T-T-S IN HINDI
DEVELOPMENT OF PHONEME DOMINATED DATABASE FOR LIMITED DOMAIN T-T-S IN HINDI
 
F017163443
F017163443F017163443
F017163443
 
Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...
Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...
Approach To Build A Marathi Text-To-Speech System Using Concatenative Synthes...
 
Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...
Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...
Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...
 
Ijetcas14 444
Ijetcas14 444Ijetcas14 444
Ijetcas14 444
 
Grapheme-To-Phoneme Tools for the Marathi Speech Synthesis
Grapheme-To-Phoneme Tools for the Marathi Speech SynthesisGrapheme-To-Phoneme Tools for the Marathi Speech Synthesis
Grapheme-To-Phoneme Tools for the Marathi Speech Synthesis
 
A Context-based Numeral Reading Technique for Text to Speech Systems
A Context-based Numeral Reading Technique for Text to Speech Systems A Context-based Numeral Reading Technique for Text to Speech Systems
A Context-based Numeral Reading Technique for Text to Speech Systems
 
AMHARIC TEXT TO SPEECH SYNTHESIS FOR SYSTEM DEVELOPMENT
AMHARIC TEXT TO SPEECH SYNTHESIS FOR SYSTEM DEVELOPMENTAMHARIC TEXT TO SPEECH SYNTHESIS FOR SYSTEM DEVELOPMENT
AMHARIC TEXT TO SPEECH SYNTHESIS FOR SYSTEM DEVELOPMENT
 
ReseachPaper
ReseachPaperReseachPaper
ReseachPaper
 
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
 
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...
 
Text To Speech Synthesis System For Marathi Language Using Concatenation Tech...
Text To Speech Synthesis System For Marathi Language Using Concatenation Tech...Text To Speech Synthesis System For Marathi Language Using Concatenation Tech...
Text To Speech Synthesis System For Marathi Language Using Concatenation Tech...
 
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
 
Design and Development of a Malayalam to English Translator- A Transfer Based...
Design and Development of a Malayalam to English Translator- A Transfer Based...Design and Development of a Malayalam to English Translator- A Transfer Based...
Design and Development of a Malayalam to English Translator- A Transfer Based...
 
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
 
A decision tree based word sense disambiguation system in manipuri language
A decision tree based word sense disambiguation system in manipuri languageA decision tree based word sense disambiguation system in manipuri language
A decision tree based word sense disambiguation system in manipuri language
 
A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES
A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES
A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES
 
INTEGRATION OF PHONOTACTIC FEATURES FOR LANGUAGE IDENTIFICATION ON CODE-SWITC...
INTEGRATION OF PHONOTACTIC FEATURES FOR LANGUAGE IDENTIFICATION ON CODE-SWITC...INTEGRATION OF PHONOTACTIC FEATURES FOR LANGUAGE IDENTIFICATION ON CODE-SWITC...
INTEGRATION OF PHONOTACTIC FEATURES FOR LANGUAGE IDENTIFICATION ON CODE-SWITC...
 

Más de iosrjce

An Examination of Effectuation Dimension as Financing Practice of Small and M...
An Examination of Effectuation Dimension as Financing Practice of Small and M...An Examination of Effectuation Dimension as Financing Practice of Small and M...
An Examination of Effectuation Dimension as Financing Practice of Small and M...iosrjce
 
Does Goods and Services Tax (GST) Leads to Indian Economic Development?
Does Goods and Services Tax (GST) Leads to Indian Economic Development?Does Goods and Services Tax (GST) Leads to Indian Economic Development?
Does Goods and Services Tax (GST) Leads to Indian Economic Development?iosrjce
 
Childhood Factors that influence success in later life
Childhood Factors that influence success in later lifeChildhood Factors that influence success in later life
Childhood Factors that influence success in later lifeiosrjce
 
Emotional Intelligence and Work Performance Relationship: A Study on Sales Pe...
Emotional Intelligence and Work Performance Relationship: A Study on Sales Pe...Emotional Intelligence and Work Performance Relationship: A Study on Sales Pe...
Emotional Intelligence and Work Performance Relationship: A Study on Sales Pe...iosrjce
 
Customer’s Acceptance of Internet Banking in Dubai
Customer’s Acceptance of Internet Banking in DubaiCustomer’s Acceptance of Internet Banking in Dubai
Customer’s Acceptance of Internet Banking in Dubaiiosrjce
 
A Study of Employee Satisfaction relating to Job Security & Working Hours amo...
A Study of Employee Satisfaction relating to Job Security & Working Hours amo...A Study of Employee Satisfaction relating to Job Security & Working Hours amo...
A Study of Employee Satisfaction relating to Job Security & Working Hours amo...iosrjce
 
Consumer Perspectives on Brand Preference: A Choice Based Model Approach
Consumer Perspectives on Brand Preference: A Choice Based Model ApproachConsumer Perspectives on Brand Preference: A Choice Based Model Approach
Consumer Perspectives on Brand Preference: A Choice Based Model Approachiosrjce
 
Student`S Approach towards Social Network Sites
Student`S Approach towards Social Network SitesStudent`S Approach towards Social Network Sites
Student`S Approach towards Social Network Sitesiosrjce
 
Broadcast Management in Nigeria: The systems approach as an imperative
Broadcast Management in Nigeria: The systems approach as an imperativeBroadcast Management in Nigeria: The systems approach as an imperative
Broadcast Management in Nigeria: The systems approach as an imperativeiosrjce
 
A Study on Retailer’s Perception on Soya Products with Special Reference to T...
A Study on Retailer’s Perception on Soya Products with Special Reference to T...A Study on Retailer’s Perception on Soya Products with Special Reference to T...
A Study on Retailer’s Perception on Soya Products with Special Reference to T...iosrjce
 
A Study Factors Influence on Organisation Citizenship Behaviour in Corporate ...
A Study Factors Influence on Organisation Citizenship Behaviour in Corporate ...A Study Factors Influence on Organisation Citizenship Behaviour in Corporate ...
A Study Factors Influence on Organisation Citizenship Behaviour in Corporate ...iosrjce
 
Consumers’ Behaviour on Sony Xperia: A Case Study on Bangladesh
Consumers’ Behaviour on Sony Xperia: A Case Study on BangladeshConsumers’ Behaviour on Sony Xperia: A Case Study on Bangladesh
Consumers’ Behaviour on Sony Xperia: A Case Study on Bangladeshiosrjce
 
Design of a Balanced Scorecard on Nonprofit Organizations (Study on Yayasan P...
Design of a Balanced Scorecard on Nonprofit Organizations (Study on Yayasan P...Design of a Balanced Scorecard on Nonprofit Organizations (Study on Yayasan P...
Design of a Balanced Scorecard on Nonprofit Organizations (Study on Yayasan P...iosrjce
 
Public Sector Reforms and Outsourcing Services in Nigeria: An Empirical Evalu...
Public Sector Reforms and Outsourcing Services in Nigeria: An Empirical Evalu...Public Sector Reforms and Outsourcing Services in Nigeria: An Empirical Evalu...
Public Sector Reforms and Outsourcing Services in Nigeria: An Empirical Evalu...iosrjce
 
Media Innovations and its Impact on Brand awareness & Consideration
Media Innovations and its Impact on Brand awareness & ConsiderationMedia Innovations and its Impact on Brand awareness & Consideration
Media Innovations and its Impact on Brand awareness & Considerationiosrjce
 
Customer experience in supermarkets and hypermarkets – A comparative study
Customer experience in supermarkets and hypermarkets – A comparative studyCustomer experience in supermarkets and hypermarkets – A comparative study
Customer experience in supermarkets and hypermarkets – A comparative studyiosrjce
 
Social Media and Small Businesses: A Combinational Strategic Approach under t...
Social Media and Small Businesses: A Combinational Strategic Approach under t...Social Media and Small Businesses: A Combinational Strategic Approach under t...
Social Media and Small Businesses: A Combinational Strategic Approach under t...iosrjce
 
Secretarial Performance and the Gender Question (A Study of Selected Tertiary...
Secretarial Performance and the Gender Question (A Study of Selected Tertiary...Secretarial Performance and the Gender Question (A Study of Selected Tertiary...
Secretarial Performance and the Gender Question (A Study of Selected Tertiary...iosrjce
 
Implementation of Quality Management principles at Zimbabwe Open University (...
Implementation of Quality Management principles at Zimbabwe Open University (...Implementation of Quality Management principles at Zimbabwe Open University (...
Implementation of Quality Management principles at Zimbabwe Open University (...iosrjce
 
Organizational Conflicts Management In Selected Organizaions In Lagos State, ...
Organizational Conflicts Management In Selected Organizaions In Lagos State, ...Organizational Conflicts Management In Selected Organizaions In Lagos State, ...
Organizational Conflicts Management In Selected Organizaions In Lagos State, ...iosrjce
 

Más de iosrjce (20)

An Examination of Effectuation Dimension as Financing Practice of Small and M...
An Examination of Effectuation Dimension as Financing Practice of Small and M...An Examination of Effectuation Dimension as Financing Practice of Small and M...
An Examination of Effectuation Dimension as Financing Practice of Small and M...
 
Does Goods and Services Tax (GST) Leads to Indian Economic Development?
Does Goods and Services Tax (GST) Leads to Indian Economic Development?Does Goods and Services Tax (GST) Leads to Indian Economic Development?
Does Goods and Services Tax (GST) Leads to Indian Economic Development?
 
Childhood Factors that influence success in later life
Childhood Factors that influence success in later lifeChildhood Factors that influence success in later life
Childhood Factors that influence success in later life
 
Emotional Intelligence and Work Performance Relationship: A Study on Sales Pe...
Emotional Intelligence and Work Performance Relationship: A Study on Sales Pe...Emotional Intelligence and Work Performance Relationship: A Study on Sales Pe...
Emotional Intelligence and Work Performance Relationship: A Study on Sales Pe...
 
Customer’s Acceptance of Internet Banking in Dubai
Customer’s Acceptance of Internet Banking in DubaiCustomer’s Acceptance of Internet Banking in Dubai
Customer’s Acceptance of Internet Banking in Dubai
 
A Study of Employee Satisfaction relating to Job Security & Working Hours amo...
A Study of Employee Satisfaction relating to Job Security & Working Hours amo...A Study of Employee Satisfaction relating to Job Security & Working Hours amo...
A Study of Employee Satisfaction relating to Job Security & Working Hours amo...
 
Consumer Perspectives on Brand Preference: A Choice Based Model Approach
Consumer Perspectives on Brand Preference: A Choice Based Model ApproachConsumer Perspectives on Brand Preference: A Choice Based Model Approach
Consumer Perspectives on Brand Preference: A Choice Based Model Approach
 
Student`S Approach towards Social Network Sites
Student`S Approach towards Social Network SitesStudent`S Approach towards Social Network Sites
Student`S Approach towards Social Network Sites
 
Broadcast Management in Nigeria: The systems approach as an imperative
Broadcast Management in Nigeria: The systems approach as an imperativeBroadcast Management in Nigeria: The systems approach as an imperative
Broadcast Management in Nigeria: The systems approach as an imperative
 
A Study on Retailer’s Perception on Soya Products with Special Reference to T...
A Study on Retailer’s Perception on Soya Products with Special Reference to T...A Study on Retailer’s Perception on Soya Products with Special Reference to T...
A Study on Retailer’s Perception on Soya Products with Special Reference to T...
 
A Study Factors Influence on Organisation Citizenship Behaviour in Corporate ...
A Study Factors Influence on Organisation Citizenship Behaviour in Corporate ...A Study Factors Influence on Organisation Citizenship Behaviour in Corporate ...
A Study Factors Influence on Organisation Citizenship Behaviour in Corporate ...
 
Consumers’ Behaviour on Sony Xperia: A Case Study on Bangladesh
Consumers’ Behaviour on Sony Xperia: A Case Study on BangladeshConsumers’ Behaviour on Sony Xperia: A Case Study on Bangladesh
Consumers’ Behaviour on Sony Xperia: A Case Study on Bangladesh
 
Design of a Balanced Scorecard on Nonprofit Organizations (Study on Yayasan P...
Design of a Balanced Scorecard on Nonprofit Organizations (Study on Yayasan P...Design of a Balanced Scorecard on Nonprofit Organizations (Study on Yayasan P...
Design of a Balanced Scorecard on Nonprofit Organizations (Study on Yayasan P...
 
Public Sector Reforms and Outsourcing Services in Nigeria: An Empirical Evalu...
Public Sector Reforms and Outsourcing Services in Nigeria: An Empirical Evalu...Public Sector Reforms and Outsourcing Services in Nigeria: An Empirical Evalu...
Public Sector Reforms and Outsourcing Services in Nigeria: An Empirical Evalu...
 
Media Innovations and its Impact on Brand awareness & Consideration
Media Innovations and its Impact on Brand awareness & ConsiderationMedia Innovations and its Impact on Brand awareness & Consideration
Media Innovations and its Impact on Brand awareness & Consideration
 
Customer experience in supermarkets and hypermarkets – A comparative study
Customer experience in supermarkets and hypermarkets – A comparative studyCustomer experience in supermarkets and hypermarkets – A comparative study
Customer experience in supermarkets and hypermarkets – A comparative study
 
Social Media and Small Businesses: A Combinational Strategic Approach under t...
Social Media and Small Businesses: A Combinational Strategic Approach under t...Social Media and Small Businesses: A Combinational Strategic Approach under t...
Social Media and Small Businesses: A Combinational Strategic Approach under t...
 
Secretarial Performance and the Gender Question (A Study of Selected Tertiary...
Secretarial Performance and the Gender Question (A Study of Selected Tertiary...Secretarial Performance and the Gender Question (A Study of Selected Tertiary...
Secretarial Performance and the Gender Question (A Study of Selected Tertiary...
 
Implementation of Quality Management principles at Zimbabwe Open University (...
Implementation of Quality Management principles at Zimbabwe Open University (...Implementation of Quality Management principles at Zimbabwe Open University (...
Implementation of Quality Management principles at Zimbabwe Open University (...
 
Organizational Conflicts Management In Selected Organizaions In Lagos State, ...
Organizational Conflicts Management In Selected Organizaions In Lagos State, ...Organizational Conflicts Management In Selected Organizaions In Lagos State, ...
Organizational Conflicts Management In Selected Organizaions In Lagos State, ...
 

Último

Design Portfolio - 2024 - William Vickery
Design Portfolio - 2024 - William VickeryDesign Portfolio - 2024 - William Vickery
Design Portfolio - 2024 - William VickeryWilliamVickery6
 
Call Us ✡️97111⇛47426⇛Call In girls Vasant Vihar༒(Delhi)
Call Us ✡️97111⇛47426⇛Call In girls Vasant Vihar༒(Delhi)Call Us ✡️97111⇛47426⇛Call In girls Vasant Vihar༒(Delhi)
Call Us ✡️97111⇛47426⇛Call In girls Vasant Vihar༒(Delhi)jennyeacort
 
group_15_empirya_p1projectIndustrial.pdf
group_15_empirya_p1projectIndustrial.pdfgroup_15_empirya_p1projectIndustrial.pdf
group_15_empirya_p1projectIndustrial.pdfneelspinoy
 
办理学位证(TheAuckland证书)新西兰奥克兰大学毕业证成绩单原版一比一
办理学位证(TheAuckland证书)新西兰奥克兰大学毕业证成绩单原版一比一办理学位证(TheAuckland证书)新西兰奥克兰大学毕业证成绩单原版一比一
办理学位证(TheAuckland证书)新西兰奥克兰大学毕业证成绩单原版一比一Fi L
 
MT. Marseille an Archipelago. Strategies for Integrating Residential Communit...
MT. Marseille an Archipelago. Strategies for Integrating Residential Communit...MT. Marseille an Archipelago. Strategies for Integrating Residential Communit...
MT. Marseille an Archipelago. Strategies for Integrating Residential Communit...katerynaivanenko1
 
昆士兰大学毕业证(UQ毕业证)#文凭成绩单#真实留信学历认证永久存档
昆士兰大学毕业证(UQ毕业证)#文凭成绩单#真实留信学历认证永久存档昆士兰大学毕业证(UQ毕业证)#文凭成绩单#真实留信学历认证永久存档
昆士兰大学毕业证(UQ毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
毕业文凭制作#回国入职#diploma#degree澳洲弗林德斯大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲弗林德斯大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree 毕业文凭制作#回国入职#diploma#degree澳洲弗林德斯大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲弗林德斯大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree ttt fff
 
CREATING A POSITIVE SCHOOL CULTURE CHAPTER 10
CREATING A POSITIVE SCHOOL CULTURE CHAPTER 10CREATING A POSITIVE SCHOOL CULTURE CHAPTER 10
CREATING A POSITIVE SCHOOL CULTURE CHAPTER 10uasjlagroup
 
原版美国亚利桑那州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
原版美国亚利桑那州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree原版美国亚利桑那州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
原版美国亚利桑那州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
(办理学位证)埃迪斯科文大学毕业证成绩单原版一比一
(办理学位证)埃迪斯科文大学毕业证成绩单原版一比一(办理学位证)埃迪斯科文大学毕业证成绩单原版一比一
(办理学位证)埃迪斯科文大学毕业证成绩单原版一比一Fi sss
 
Top 10 Modern Web Design Trends for 2025
Top 10 Modern Web Design Trends for 2025Top 10 Modern Web Design Trends for 2025
Top 10 Modern Web Design Trends for 2025Rndexperts
 
PORTAFOLIO 2024_ ANASTASIYA KUDINOVA
PORTAFOLIO   2024_  ANASTASIYA  KUDINOVAPORTAFOLIO   2024_  ANASTASIYA  KUDINOVA
PORTAFOLIO 2024_ ANASTASIYA KUDINOVAAnastasiya Kudinova
 
办理(UC毕业证书)查尔斯顿大学毕业证成绩单原版一比一
办理(UC毕业证书)查尔斯顿大学毕业证成绩单原版一比一办理(UC毕业证书)查尔斯顿大学毕业证成绩单原版一比一
办理(UC毕业证书)查尔斯顿大学毕业证成绩单原版一比一z xss
 
在线办理ohio毕业证俄亥俄大学毕业证成绩单留信学历认证
在线办理ohio毕业证俄亥俄大学毕业证成绩单留信学历认证在线办理ohio毕业证俄亥俄大学毕业证成绩单留信学历认证
在线办理ohio毕业证俄亥俄大学毕业证成绩单留信学历认证nhjeo1gg
 
Unveiling the Future: Columbus, Ohio Condominiums Through the Lens of 3D Arch...
Unveiling the Future: Columbus, Ohio Condominiums Through the Lens of 3D Arch...Unveiling the Future: Columbus, Ohio Condominiums Through the Lens of 3D Arch...
Unveiling the Future: Columbus, Ohio Condominiums Through the Lens of 3D Arch...Yantram Animation Studio Corporation
 
2024新版美国旧金山州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
2024新版美国旧金山州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree2024新版美国旧金山州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
2024新版美国旧金山州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degreeyuu sss
 
FiveHypotheses_UIDMasterclass_18April2024.pdf
FiveHypotheses_UIDMasterclass_18April2024.pdfFiveHypotheses_UIDMasterclass_18April2024.pdf
FiveHypotheses_UIDMasterclass_18April2024.pdfShivakumar Viswanathan
 
Design principles on typography in design
Design principles on typography in designDesign principles on typography in design
Design principles on typography in designnooreen17
 
Architecture case study India Habitat Centre, Delhi.pdf
Architecture case study India Habitat Centre, Delhi.pdfArchitecture case study India Habitat Centre, Delhi.pdf
Architecture case study India Habitat Centre, Delhi.pdfSumit Lathwal
 
西北大学毕业证学位证成绩单-怎么样办伪造
西北大学毕业证学位证成绩单-怎么样办伪造西北大学毕业证学位证成绩单-怎么样办伪造
西北大学毕业证学位证成绩单-怎么样办伪造kbdhl05e
 

Último (20)

Design Portfolio - 2024 - William Vickery
Design Portfolio - 2024 - William VickeryDesign Portfolio - 2024 - William Vickery
Design Portfolio - 2024 - William Vickery
 
Call Us ✡️97111⇛47426⇛Call In girls Vasant Vihar༒(Delhi)
Call Us ✡️97111⇛47426⇛Call In girls Vasant Vihar༒(Delhi)Call Us ✡️97111⇛47426⇛Call In girls Vasant Vihar༒(Delhi)
Call Us ✡️97111⇛47426⇛Call In girls Vasant Vihar༒(Delhi)
 
group_15_empirya_p1projectIndustrial.pdf
group_15_empirya_p1projectIndustrial.pdfgroup_15_empirya_p1projectIndustrial.pdf
group_15_empirya_p1projectIndustrial.pdf
 
办理学位证(TheAuckland证书)新西兰奥克兰大学毕业证成绩单原版一比一
办理学位证(TheAuckland证书)新西兰奥克兰大学毕业证成绩单原版一比一办理学位证(TheAuckland证书)新西兰奥克兰大学毕业证成绩单原版一比一
办理学位证(TheAuckland证书)新西兰奥克兰大学毕业证成绩单原版一比一
 
MT. Marseille an Archipelago. Strategies for Integrating Residential Communit...
MT. Marseille an Archipelago. Strategies for Integrating Residential Communit...MT. Marseille an Archipelago. Strategies for Integrating Residential Communit...
MT. Marseille an Archipelago. Strategies for Integrating Residential Communit...
 
昆士兰大学毕业证(UQ毕业证)#文凭成绩单#真实留信学历认证永久存档
昆士兰大学毕业证(UQ毕业证)#文凭成绩单#真实留信学历认证永久存档昆士兰大学毕业证(UQ毕业证)#文凭成绩单#真实留信学历认证永久存档
昆士兰大学毕业证(UQ毕业证)#文凭成绩单#真实留信学历认证永久存档
 
毕业文凭制作#回国入职#diploma#degree澳洲弗林德斯大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲弗林德斯大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree 毕业文凭制作#回国入职#diploma#degree澳洲弗林德斯大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
毕业文凭制作#回国入职#diploma#degree澳洲弗林德斯大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
CREATING A POSITIVE SCHOOL CULTURE CHAPTER 10
CREATING A POSITIVE SCHOOL CULTURE CHAPTER 10CREATING A POSITIVE SCHOOL CULTURE CHAPTER 10
CREATING A POSITIVE SCHOOL CULTURE CHAPTER 10
 
原版美国亚利桑那州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
原版美国亚利桑那州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree原版美国亚利桑那州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
原版美国亚利桑那州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
(办理学位证)埃迪斯科文大学毕业证成绩单原版一比一
(办理学位证)埃迪斯科文大学毕业证成绩单原版一比一(办理学位证)埃迪斯科文大学毕业证成绩单原版一比一
(办理学位证)埃迪斯科文大学毕业证成绩单原版一比一
 
Top 10 Modern Web Design Trends for 2025
Top 10 Modern Web Design Trends for 2025Top 10 Modern Web Design Trends for 2025
Top 10 Modern Web Design Trends for 2025
 
PORTAFOLIO 2024_ ANASTASIYA KUDINOVA
PORTAFOLIO   2024_  ANASTASIYA  KUDINOVAPORTAFOLIO   2024_  ANASTASIYA  KUDINOVA
PORTAFOLIO 2024_ ANASTASIYA KUDINOVA
 
办理(UC毕业证书)查尔斯顿大学毕业证成绩单原版一比一
办理(UC毕业证书)查尔斯顿大学毕业证成绩单原版一比一办理(UC毕业证书)查尔斯顿大学毕业证成绩单原版一比一
办理(UC毕业证书)查尔斯顿大学毕业证成绩单原版一比一
 
在线办理ohio毕业证俄亥俄大学毕业证成绩单留信学历认证
在线办理ohio毕业证俄亥俄大学毕业证成绩单留信学历认证在线办理ohio毕业证俄亥俄大学毕业证成绩单留信学历认证
在线办理ohio毕业证俄亥俄大学毕业证成绩单留信学历认证
 
Unveiling the Future: Columbus, Ohio Condominiums Through the Lens of 3D Arch...
Unveiling the Future: Columbus, Ohio Condominiums Through the Lens of 3D Arch...Unveiling the Future: Columbus, Ohio Condominiums Through the Lens of 3D Arch...
Unveiling the Future: Columbus, Ohio Condominiums Through the Lens of 3D Arch...
 
2024新版美国旧金山州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
2024新版美国旧金山州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree2024新版美国旧金山州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
2024新版美国旧金山州立大学毕业证成绩单pdf电子版制作修改#毕业文凭制作#回国入职#diploma#degree
 
FiveHypotheses_UIDMasterclass_18April2024.pdf
FiveHypotheses_UIDMasterclass_18April2024.pdfFiveHypotheses_UIDMasterclass_18April2024.pdf
FiveHypotheses_UIDMasterclass_18April2024.pdf
 
Design principles on typography in design
Design principles on typography in designDesign principles on typography in design
Design principles on typography in design
 
Architecture case study India Habitat Centre, Delhi.pdf
Architecture case study India Habitat Centre, Delhi.pdfArchitecture case study India Habitat Centre, Delhi.pdf
Architecture case study India Habitat Centre, Delhi.pdf
 
西北大学毕业证学位证成绩单-怎么样办伪造
西北大学毕业证学位证成绩单-怎么样办伪造西北大学毕业证学位证成绩单-怎么样办伪造
西北大学毕业证学位证成绩单-怎么样办伪造
 

Implementation of Marathi Language Speech Databases for Large Dictionary

  • 1. IOSR Journal of VLSI and Signal Processing (IOSR-JVSP) Volume 5, Issue 6, Ver. I (Nov -Dec. 2015), PP 40-45 e-ISSN: 2319 – 4200, p-ISSN No. : 2319 – 4197 www.iosrjournals.org DOI: 10.9790/4200-05614045 www.iosrjournals.org 40 | Page Implementation of Marathi Language Speech Databases for Large Dictionary Sangramsing Kayte1 , Monica Mundada1 , 2 Dr. Charansing Kayte 1 Department of Computer Science and Information Technology Dr. Babasaheb Ambedkar Marathwada University, Aurangabad 2 Department of Digital and Cyber Forensic, Aurangabad Maharashtra, India Abstract: In this research paper, we discuss our efforts in the development of Marathi language speech databases in Marathi for building large vocabulary. We have collected speech data from about 5 speakers in these one languages. We discuss the design and methodology of collection of speech databases. We also present preliminary speech recognition results using the acoustic models created on these databases using Sphinx 2 and Festvox speech tool kit. I. Introduction Most of the Data in digital world is accessible to a few who can read or understand a particular language. Language technologies can provide solutions in the form of natural interfaces so that digital content can reach to the masses and facilitate the exchange of information across different people speaking different languages. These technologies play a crucial role in multi-lingual societies such as India which has about 1652 dialects/native languages. While Hindi written in Devanagari script, is the official language, the other 17 languages recognized by the constitution of India are: 1) Assamese 2) Tamil 3) Malayalam 4) Gujarati 5) Telugu 6) Oriya 7) Urdu 8) Bengali 9) Sanskrit 10) Kashmiri 11) Sindhi 12) Punjabi 13) Konkani 14) Marathi 15) Manipuri 16) Kannada and 17) Nepali. Seamless integration of speech recognition, machine translation and speech synthesis systems could facilitate the exchange of information between two people speaking two different languages. Our overall goal is to develop speech recognition and speech synthesis systems for most of these languages [1]. In the context of developing large vocabulary continuous speech recognition systems, there has been effort by IBM to build a large vocabulary continuous speech recognition system in Hindi [2] by bootstrapping existing English acoustic models. A more recent development was a Hindi recognition system from HP Labs, India [3] which involved Hindi speech corpus collection and subsequent system building. In this paper, we discuss the development of speech databases in Telugu, Tamil and Marathi languages for building large vocabulary speech recognition systems. We also discuss some preliminary speech recognition results obtained on these databases. We hope that this work will serve as a benchmark and impetus for the development of large speech corpora and large vocabulary continuous speech recognition systems in other Indian languages II. Text Corpora for Speech Databases The first step we followed in creating a speech database for building an Automatic Speech Recognizer (ASR) is the generation of an optimal set of textual sentences to be recorded from the native speakers of the language [3]. The selected sentences should be minimal in number to save on manual recording effort and at the same time have enough occurrences of each type of sounds to capture all types of coarticulation effects in the chosen language. In this section, the various stages involved in the generation of the optimal text are described. 2.1.Text Corpus Collection To select a set of phonetically rich sentences, one of the important decisions to be made is the choice of a huge text corpus source from which the optimal sub-set has to be extracted. The reliability and coverage of the optimal text and of the language model largely depends on the quality of the text corpus chosen. The corpus should be unbiased and large enough to convey the entire syntactic behavior of the language. We have collected the text corpora for the Marathi languages we are working on, from commonly available error free linguistic text corpora of the languages wherever available in sufficient size. We have mostly used the CIIL Corpus [4] for all the languages. In case of Marathi where the available linguistic corpus contained a lot of errors, we have manually corrected some part of the corpora and also collected representative corpus of the language by crawling content from websites of sakel, Lokamet newspapers of the respective languages. The table below shows the size of the corpora and some other statistics [5].
  • 2. Implement of Marathi Language Speech Databases for Large DOI: 10.9790/4200-05614045 www.iosrjournals.org 41 | Page Table 1: Analysis of the text Corpora Language No. of Sentences No. of Unique Words No. of Words Marathi 102521 133273 1037647 2.1.1. Font Converters To ease the analysis and work with the text corpus, it is required that the text is in a representation that can be easily processed. Normally the text of a language from any web sources will be available in a specific font. So, font converters have to be made to convert the corpus into the desired electronic character code. The font will be reflective of the language and will have as many unique codes for each of the characters in the language. It is necessary to ensure that there are no mapping and conversion errors. 2.2.Grapheme to phoneme Converters The Text corpus has to be phonetized so that the distribution of the basic recognition units, the phones, diphones, syllables, etc in the text corpus can be analyzed. Phonetizers or the converters are the tools that convert the text corpus into its phonetic equivalent. Phonetizers are also required for generating the phonetic lexicon, an important component of ASR systems [3]. The lexicon is a representation of each entry of the vocabulary word set of the system in its phonetic form. The lexicon dictates to the decoder, the phonetic composition, or the pronunciation of an entry. The process of corpora phonetization or the development of phonetic lexicons for the western languages is traditionally done by linguists. These lexicons are subject to constant refinement and modification. But the phonetic nature of Indian scripts reduces the effort to building mere mapping tables and rules for the lexical representation. These rules and the mapping tables together comprise the phonetizers or the Grapheme to Phoneme converters. The Telugu and the Marathi were derived from the HP Labs Hindi [6]. Special ligatures have been used to denote nasalization, homoorganic nasals and dependent vowels. Two rules were added to address the issues specific to Marathi. A rule was added to nasalize the consonant that follows an anuswara. Another rule was added to substitute the Hindi chandrabindu with the consonant n. The Tamil G2P is a one to one mapping table. Certain exceptions arise in the case of the vowels oa and au vowels where the vowel comes before the consonant in the orthography but the pronunciation has the vowel sound following the consonant. Such combinations have been identified and have been split accordingly so as to ensure the one to one correspondence between the pronunciation and the phonetic lexicon [7] [8]. 2.3.Optimal Text Selection As mentioned in the earlier sections, the optimal set of sentences is extracted from the phonetized corpus using an Optimal Text Selectionsystem which selects the sentences which are phonetically rich. These sentences are distributed among the intended number of speakers such that good speaker variability is achieved in the database. Also it is attempted to distribute the optimal text among speakers such that each speaker gets to speak as many phonetic contexts as possible, The best-known OTS algorithm is the greedy algorithm as applied to set covering problem [4]. In the greedy approach, a sentence is selected from the corpus if it best satisfies a desired criterion. In general, the criterion for the optimal text selection would be the maximum number of diphones or a highest score computed by a scoring function based on the type of the diphones that the sentence is composed of. A threshold based approach has been implemented for OTS of the Marathi language ASRs. This algorithm performs as well as the classical greedy algorithm in a significantly lesser time. A threshold is maintained on the number of new diphones to be covered in a new sentence. A set of sentences is thus selected in each iteration. The optimal text, hence selected, has a full coverage of the diphones present in the corpus. These sentences are distributed among the speakers [9]. The table 2 gives the di-phone coverage statistics of the optimal texts of each language. Table 2: Di-phone coverage in the Marathilanguages [9] Language Phones Diphones covered Marathi 74 1779 III. Speech Data Collection In this research section, the various steps difficult in building the speech corpus are detailed. Firstly, the recording media is chosen so as to capture the effects due to channel and microphone variations. For the databases that were built for the Marathi language ASRs, the speech data was recorded over calculated number of landline and cellular phones using a multi-channel computer telephony interface card [5] [9].
  • 3. Implement of Marathi Language Speech Databases for Large DOI: 10.9790/4200-05614045 www.iosrjournals.org 42 | Page 3.1.Speaker Selection Speech data is collected from the native speakers of the language who were comfortable in speaking and reading the language. The speakers were chosen such that all the diversities attributing to the gender, age and dialect are sufficiently captured. The recording is clean and has minimal background disturbance. Any mistakes made while recording have been undone by re-recording or by making the corresponding changes in the transcription set. The various statistics of the data collected for the Indian Language ASRs are given in the next subsection [10]. 3.2.Data Statistics Speakers from various parts of the respective states (regions) were carefully recorded in order to cover all possible dialectic variations of the language. Each speaker has recorded 52 sentences of the optimal text. Table 3 gives the number of speakers recorded in each language and in each of the recording modes landline and cellphones. To capture different microphonic variations, four different cellphones were used while recording the speakers. Table 4 gives the age wise distribution of the speakers in the Marathi languages. Table 5 shows the gender wise distribution of the speakers in each of the Marathi languages [9]. Table 3: Number of speakers in the Marathi languages Language Landline Cellphone Total Marathi 92 84 176 Table 4: Age wise speaker distribution Language 20- 25 25-30 30-35 35-40 > 40 Marathi 77 56 26 12 5 Table 5: Gender distribution among speakers 3.3.Transcription Correction Despite the care taken to record the speech with minimal background noise and mistakes in pronunciation, some errors have crept in while recording. These errors had to be identified manually by listening to the speech. If felt unsuitable, some of the utterances have been discarded.In the case of the data collected in the Marathi languages, the transcriptions were manually edited and ranked based on the goodness of the speech recorded. The utterances were classified as Good, With Channel distortion, With Background Noiseand Uselesswhichever is true. The pronunciation mistakes were carefully identified and if possible the corresponding changes were made in the transcriptions so that the utterance and the transcription correspond to each other. The idea behind the classification was to make the utmost utilization of the data and to serve as a corpus for further related research work. IV. Building ASR systems Typically, ASR systems comprise three major constituents- the acoustic models, the language models and the phonetic lexicon. In the subsections that follow, each of these components is discussed. 4.1. Acoustic Model The first key constituent of the ASR systems is the acoustic model. Acoustic models capture the characteristics of the basic recognition units [3]. The recognition units can be at the word level, syllable level and or at the phoneme level. Many constraints and inadequacies come into picture with the selection of each of these units. For large vocabulary continuous speech recognition systems (LVCSR), phoneme is the preferred unit. Neural networks (NN) and Hidden Markov models are the most commonly used for acoustic modeling of ASR systems. We have chosen the Semi-Continuous Hidden Markov models (SCHMMs) [1] to represent context dependent phones (trip-hones). At the time of recognition, various words are hypothesized against the speech signal. To compute the likelihood of a word, the lexicon is referred and the word is broken into its constituent phones. The phone likelihood is computed from the HMMs. The combined likelihood of all the phones represents the likelihood of the word in the acoustic model. The tri-phone acoustic models built for the Indian language ASRs are 5 state SCHMMs with states clustered using decision trees. The models were built using CMU Sphinx II trainer. Two Language Male Female Marathi 91 85
  • 4. Implement of Marathi Language Speech Databases for Large DOI: 10.9790/4200-05614045 www.iosrjournals.org 43 | Page different models were trained from Landline and Cellphone data. The training data was carefully selected to have an even distribution across all the possible variations. For each of the systems, the speech data collected was divided into trainingand the testsets based upon the demographic details. 70% of the speakers were included in the training set and the rest 30% was used to test the respective systems. 4.2. Language Model The language model attempts to convey the behavior of the language. By the means of probabilistic information, the language model provides the syntax of the language. It aims to predict the occurrence of specific word sequences possible in the language. From the perspective of the recognition engine, the language model helps narrow down the search space for a valid combination of words [3]. Language models help guide and constrain the search among alternative word hypotheses during decoding. Most ASR systems use the stochastic language models (SLM). Speech recognizers seek the word sequence Wb which is most likely to be produced from the acoustic evidence A. P (Wb | A) = maxw P (W|A) = maxw P (A|W) P (W)/P (A) P (A), the probability of acoustic evidence is normally omitted as it is common for any word sequence. The recognized word sequence is the word combination W b which maximizes the above equation. LMs assign a probability estimate P (W) to word sequences W={w 1,w2, wn}. These probabilities can be trained from a corpus. Since there is only a finite coverage of word sequences, some smoothing has to be done to discount the frequencies to those of the lower order. SLMs use the N-gram LM where it is assumed that the probability of occurrence of a word is dependent only on the past N-1 words. The trigram model [N=3], based on the previous two words is particularly powerful, as most words have a strong dependence on the previous two words and it can be estimated reasonably well with an attainable corpus. The trigram based language model with back-off was used for recognition. The LM was created using the CMU statistical LM toolkit [11]. For the baseline systems developed, the LM was generated from the training and the testing transcripts. The unigrams of the language model were taken as the lexical entries for the systems. V. Baseline System Performance This section presents the performance of the recognition systems. Two systems were built in each language using the landline and the mobile speech data separately. The tests on these recognition systems were done using the speech data of the untrained 30% of the speakersdata. Goodutterances of these speakers were decoded using the Sphinx II decoder. Appropriate tuning was done on the decoder to get the best performance. The evaluation of the experiment was made according to the recognition accuracy and computed using the word error rate [WER] metric which aligns a recognized word string against the correct word string and computes the number of substitutions (S), deletions (D) and Insertions (I) and the number of words in the correct sentence (N). W.E.R = 100 * (S+D+I) / N In the following subsections, the gradual improvement in the performance of the ASRs is shown. The subsection 5.1 discusses the improvements obtained on iterations of Forced alignment. The subsection 5.2 deals with the use of an extended phoneme set. 5.1.Forced Alignments This is a well-known technique to improve the HMM of the Acoustic Model. The technique sounds quite simple but the performance of the ASR dramatically improves by nearly 10% at times. The method involves insertion of silence at appropriate places in the training transcription and retraining the models. Using existing acoustic models, the silence is detected by aligning the speech signal against the trained acoustic models and the corresponding transcription. The new transcriptions have silence markers at appropriate places, the intermediate tied states and hence the overall models are improved. Multiple Forced alignment iterations could be done to further improve the model. The performance eventually converges after a few iterations. The table 6 shows the improvement in the performance of the recognition systems built from the landline data of each language. The vocabulary sizes of the various systems built is shown in the table 7. Table 6: Improvements in the WER of the landline recognition systems through a couple of Forced Alignment iterations. Language Initial Performance 1st iteration 2nd iteration Marathi 27.3 24 23.2 Table 7 Vocabulary size of each ASR (Number of lexical entries) ASR System Vocabulary Marathi 21640
  • 5. Implement of Marathi Language Speech Databases for Large DOI: 10.9790/4200-05614045 www.iosrjournals.org 44 | Page 5.2.Significance of Extended Phone Set As mentioned earlier, lexicon defines the pronunciation of each lexical entry. The performance of the ASR system depends on the phone set chosen. New sub word units can be identified and be introduced into the phone set. The choice, however, should be such that, the variability of the phones and the complexity of the HMM is reduced. It is also to be ensured that there are sufficient instances of all the phones. The phonetic split in the lexicon will change accordingly as does the phone set. The Initial experiments were done using the standard ISCII phone sets of the Marathi languages [12]. To improve the phone set, we have added the geminates or the stressed consonants as standard phones. Stressed consonants, though represented as repetition of a consonant twice in the orthography, do not follow the same pattern in pronunciation [3]. The consonant is instead stressed, attributing some amount of plosiveness to it. The spectrogram of a consonant uttered twice would display two similar observations coarticulated due to one another. The spectrogram of a stressed consonant, on the other hand, would vividly display a blob of energy due to the stress. This clearly indicates that stressed consonants need to be treated separately and not as variants of the parent consonant for acoustic modeling. The table 8 shows the improved performances of all systems employing an extended phoneset. It also shows the size of the lexicon of the systems. Table 8: Improvements in %WER by including the stressed consonants. ASR System Initial Improved Vocabulary Marathi Landline 23.2 20.7 21640 Marathi Mobile 25.0 23.6 18912 VI. Speech to-Speech Application The speech-speech demo was an attempt to develop an interesting application using the above systems and integrating it with existing systems. It is a tourist aid application with a 145 word vocabulary and 1000 legitimate sentence templates within the tourist domain. The application converts a spoken source language query into the target language utterance. A template based language translation module has been simulated to convert the recognized text query into the target language. The Telugu and the Tamil Text-to-Speech (TTS) [13] [14] engines were integrated to synthesize and speak out the translated query in the target language. Figure1: Schematic representation of the Application As illustrated in the schematic representation, the application has three components, the recognition module (ASR), the translation module and the synthesis module (TTS). The Speech query of the user is taken by the ASR and is decoded into a text query of the source language. This goes as an input to the simulated language translation system. This identifies the template of the source language text query and converts it into the corresponding template of the target language. The TTS engine converts the output into spoken speech query of the target language. With the limitation of the available ASR and TTS engines, we could render the translation within the following source-target language pairs. (i) Marathi to Hindi (ii) Hindi to Marathi The finite size of the legitimate utterances has been exploited, giving a significant bias to the language model. Given minimal background disturbances, the performance of the application was close to 100%.
  • 6. Implement of Marathi Language Speech Databases for Large DOI: 10.9790/4200-05614045 www.iosrjournals.org 45 | Page VII. Conclusion and Future work In this research paper, we discussed the optimal design and development of speech databases for Marathi languages. We hope the simple methodology of database creation presented will serve as catalyst for the creation of speech databases in all other Indian languages. We also created speech recognizers and presented preliminary results on these databases. We hope the ASRs created will serve as baseline systems for further research on improving the accuracies in each of the languages. Our future work is focused in tuning these models and test them using language models built using a larger corpus. References [1]. Sangramsing Kayte, Dr. Bharti Gawali “Marathi Speech Synthesis: A review” International Journal on Recent and Innovation Trends in Computing and Communication ISSN: 2321-8169 Volume: 3 Issue: 6 3708 – 3711 [2]. Sangramsing Kayte, Monica Mundada, Dr. CharansingKayte "A Review of Unit Selection Speech Synthesis International Journal of Advanced Research in Computer Science and Software Engineering -Volume 5, Issue 10, October-2015 [3]. Singh, S. P., et al Building Large Vocabulary Speech Recognition Systems for Indian Languages , International Conference on Natural Language Processing, 1:245-254, 2004. [4]. http://tdil.mit.gov.in/corpora/ach-corpora.htm#tech Marathi CIIL corpus. [5]. Sangramsing Kayte, Monica Mundada "Study of Marathi Phones for Synthesis of Marathi Speech from Text" International Journal of Emerging Research in Management &Technology ISSN: 2278-9359 (Volume-4, Issue-10) October 2015 [6]. Kalika Bali, ParthaPratimTalukdar, , Tools for the development of a Hindi Speech Synthesis System, 5th ISCA Speech Synthesis Workshop, Pittsburgh, pp.109- 114,2004 [7]. Monica Mundada, Bharti Gawali, Sangramsing Kayte "Recognition and classification of speech and its related fluency disorders" International Journal of Computer Science and Information Technologies (IJCSIT) [8]. Monica Mundada, Sangramsing Kayte, Dr. Bharti Gawali "Classification of Fluent and Dysfluent Speech Using KNN Classifier" International Journal of Advanced Research in Computer Science and Software Engineering Volume 4, Issue 9, September 2014 [9]. Sangramsing Kayte, Monica Mundada, Dr. CharansingKayte "Di-phone-Based Concatenative Speech Synthesis System for Hindi" International Journal of Advanced Research in Computer Science and Software Engineering -Volume 5, Issue 10, October-2015 [10]. Sangramsing Kayte, Monica Mundada, Santosh Gaikwad, Bharti Gawali "PERFORMANCE EVALUATION OF SPEECH SYNTHESIS TECHNIQUES FOR ENGLISH LANGUAGE " International Congress on Information and Communication Technology 9-10 October, 2015 [11]. Rosenfeld Roni, CMU statistical Language Modelling (SLM) Toolkit (Version 2). [12]. IS13194:1991, Indian Script Code for Information Interchange ISCII-91, Bureau of Indian Standards, April,1999 [13]. Rohit Kumar et al. "Unit Selection Approach for building Indian Language TTS", Interspeech 2004, [14]. Sangramsing N.kayte “Marathi Isolated-Word Automatic Speech Recognition System based on Vector Quantization (VQ) approach” 101th Indian Science Congress Jammu University 03th Feb to 07 Feb 2014.