SlideShare una empresa de Scribd logo
1 de 8
Descargar para leer sin conexión
International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064
Volume 2 Issue 9, September 2013
www.ijsr.net
Applying Rule-Based Maximum Matching
Approach for Verb Phrase Identification and
Translation (Myanmar to English)
Soe, Thae Thae1
, Thida, Aye2
1
University of Computer Studies, Mandalay, Myanmar
2
University of Computer Studies, Mandalay, Myanmar
Abstract: Phrase Identification is one of the most critical and widely studied in Natural Language processing (NLP) tasks. Verb
Phrase Identification within a sentence is very useful for a variety of application on NLP. One of the core enabling technologies
required in NLP applications is a Morphological Analysis. This paper presents the Myanmar Verb Phrase Identification and
Translation Algorithm and develops a Markov Model with Morphological Analysis. The system is based on Rule-Based Maximum
Matching Approach. In Machine Translation, Large amount of information is needed to guide the translation process. Myanmar
Language is inflected language and there are very few creations and researches of Lexicon in Myanmar, comparing to other language
such as English, French and Czech etc. Therefore, this system is proposed Myanmar Verb Phrase identification and translation model
based on Syntactic Structure and Morphology of Myanmar Language by using Myanmar- English bilingual lexicon. Markov Model is
also used to reformulate the translation probability of Phrase pairs. Experiment results showed that proposed system can improve
translation quality by applying morphological analysis on Myanmar Language.
Keywords: Myanmar verb phrase identification and translation, morphological analysis, Rule-Based Maximum Matching
1. Introduction
Language plays an important role in human communication
because it is used as a channel not only for expressing
thoughts but also for exchanging information. In the age of
Information Technology, The Internet has become a primary
source for people to exchange their thoughts and information
.It is simply and convenience for all people around the word.
However, they have difficult to communicate among them
because of different their native languages. Some people are
familiar with two or more kind of Languages, spoken and
written languages but most are not. Due to these difficulties
and increased use of network, there is an increased need for
language translation to facilitate among people in
communication, publication and learning subjects. Attempts
of language translation are almost as old as computer
themselves. Machine Translation (MT) is the attempt to
automate all or part of the process of translation between
human languages and is one of the oldest large-scale
applications of computer science. Developing a system that
accurately produces a good translation between human
languages is the goal of MT system.
Human Language translation is a difficult task for natural
language because there has language ambiguity and it varies
according to their features and nature. Myanmar word
transformations are similar to other Asian Language
including Indian, Japanese, Thai and Chinese Language. The
problem of Machine Translation can be view as consisting of
three phrases (i) analysis of the source language to choose
appropriate target language lexical item (words or phrases) ,
(ii) reordering phrase where the chosen target language
lexical items are reordering to produce a meaningful target
language sentence and (iii) disambiguation of words senses
where the correct meaning of words is chosen for
translation.
The Myanmar-English MT system is developed by
composing two main modules which are identification and
translation. First, module, identify the Myanmar Verb Phrase
from input of Myanmar Sentence. And then, second module,
translate the Myanmar Verb Phrase into English Verb Phrase
using Myanmar English Bilingual Lexicon. Each step in
Machine Translation process is hard technical problem, to
which the best known solutions are either not adequate, or
good enough only in narrow application domains, falling
when applied to other domains. The proposed system is
concentrated on improving one of these two steps, namely
identification and translation, while having in mind that
some of the core techniques can be applied to other parts of a
Machine Translation (MT).
There are many research fields in Natural Language
Processing system and Machine Translation System. There is
no one who has developed complete Machine Translation
System for Myanmar to English language. Therefore, this
research aims to emphasize and develop the identification
and translation of Myanmar verb phrase which is a part of
Myanmar-English Machine Translation System.
Paper ID: 06091303 90
International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064
Volume 2 Issue 9, September 2013
www.ijsr.net
The rest of this paper is organized as follows: In section 2,
previous works in phrase identification for machine
translation is presented. Section 3, presents Nature of
Myanmar Language and Myanmar sentence structure. The
proposed system is presented in section 4. Section 5
presented types of Myanmar Verb Phrase and section 6
described morphological analysis for Myanmar Verb Phrase.
Finally section 7, 8, 9 and .10 discusses about the results and
error analysis of proposed system and conclusion.
2. Related Work
In this section, previous works in the structure of verb
Phrase identification and machine translation on different
language are reviewed. Various researchers have improved
the quality of machine translation system by using different
methods on different language. Wajid Ali et al proposed the
structure of Urdu verb phrases, and detail a series of
experiment to automatically tag them. A 100,000 words
Urdu corpus is manually tagged with VP chunk tags. The
corpus is then used to develop a hybrid approach using
HMM based statistical chunking and correction rules [11].
The technique is enhanced by changing chunking direction
and merging chunk and POS tags. . Kim, Changhyun et.al
described [5] Korean-Chinese machine translation system.
This system includes source language pattern part for
analysis and a target language pattern part for generation.
Basically used Pattern-based knowledge and translates
Korean Verb Phrase into Chinese Verb Phrase. M. Selvame
et al presented an improvement of Rule Based morphological
Analysis and POS Tagging in Tamil Language [8] via
Projection and Induction Techniques. Rule based approach is
applicable to the languages which have well defined set of
rules to accommodate most of the words with inflectional
and derivational morphology. Fridah Katushemererwr [1]
demonstrated the application of finite state approach in the
analysis of Runyakitara verb morphology. Language specific
knowledge and insight have been applied to classify and
describe the morphological structure of the language, and
quasi context-free and rewriting rules have been written to
account for grammatical verbs of Runyakitara.
In 2005 Goldwater and McClosky [2] used morphological
analysis of Czech to improve a Czech-English statistical
machine translation system. This system solve data sparse
problem caused by the highly inflected nature of Czech.
Their combine model achieved high BLEU score of
development and test set. Nguyen and Shimazu, [9]
proposed morphological transformational rules and Bayes’
formula based transformational model to translate English to
Vietnamese. The score of their system is better than baseline
score. Kamaijeetkaur Batra and GS Lehal, [6] presented
rule-based machine translation of Noun Phrase from Punjabi
to English. The system use transfer approach. The system
had analysis, translation and synthesis component. In 2004,
Koehn [4] suggested using features of lexical weighting. In
this year, the famous phrase-bassed decoder, Pharaoh, was
released to be a free SMT toolkit by Philipp Koehn and
further updated to Mosses by Koehn et al, 2007. In 2006,
Narayan Kumar Choudhary [10] presented about the
Developing a Computational Framework for the Verb
Morphology of Great Andamanese.
An ideal system for machine translation would take
advantage of both empirical data and linguistic analysis.
Different authors have different objectives that they attempt
to achieve high translation precision on many languages. Our
phrase identification and translation model aims to get
correct translation phrases with very limited bilingual
lexicon for Myanmar to English machine translation.
3. Nature of Myanmar Language
The Myanmar Language is the official language if Myanmar.
It is also the native language of the Myanmar and related
sub- ethnic groups of the Myanmar, as well as that of some
ethnic minorities in Myanmar like the Mon. Myanmar
Language is spoken by 32 million as a first language and as
a second language by 10 million, particularly ethnic
minorities in Myanmar and those in neighboring countries.
Myanmar Language is a tonal and pitch-register, largely
monosyllabic and analytic language, with a Subject Object
Verb (SOV) word order. The language uses the Myanmar
script, derived from the Old Mon Script and ultimately from
the Brahmi script.
The language is classified into two categories. One is formal,
used in literary works, official publications, radio broadcasts,
and formal speeches. The other is colloquial, used in daily
conversation and spoken. This is reflected in the Myanmar
words for “languge”: စာ refers to written, literary language,
and စကား refers to spoken language. Therefore, Myanmar
language can mean either written Myanmar language or
spoken Myanmar Language.
စာအုပ္္ စားပြဲ ေပၚမွာ ရွိိတယ္။ (spoken language)
စာအုပ္္သည္ စားပြဲ ေပၚတြင္ ရွိိသည္။ (formal language)
3.1 Myanmar Sentence Structure
There are two kinds of sentences according to the syntactic
structure of Myanmar language. They are simple sentence
and complex sentence. Figure1: shows the syntactic structure
of Myanmar language.
Figure 1: Syntactic Structure
Paper ID: 06091303 91
International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064
Volume 2 Issue 9, September 2013
www.ijsr.net
3.1.1 Simple Sentence
The simple sentences are declarative, negative, and
interrogative. It contains only one clause. There are two
basic phrases such as subject phrase and verb phrase in a
simple sentence.
For example:
သူ (Subject phrase) အိပ္ေနသည္ (Verb phrase)
However, a simple sentence can be constructed by only one
phrase. This phrase may be verb phrase or noun phrase.
For example:
စားသြားသည္ (Verb phrase)
Besides, a simple sentence can be constructed by two or
three phrases.
For example:
ရန္ကုန္တြင္(Place phrase) ေနသည္ (Verb phrase)
Myanmar phrases can be written in any order as long as the
verb phrase is at the end of the sentence.
For example:
ဦးဘသည္ မနၱေလးမွ ၿပန္လာသည္ (Subject, Place, Verb)
မနၱေလးမွ ဦးဘသည္ ၿပန္လာသည္ (Place, Subject, Verb)
A simple sentence can be extended by placing many other
phrases between subject phrase and verb phrase. All of the
following are simple sentences, because each contains only
one clause. It can be quite long.
For example:
ဦးဘသည္ မနၱေလးမွ ရန္ကုန္သို့ မီးရထားၿဖင့္ ၿပန္လာသည္။
(U Ba comes back from Mandalay to Yangon by
train.)
It is also constructed by adding noun phrases such as subject
phrase, object phrase, time phrase and verb phrase. These
added noun phrases are called emphatic phrases.
For example:
ပါေမာကၡ ဦးဘသည္ သား ေမာင္ေမာင္ႏွင့္အတူ အထက္ မႏၱေလးမွ
ၿမိဳ႕ေတာ္ ရန္ကုန္သို႔ အျမန္ မီးရထားျဖင့္ မေန႔ နံနက္က
ေခ်ာေခ်ာေမာေမာ ျပန္လာသည္။
Professor U Ba and his son Mg Mg came back safely from
upper Mandalay to capital Yangon by express train in
yesterday morning.
3.1.2 Complex Sentence
A complex sentence consists of two or more independent
clauses (or simple sentences) joined by postpositions,
particles or conjunctions. There are at least two verbs or
more than two verbs in a complex sentence. There are two
kinds of clause in a complex sentence called independent
clause(IC) and dependent clause (DC). DC is in front of IC.
A complex sentence contains one independent clause and at
least one dependent clause. DC is the same as IC but it must
contain a clause marker (CM) in the end. A clause maker
may be post positions, particles or conjunctions. There are
three dependent clauses depending on the clause marker.
(1)Noun DC (joined by postpositions such as မွာ၊က၊ကို)
မမ ေစ်းသို႔ သြားသည္ ကို ကၽြန္မ ျမင္သည္။
I see that Ma Ma goes to the market.
Noun DC : မမ ေစ်းသို႔ သြားသည္ ကို
IC : ကၽြန္မ ျမင္သည္။
(2)Adjective DC (joined by particles such as ေသာ ၊ သည္ ့၊
မည့္)
မမ ေပးေသာ စာအုပ္ ကို ကၽြန္မ ဖတ္သည္။
I read the book that is given by Ma Ma.
Adjective DC :မမ ေပးေသာ (စာအုပ္)
IC :စာအုပ္ ကို ကၽြန္မ ဖတ္သည္။
(3)Adverb DC (joined by conjunctions such as ေသာေၾကာင့့္ ၊
လ်က္ ၊ သျဖင့္)
မိုးရြာေန ေသာေၾကာင့္ ကၽြန္မေစ်းသို႔ မသြားပါ။
I do not go to the market because it is raining.
Adverb DC : မိုးရြာေန ေသာေၾကာင့္
IC:ကၽြန္မေစ်းသို႔မသြားပါ။
3.1.3 Negative Sentence
Generally the negative sentence is ending with “ပါ” and its
roots word has prefix “မ” such as “မ…… ပါ”. It also
depends on the tense type and modality. For example:
(i) သူသည္ (Subject Phrase) ေက်ာင္းသို့(Noun Phrase) မသြားပါ။
(Verb Phrase)
He doesn’ t go to school.
(ii) လွလွသည္ (Noun Phrase) ဒီေန႔ လာလိမ့္မည္ မဟုတ္ပါ။
(Verb Phrase)
Hla Hla will not come today?
(iii) စာအုပ္သည္ (Noun Phrase) မထူပါ။ (Verb Phrase)
This book is not thick.
Normally, negative meaning of verb is adding prefix “မ” in
front of the root verb word. But some verbs have non-linear
structure such as “work”. This positive meaning is
“အလုပ္လုပ္”, the negative meaning is “အလုပ္မလုပ္” . In this
case “မ” is placed within the root words.
Paper ID: 06091303 92
International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064
Volume 2 Issue 9, September 2013
www.ijsr.net
3.1.4 Interrogative Sentence
There are two types of questions, yes/no question. Yes/no
questions area mentioned in auxiliary verb. In wh-questions,
the WH feature identifies the class of Phrase which is
signaled by words such as who, what, when, where, why and
how (as in how many, how much, how careful). These words
fall in several different categories, who, whom, and what can
appear as pronouns and can be used to specify simple NPs,
what and which appear as determines in NPs, where and
when appear as prepositional phrases, how acts as an
adverbial modifier to adjective and adverbial phrases and
whose acts as a possessive pronoun. The wh-words also can
act in different roles such as relative clause. In Myanmar
Language question ending format is fixed. The suffix if the
yes/no question is “လား” and wh-question is “လဲ” “နည္း”.
For example:
(i) မင္းဘုရားပြဲကို (Subject phrase) သြားမလား။ (Verb
Phrase)
Will you go to Pagoda festival?
(ii)
မင္းစာေမးပြဲ (Subject Phrase) ေအာင္သလား။ (Verb Phrase)
Do you pass the exam?
(iii) ဤခရီး (Subject Phrase) နီးသလား။ (Verb Phrase)
Is this trip is near?
4. The Proposed System
In Natural Language Processing, some results have already
been obtained, however, a number of important research
problems have not been solved yet. This section explains the
details of Myanmar Verb Phrase identification and
translation process by using Rule-Based Maximum Matching
Approach. This process accepts the segmented Myanmar
words with parts of speech to the system (example
ေမာင္ေမာင္ / NPR /ေက်ာင္း /NCCS/ သို႔ / PODIR
/လ်င္ၿမန္စြာ/ ADVM /ခပ္သုပ္သုပ္ /ADVM/ သြား/ HV/ ေန /
PAVPC/ သည္/ POVP). The longest maximum matching
method scans the input text by sequentially reading each
word from the input text and match the predefined Myanmar
Grammar Rule. To identify the Myanmar Verb Phrase,
firstly extract the root verb in a given sentences and then,
consider the morphological analysis of prefix, suffix and
tense particle of the root verb. Finally, the system translates
the Myanmar Verb Phrase into English words by using
Myanmar-English Bilingual Lexicon. The system’s output is
လ်င္ၿမန္စြာ ခပ္သုတ္သုတ္သြားေနသည္။
Figure 2: Overview of Proposed System
5. Verb Phrase in Myanmar Language
Verb Phrase consists of some adverbial modifiers followed
by the head verb or root verb and its complements. Every
verb must appear in one of the five possible forms: base,
simple present, simple past tense, present participle and past
participle. The auxiliary and modal verbs usually take a verb
phrase as a complement, which produces a sequence of verbs
to form a tense system.
The root form of be and have and the modal auxiliary such
as present and past forms of do(did), can(coruld),
may(might), shall(should),will(would), must, need and dare
are the auxiliary verbs. In this case, “be” and “have” can be
either auxiliary or main verb. These two forms are separate
properties. The auxiliary be requires a present –participle
form or in the case of passive form (past-participle form) of
verb phrase to follow it, whereas the verb be requires a noun
phrase complement or preposition phrase or adjective phrase
or adverb phrase. The auxiliary have requires a noun phrase
complement. English sentences typically contain a sequence
of auxiliary verbs followed by a main verb. Auxiliary verbs
can be used in declarative sentence, negative sentences and
yes/no questions. The structure of Myanmar verb phrase is:
ဦးေဆာင္ၾကိယာ (Root Verb) + ၾကိယာဝိဘတ္ (Verb
Preposition) [3].
Example: သြား သည္။ (go)
Myanmar Verb Phrase can be divided into two types:
I. အေၿခခံၾကိယာပုဒ္ (Basic MyanmarVerb Phrase)
II. တိုးခ်ဲံ့ၾကိယာပုဒ္(Extended Myanmar Verb Phrase)
Paper ID: 06091303 93
International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064
Volume 2 Issue 9, September 2013
www.ijsr.net
5.1 Basic Myanmar Verb Phrase
The basic Verb Phrase consists of a Root Verb and Verb
Preposition. The Root Verb may be either Action or State or
Compound Verb.
Example: ေမာင္ေမာင္ေက်ာင္းသြားသည္။
အေၿခခံၾကိယာပုဒ္ ဦးေဆာင္ၾကိယာ+ၾကိယာ၀ိဘတ္
BV HV +POVP
သြား သည္။ (go)
5.2 Extended Myanmar Verb Phrase
Extended Verb Phrase is based on basic Verb Phrase and it
is extended with verb modifiers. There are four types of
extended Verb Phrase.
တိုးခ်ဲ့ၾကိယာပုဒ္ ၾကိယာအထူးၿပဳ+/- ေလးအနက္ၿ႔ပဳ+/-
အၿငင္း၀ိဘတ္+ ဦးေဆာင္ၾကိယာ+/- ၾကိယာေထာက္
တစ္ခု/တစ္ခုထက္ပုိ + ၾကိယာ၀ိဘတ္ တစ္ခု/တစ္ခုထက္ပို
Extended Myanmar Verb Phrase type (1)
In extended verb phrase type (1), one or more adverbs are
before the head verb and one or more verb prepositions are
after the head verb.
တုိးခ်ဲ့ၾကိယာပုဒ္ ၾကိယာအထူးၿပဳ + ဦးေဆာင္ၾကိယာ+
ၾကိယာ၀ိဘတ္
EVP ADV + HV + POVP
ေခ်ာေခ်ာေမာေမာ ေရာက္ သည္။(arrive at safely)
Extended Myanmar Verb Phrase type (2)
In extended verb phrase type (2), one or more verb particles
and verb propositions are after the head verb.
တုိးခ်ဲ့ၾကိယာပုဒ္ ဦးေဆာင္ၾကိယာ+ၾကိယာေထာက္ပစၥည္း+
ၾကိယာ၀ိဘတ္
EVP HV + PAVP+POVP
စားေနသည္။ ( is eating)
Extended Myanmar Verb Phrase type (3)
According to the extended verb phrase type (3), the negative
particle can be included before the head verb. If verb particle
is after the head verb, the negative particles may be between
the head verb and verb particle. Then, one or more verb
prepositions can be following.
တုိးခ်ဲ့ၾကိယာပုဒ္ အၿငင္း၀ိဘတ္+ ဦးေဆာင္ၾကိယာ+/-
ၾကိယာေထာက္ တစ္ခု/တစ္ခုထက္ပုိ + ၾကိယာ၀ိဘတ္
တစ္ခု/တစ္ခုထက္ပို
EVP PANEG +HV+ PAVPS +/-POVP
မ အိပ္ ခ်င္ေသး ဘူး။ (don’t want to sleep)
Extended Myanmar Verb Phrase type (4)
In extended verb phrase type (4), one or more verb modifiers
are before the head verb and one or more verb preposition
are after the head verb.
တိုးခ်ဲ့ၾကိယာပုဒ္ အေလးအနက္ၿပဳ+ ဦးေဆာင္ၾကိယာ+
ၾကိယာ၀ိဘတ္ တစ္ခု/တစ္ခုထက္ပို
EVP ADVM + HV+ POVP
အရမ္း ေကာင္းသည္။ (is very good)
6. Translation with Morphological Analysis
for Myanmar Verb Phrase
In Myanmar, Verb does not change its form based o the
gender of the subject/object rather it changes with respect to
tense, aspect, modality and number only. Including different
spelling, there are 38 inflected forms of the root verb in
Myanmar. Table 4.1 list the tense suffixes for these different
forms. As stated before, Myanmar Verb morphology has
some non-linear characteristics. Often, the root changes its
form when certain suffixes are added to it based on tenses
and on many occasion, it varies non-linear. For example, the
verb စား (eat) when followed by suffix “ေန သည္” (present
continuous), become “စားေနသည္” , whereas when followed
by the suffix “ခဲ့ သည္” (simple past tense) becomes “ စားခဲ့
သည္” , the suffixes “ၿပီးၿပီ” ( future tense ) becomes
စားၿပီးၿပီ . The negative meaning of prefix “မ” becomes
(does/do not work) “မစားးပါ”, suiffix “ၾက” (plural of subject)
becomes “စားၾကသည္” respectively. Similarly, the verb
“အလုပ္လုပ္” (work) when followed by suffix nay (present
continuous) becomes “အလုပ္လုပ္ေနသည္”. But, the
negative meaning of prefix ma becomes (does/do not work)
“အလုပ္မလုပ္ပါ”. Thus, the addition of the prefix “မ”
changes the root forms of “ အလုပ္လုပ”္ to “အလုပ္မလုပ္” ,
which is an indication of non linearity.
Myanmar verb can be divided into three main categories:
Individual Verb, Compound Verb and Adjective Verb. For
example: individual verb: စားသည္ ‘eat’; compound verb:
ေျပးဖက္သည္ ‘run and hug’; Adjective Verb: ေပ်ာ္သည္ pw-ti
‘is happy’. Some verbs can be used to support other verbs.
For example: ေျပာသည္ ‘tell’ and ေပးသည္ ‘give’ are
individual verbs and can be used as main verbs in sentences.
But in this verb ေျပာေပးသည္ ‘tell’, ေပး ‘give’ is not the main
Verb Phrase
Basic Verb Phrase Extended Verb
Phrase
Extended
verb Phrase
type (1)
Extended
verb Phrase
type (2)
Extended
verb Phrase
type (3)
Extended
verb Phrase
type (4)…
Paper ID: 06091303 94
International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064
Volume 2 Issue 9, September 2013
www.ijsr.net
verb. It behaves particle to support the main verb ေျပာ ‘tell’.
More than two individual English verbs can include in
Myanmar compound verb. For example: three individual
verbs: ၾကြေရာက္ ‘come’, အားေပး ‘encourage’, ခ်ီးျမွင္ ‘award’
include in compound verb : ၾကြေရာက္အားေပးခ်ီးျမွင့္သည္ ‘come
and encourage and award’. ‘ၾကြေရာက္အားေပးခ်ီးျမွင့္သည္’ is
Myanmar Compound verb. It has three English individual
verbs “come, encourage and award”. Verb particle ၾက t can
be omitted in the sentence. For example: ေက်ာင္းသားမ်ား
ကစားေန ၾကသည္။ ‘Students are playing.’ And ေက်ာင္းသားမ်ား
ကစားေန သည္။ ‘Students are playing’. Compound Verbs pose
special problems to the robustness of a translation method,
because the word itself must be represented in the training
data: the occurrence of each of the components is not
enough.
7. Markov Model
Markov Model has been widely used in several of Natural
Language Processing tasks. (such as POS tagging, Spell
Checking, Machine Translation, Automatic Text
Summarization, Information Retrieval (IR), Automatic Text
Extraction and so on. This system developed a Markov
Model to identify Myanmar Verb Phrases based on
predefined Myanmar Grammar Rule-Based Maximum
Matching Approach of totally 200 rules. This model
constructed both Simple and complex sentences of nearly
2000 sentences.
Figure 3: Markov Model for MVP
PANG={မ}
V= {သြား, စား ,အိပ္ ,ခ်က္ၿပဳတ္, ကန္ ္, ခြဲ,..}
C= { ၍,နွင့္..}
POREP= {တြင္, မွာ , နဲ႔,…}
ADV= {လ်င္ၿမန္စြာ, ခင္ခင္မင္မင္,အေၿပးအလြား,… }
PAVP= {ခဲ့ ,လိမ့္, ေန,…}
POVP= {သည္ ,မည္ ,ဘူးလား ,ဘူး ,ပါ ,ဧ။္,..}
8. Algorithm for Myanmar Verb Phrase
Identification
Input: A= {word1, word2… wordn}// Set Segmented words with
Part of Speech (Myanmar Sentence.)
Output: Myanmar Verb Phrase into English Proper English Verb
Phrase //Translate Verb Phrase using Myanmar- English Bilingual
Lexicon.
Begin
Steps:
1. Read input sentence A.
2. Set i =0;
3. Input [i] =A. next token ();//Read input sentence A and
tokenized the words by “/” and set to array [i].
4. For(s=0; s<=i: s++)
4.1 Find VAC, VST or VCP from Input [i].// where VAC is
Act
On verb, VST is State Verb and VCP is Compound Verb.
4.2 If (input[i] = = “VAC” ||input [i] ==“VST” ||input[i] =
=“VCP”) then
k=s; // set k to s
EndIf
End//for
if (input[k+2] == “ POVP” || input[k+4] = =“POVP” )then
{Input[k] = “HV”;
Identify myanmar verb phrase;
}
EndIf
ENDIF
5. DISPLAY MYANMAR VERB PHRASE
END.
9. Experimental Results
The proposed system, there are nearly 2000 training
sentences and 1500 testing sentences. Myanmar 3 font is
used for Myanmar Language. The sentences consist of 5 to
35 words. We divided sentences into simple sentences and
complex sentences. The simple sentences are declarative,
negative and interrogative. Three types of complex sentences
are joined with particles, adjective and adverb respectively.
The accuracy of verb phrase identification is calculated by
using well-known measure precision; recall and F-measure
in equation (1), (2) and (3).This system ignore the words
order. We have a little limitation in some simple and
complex sentences.
POREP
C
PANG
PAVP
POVP
ADV
V
Paper ID: 06091303 95
International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064
Volume 2 Issue 9, September 2013
www.ijsr.net
Table 1: Evaluation Results for Verb Phrase
Identification
0
0.2
0.4
0.6
0.8
1
1.2
% for Simple and complex 
sentences
Simple sentence
Complex sentence
Table 2: Evaluation results for Verb Phrase identification
Type of
sentences
No of
Sentences
Precisio
n
Recal
l
F-
measured
Simple
Sentences
1000 0.97 0.83 0.89
Complex
Sentences
500 0.94 0.66 0.77
9.1 Error Analysis
Errors in proposed system are as follow. Compound verb has
two meaning. သြားခဲ့ and စားခဲ့သည္ and meaning of
သြားစားခဲ့သည္ဲ့ is (went and ate). Although our system can
translate it as သြားသည္: go) and (စားခဲ့သည္ ate), we have
difficulty to translate (သြားစားခဲ့သည္) : went and ate) to get
correct translation. Some verb support to previous verb:
ေၿပာေပးသည္ give), correct translation is “talk”. Beside then
in the negative inflection of verb has more error because
negative particle of Myanmar “မ” can take as prefix or
middle of stem verb such as (“မေၿပာဘူး”:not tell) and
(“”နားမေထာင္ဘူး:not listen). In the latter case
(“နားမေထာင္ဘူး”) is analyzed as “နား” and “မေထာင္ဘူး”
which as (ear and not stand).
In adjective, we have same error like negative verb inflection
like (ရိုေသ respectful) of negative form as (“မရိုေသ”: not
respectful) or (“မရိုမေသ”: not respectful). Although the word
of “မရုိေသ” is not problem in analyzer, the word “မရိုမေသ”
has error occurs.
10. Conclusion
In Natural Language Processing, Phrase identification is one
of the most critical and widely used as research area. Verb
Phrase identification within a sentence is very useful for a
variety of application in Natural Language Processing
(NLP). In this paper, Myanmar Verb Phrase identification
Algorithm is proposed by developing a Markov Model to
show statistical results. In experimental result, a proposed
algorithm shows the efficient results with precision, recall
and F-measure in simple sentences and complex sentences.
As a future work, after identifying the Myanmar Verb
Phrase, translates to English Verb Phrase by using
Myanmar-English Bilingual Lexicon. The design and
algorithm of the Myanmar Verb Phrase identification and
translation system developed in this research can be
extended in further research directions in the fields of NLP
and IR such as text categorization, document summarization,
question answering, query processing and document ranking
in search engine development etc.
References
[1] Fridah Katushemererwr et al, “Finite State Methods in
Morphological Analysis of Runyakitara Verbs” Nordic
Journal of African Studies.
[2] Goldwater Sharon and McClosky David, 2005,
Improving Statistical MT through Morphological
Analysis. Proceedings of Human Language Technology
Conference and Conference on Empirical Methods in
Natural Language Processing, pages 676-683,
Vancouver , October
[3] J.Okeli, a.Allot, “Burmesse/Myanmar dictionary of
grammatical forms”.
[4] Koehn , P.F.J. Och et al, “ Statistical Phrase-Based
Translation”, Processing of the 2007 Joint Conference
on Empirical Methods in Natural Language Processing
and Computational Natural Language Learning, PP868-
876, Pragun.
[5] Kim, Changhyun et al, “Verb Pattern Based Korean-
Chinese Machine Translation System”
[6] KamaijeeKaur Batra and GS Lehal, “Rule-Based
Machine Translation of Noun Phrase from Punjabi to
English”, International Journal of Computer Sciences
Issue, 2010
[7] Md. Musfique Anwaar, Mohammad Aabed Anwar et al
“Syntax Analysis and Machine Translation of Bangla
Sentences”, Dept. of Computer Science & Engineering,
Jahangirnagar University, Bangladesh
[8] M. Selvam et al “ Improvement of Rule-Based
Morphological Analysis and POS Tagging in Tamil
Language via Projection and Induction Techniques."
INTERNATIONAL JOURNAL OF COMPUTERS,
Issue 4, Volume 3, 2009
[9] Nguyen et al, “Improving Phrase-Based SMT with
Morpho-Syntactic Analysis and Transformation.
Proceeding of the conference on Empirical Method in
Natural Language Processing and Very Large Corpora,
University of Maryland, College Park, MD, pp 20-28
[10] N. K. Choudhary, “Developing a Computational
Framework for the Verb Morphology of Great
Andamanese”, Centre for Linguistics in India, JNU,
2006.
[11] Wajid Ali et al, “A hybrid approach to Urdu Verb
Phrase Chunking.”, Department of the Myanmar
Language Commission, Ministry of Education, Union
of Myanmar 2005
Author Profile
Thae Thae Soe received the B.C.Sc. and
M.C.Sc. degrees in University of Computer
Studies, Mandalay, Myanmar in 2004 and 2008,
respectively. I am also an assistance lecturer and
Paper ID: 06091303 96
International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064
Volume 2 Issue 9, September 2013
www.ijsr.net
a Ph.D candidate of University of Computer Studies,
Mandalay. My research field is Natural Language Processing
(NLP). I am very interested in NLP.
Paper ID: 06091303 97

Más contenido relacionado

La actualidad más candente

Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...
Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...
Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...IJECEIAES
 
HINDI AND MARATHI TO ENGLISH MACHINE TRANSLITERATION USING SVM
HINDI AND MARATHI TO ENGLISH MACHINE TRANSLITERATION USING SVMHINDI AND MARATHI TO ENGLISH MACHINE TRANSLITERATION USING SVM
HINDI AND MARATHI TO ENGLISH MACHINE TRANSLITERATION USING SVMijnlc
 
Tamil-English Document Translation Using Statistical Machine Translation Appr...
Tamil-English Document Translation Using Statistical Machine Translation Appr...Tamil-English Document Translation Using Statistical Machine Translation Appr...
Tamil-English Document Translation Using Statistical Machine Translation Appr...baskaran_md
 
Tamil Morphological Analysis
Tamil Morphological AnalysisTamil Morphological Analysis
Tamil Morphological AnalysisKarthik Sankar
 
A Corpus-Based Concatenative Speech Synthesis System for Marathi
A Corpus-Based Concatenative Speech Synthesis System for MarathiA Corpus-Based Concatenative Speech Synthesis System for Marathi
A Corpus-Based Concatenative Speech Synthesis System for Marathiiosrjce
 
Punjabi to Hindi Transliteration System for Proper Nouns Using Hybrid Approach
Punjabi to Hindi Transliteration System for Proper Nouns Using Hybrid ApproachPunjabi to Hindi Transliteration System for Proper Nouns Using Hybrid Approach
Punjabi to Hindi Transliteration System for Proper Nouns Using Hybrid ApproachIJERA Editor
 
Myanmar named entity corpus and its use in syllable-based neural named entity...
Myanmar named entity corpus and its use in syllable-based neural named entity...Myanmar named entity corpus and its use in syllable-based neural named entity...
Myanmar named entity corpus and its use in syllable-based neural named entity...IJECEIAES
 
Survey on Indian CLIR and MT systems in Marathi Language
Survey on Indian CLIR and MT systems in Marathi LanguageSurvey on Indian CLIR and MT systems in Marathi Language
Survey on Indian CLIR and MT systems in Marathi LanguageEditor IJCATR
 
A New Approach to Parts of Speech Tagging in Malayalam
A New Approach to Parts of Speech Tagging in MalayalamA New Approach to Parts of Speech Tagging in Malayalam
A New Approach to Parts of Speech Tagging in Malayalamijcsit
 
Machine Translation Approaches and Design Aspects
Machine Translation Approaches and Design AspectsMachine Translation Approaches and Design Aspects
Machine Translation Approaches and Design AspectsIOSR Journals
 
Cross language information retrieval in indian
Cross language information retrieval in indianCross language information retrieval in indian
Cross language information retrieval in indianeSAT Publishing House
 
Quality estimation of machine translation outputs through stemming
Quality estimation of machine translation outputs through stemmingQuality estimation of machine translation outputs through stemming
Quality estimation of machine translation outputs through stemmingijcsa
 
Hindi –tamil text translation
Hindi –tamil text translationHindi –tamil text translation
Hindi –tamil text translationVaibhav Agarwal
 
S URVEY O N M ACHINE T RANSLITERATION A ND M ACHINE L EARNING M ODELS
S URVEY  O N M ACHINE  T RANSLITERATION A ND  M ACHINE L EARNING M ODELSS URVEY  O N M ACHINE  T RANSLITERATION A ND  M ACHINE L EARNING M ODELS
S URVEY O N M ACHINE T RANSLITERATION A ND M ACHINE L EARNING M ODELSijnlc
 
Grapheme-To-Phoneme Tools for the Marathi Speech Synthesis
Grapheme-To-Phoneme Tools for the Marathi Speech SynthesisGrapheme-To-Phoneme Tools for the Marathi Speech Synthesis
Grapheme-To-Phoneme Tools for the Marathi Speech SynthesisIJERA Editor
 
Natural Language Processing: State of The Art, Current Trends and Challenges
Natural Language Processing: State of The Art, Current Trends and ChallengesNatural Language Processing: State of The Art, Current Trends and Challenges
Natural Language Processing: State of The Art, Current Trends and Challengesantonellarose
 
A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES
A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES
A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES ijnlc
 

La actualidad más candente (19)

Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...
Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...
Improving accuracy of part-of-speech (POS) tagging using hidden markov model ...
 
HINDI AND MARATHI TO ENGLISH MACHINE TRANSLITERATION USING SVM
HINDI AND MARATHI TO ENGLISH MACHINE TRANSLITERATION USING SVMHINDI AND MARATHI TO ENGLISH MACHINE TRANSLITERATION USING SVM
HINDI AND MARATHI TO ENGLISH MACHINE TRANSLITERATION USING SVM
 
Tamil-English Document Translation Using Statistical Machine Translation Appr...
Tamil-English Document Translation Using Statistical Machine Translation Appr...Tamil-English Document Translation Using Statistical Machine Translation Appr...
Tamil-English Document Translation Using Statistical Machine Translation Appr...
 
Tamil Morphological Analysis
Tamil Morphological AnalysisTamil Morphological Analysis
Tamil Morphological Analysis
 
A Corpus-Based Concatenative Speech Synthesis System for Marathi
A Corpus-Based Concatenative Speech Synthesis System for MarathiA Corpus-Based Concatenative Speech Synthesis System for Marathi
A Corpus-Based Concatenative Speech Synthesis System for Marathi
 
Punjabi to Hindi Transliteration System for Proper Nouns Using Hybrid Approach
Punjabi to Hindi Transliteration System for Proper Nouns Using Hybrid ApproachPunjabi to Hindi Transliteration System for Proper Nouns Using Hybrid Approach
Punjabi to Hindi Transliteration System for Proper Nouns Using Hybrid Approach
 
Myanmar named entity corpus and its use in syllable-based neural named entity...
Myanmar named entity corpus and its use in syllable-based neural named entity...Myanmar named entity corpus and its use in syllable-based neural named entity...
Myanmar named entity corpus and its use in syllable-based neural named entity...
 
Survey on Indian CLIR and MT systems in Marathi Language
Survey on Indian CLIR and MT systems in Marathi LanguageSurvey on Indian CLIR and MT systems in Marathi Language
Survey on Indian CLIR and MT systems in Marathi Language
 
A New Approach to Parts of Speech Tagging in Malayalam
A New Approach to Parts of Speech Tagging in MalayalamA New Approach to Parts of Speech Tagging in Malayalam
A New Approach to Parts of Speech Tagging in Malayalam
 
Machine Translation Approaches and Design Aspects
Machine Translation Approaches and Design AspectsMachine Translation Approaches and Design Aspects
Machine Translation Approaches and Design Aspects
 
Cross language information retrieval in indian
Cross language information retrieval in indianCross language information retrieval in indian
Cross language information retrieval in indian
 
Quality estimation of machine translation outputs through stemming
Quality estimation of machine translation outputs through stemmingQuality estimation of machine translation outputs through stemming
Quality estimation of machine translation outputs through stemming
 
**JUNK** (no subject)
**JUNK** (no subject)**JUNK** (no subject)
**JUNK** (no subject)
 
Hindi –tamil text translation
Hindi –tamil text translationHindi –tamil text translation
Hindi –tamil text translation
 
Cf32516518
Cf32516518Cf32516518
Cf32516518
 
S URVEY O N M ACHINE T RANSLITERATION A ND M ACHINE L EARNING M ODELS
S URVEY  O N M ACHINE  T RANSLITERATION A ND  M ACHINE L EARNING M ODELSS URVEY  O N M ACHINE  T RANSLITERATION A ND  M ACHINE L EARNING M ODELS
S URVEY O N M ACHINE T RANSLITERATION A ND M ACHINE L EARNING M ODELS
 
Grapheme-To-Phoneme Tools for the Marathi Speech Synthesis
Grapheme-To-Phoneme Tools for the Marathi Speech SynthesisGrapheme-To-Phoneme Tools for the Marathi Speech Synthesis
Grapheme-To-Phoneme Tools for the Marathi Speech Synthesis
 
Natural Language Processing: State of The Art, Current Trends and Challenges
Natural Language Processing: State of The Art, Current Trends and ChallengesNatural Language Processing: State of The Art, Current Trends and Challenges
Natural Language Processing: State of The Art, Current Trends and Challenges
 
A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES
A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES
A COMPREHENSIVE ANALYSIS OF STEMMERS AVAILABLE FOR INDIC LANGUAGES
 

Destacado

SeemakurtyEtAlHCOMP2010
SeemakurtyEtAlHCOMP2010SeemakurtyEtAlHCOMP2010
SeemakurtyEtAlHCOMP2010Jonathan Chu
 
Knowledge Extraction and Linked Data: Playing with Frames
Knowledge Extraction and Linked Data: Playing with FramesKnowledge Extraction and Linked Data: Playing with Frames
Knowledge Extraction and Linked Data: Playing with FramesValentina Presutti
 
grammaticality, deep & surface structure, and ambiguity
grammaticality, deep & surface structure, and ambiguitygrammaticality, deep & surface structure, and ambiguity
grammaticality, deep & surface structure, and ambiguityDedew Deviarini
 

Destacado (7)

Phrases
PhrasesPhrases
Phrases
 
SeemakurtyEtAlHCOMP2010
SeemakurtyEtAlHCOMP2010SeemakurtyEtAlHCOMP2010
SeemakurtyEtAlHCOMP2010
 
Presentation1
Presentation1Presentation1
Presentation1
 
Knowledge Extraction and Linked Data: Playing with Frames
Knowledge Extraction and Linked Data: Playing with FramesKnowledge Extraction and Linked Data: Playing with Frames
Knowledge Extraction and Linked Data: Playing with Frames
 
KNN
KNN KNN
KNN
 
grammaticality, deep & surface structure, and ambiguity
grammaticality, deep & surface structure, and ambiguitygrammaticality, deep & surface structure, and ambiguity
grammaticality, deep & surface structure, and ambiguity
 
Algorithme knn
Algorithme knnAlgorithme knn
Algorithme knn
 

Similar a Applying Rule-Based Maximum Matching Approach for Verb Phrase Identification and Translation (Myanmar to English)

ATTENTION-BASED SYLLABLE LEVEL NEURAL MACHINE TRANSLATION SYSTEM FOR MYANMAR ...
ATTENTION-BASED SYLLABLE LEVEL NEURAL MACHINE TRANSLATION SYSTEM FOR MYANMAR ...ATTENTION-BASED SYLLABLE LEVEL NEURAL MACHINE TRANSLATION SYSTEM FOR MYANMAR ...
ATTENTION-BASED SYLLABLE LEVEL NEURAL MACHINE TRANSLATION SYSTEM FOR MYANMAR ...ijnlc
 
HANDLING CHALLENGES IN RULE BASED MACHINE TRANSLATION FROM MARATHI TO ENGLISH
HANDLING CHALLENGES IN RULE BASED MACHINE TRANSLATION FROM MARATHI TO ENGLISHHANDLING CHALLENGES IN RULE BASED MACHINE TRANSLATION FROM MARATHI TO ENGLISH
HANDLING CHALLENGES IN RULE BASED MACHINE TRANSLATION FROM MARATHI TO ENGLISHijnlc
 
A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...
A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...
A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...Syeful Islam
 
RBIPA: An Algorithm for Iterative Stemming of Tamil Language Texts
RBIPA: An Algorithm for Iterative Stemming of Tamil Language TextsRBIPA: An Algorithm for Iterative Stemming of Tamil Language Texts
RBIPA: An Algorithm for Iterative Stemming of Tamil Language Textskevig
 
RBIPA: An Algorithm for Iterative Stemming of Tamil Language Texts
RBIPA: An Algorithm for Iterative Stemming of Tamil Language TextsRBIPA: An Algorithm for Iterative Stemming of Tamil Language Texts
RBIPA: An Algorithm for Iterative Stemming of Tamil Language Textskevig
 
IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language Models
IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language ModelsIRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language Models
IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language ModelsIRJET Journal
 
Myanmar news summarization using different word representations
Myanmar news summarization using different word representations Myanmar news summarization using different word representations
Myanmar news summarization using different word representations IJECEIAES
 
Error Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation OutputsError Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation OutputsParisa Niksefat
 
Parsing of Myanmar Sentences With Function Tagging
Parsing of Myanmar Sentences With Function TaggingParsing of Myanmar Sentences With Function Tagging
Parsing of Myanmar Sentences With Function Taggingkevig
 
PARSING OF MYANMAR SENTENCES WITH FUNCTION TAGGING
PARSING OF MYANMAR SENTENCES WITH FUNCTION TAGGINGPARSING OF MYANMAR SENTENCES WITH FUNCTION TAGGING
PARSING OF MYANMAR SENTENCES WITH FUNCTION TAGGINGkevig
 
PARSING OF MYANMAR SENTENCES WITH FUNCTION TAGGING
PARSING OF MYANMAR SENTENCES WITH FUNCTION TAGGINGPARSING OF MYANMAR SENTENCES WITH FUNCTION TAGGING
PARSING OF MYANMAR SENTENCES WITH FUNCTION TAGGINGkevig
 
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...iosrjce
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARijcseit
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARijcseit
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARijcseit
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARijcseit
 
5215ijcseit01
5215ijcseit015215ijcseit01
5215ijcseit01ijcsit
 
A Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to MarathiA Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to Marathiaciijournal
 

Similar a Applying Rule-Based Maximum Matching Approach for Verb Phrase Identification and Translation (Myanmar to English) (20)

ATTENTION-BASED SYLLABLE LEVEL NEURAL MACHINE TRANSLATION SYSTEM FOR MYANMAR ...
ATTENTION-BASED SYLLABLE LEVEL NEURAL MACHINE TRANSLATION SYSTEM FOR MYANMAR ...ATTENTION-BASED SYLLABLE LEVEL NEURAL MACHINE TRANSLATION SYSTEM FOR MYANMAR ...
ATTENTION-BASED SYLLABLE LEVEL NEURAL MACHINE TRANSLATION SYSTEM FOR MYANMAR ...
 
HANDLING CHALLENGES IN RULE BASED MACHINE TRANSLATION FROM MARATHI TO ENGLISH
HANDLING CHALLENGES IN RULE BASED MACHINE TRANSLATION FROM MARATHI TO ENGLISHHANDLING CHALLENGES IN RULE BASED MACHINE TRANSLATION FROM MARATHI TO ENGLISH
HANDLING CHALLENGES IN RULE BASED MACHINE TRANSLATION FROM MARATHI TO ENGLISH
 
A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...
A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...
A New Approach: Automatically Identify Naming Word from Bengali Sentence for ...
 
RBIPA: An Algorithm for Iterative Stemming of Tamil Language Texts
RBIPA: An Algorithm for Iterative Stemming of Tamil Language TextsRBIPA: An Algorithm for Iterative Stemming of Tamil Language Texts
RBIPA: An Algorithm for Iterative Stemming of Tamil Language Texts
 
RBIPA: An Algorithm for Iterative Stemming of Tamil Language Texts
RBIPA: An Algorithm for Iterative Stemming of Tamil Language TextsRBIPA: An Algorithm for Iterative Stemming of Tamil Language Texts
RBIPA: An Algorithm for Iterative Stemming of Tamil Language Texts
 
IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language Models
IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language ModelsIRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language Models
IRJET- Tamil Speech to Indian Sign Language using CMUSphinx Language Models
 
Pxc3898474
Pxc3898474Pxc3898474
Pxc3898474
 
ReseachPaper
ReseachPaperReseachPaper
ReseachPaper
 
Myanmar news summarization using different word representations
Myanmar news summarization using different word representations Myanmar news summarization using different word representations
Myanmar news summarization using different word representations
 
Error Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation OutputsError Analysis of Rule-based Machine Translation Outputs
Error Analysis of Rule-based Machine Translation Outputs
 
Parsing of Myanmar Sentences With Function Tagging
Parsing of Myanmar Sentences With Function TaggingParsing of Myanmar Sentences With Function Tagging
Parsing of Myanmar Sentences With Function Tagging
 
PARSING OF MYANMAR SENTENCES WITH FUNCTION TAGGING
PARSING OF MYANMAR SENTENCES WITH FUNCTION TAGGINGPARSING OF MYANMAR SENTENCES WITH FUNCTION TAGGING
PARSING OF MYANMAR SENTENCES WITH FUNCTION TAGGING
 
PARSING OF MYANMAR SENTENCES WITH FUNCTION TAGGING
PARSING OF MYANMAR SENTENCES WITH FUNCTION TAGGINGPARSING OF MYANMAR SENTENCES WITH FUNCTION TAGGING
PARSING OF MYANMAR SENTENCES WITH FUNCTION TAGGING
 
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
Artificially Generatedof Concatenative Syllable based Text to Speech Synthesi...
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
 
5215ijcseit01
5215ijcseit015215ijcseit01
5215ijcseit01
 
A Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to MarathiA Novel Approach for Rule Based Translation of English to Marathi
A Novel Approach for Rule Based Translation of English to Marathi
 

Más de International Journal of Science and Research (IJSR)

Más de International Journal of Science and Research (IJSR) (20)

Innovations in the Diagnosis and Treatment of Chronic Heart Failure
Innovations in the Diagnosis and Treatment of Chronic Heart FailureInnovations in the Diagnosis and Treatment of Chronic Heart Failure
Innovations in the Diagnosis and Treatment of Chronic Heart Failure
 
Design and implementation of carrier based sinusoidal pwm (bipolar) inverter
Design and implementation of carrier based sinusoidal pwm (bipolar) inverterDesign and implementation of carrier based sinusoidal pwm (bipolar) inverter
Design and implementation of carrier based sinusoidal pwm (bipolar) inverter
 
Polarization effect of antireflection coating for soi material system
Polarization effect of antireflection coating for soi material systemPolarization effect of antireflection coating for soi material system
Polarization effect of antireflection coating for soi material system
 
Image resolution enhancement via multi surface fitting
Image resolution enhancement via multi surface fittingImage resolution enhancement via multi surface fitting
Image resolution enhancement via multi surface fitting
 
Ad hoc networks technical issues on radio links security &amp; qo s
Ad hoc networks technical issues on radio links security &amp; qo sAd hoc networks technical issues on radio links security &amp; qo s
Ad hoc networks technical issues on radio links security &amp; qo s
 
Microstructure analysis of the carbon nano tubes aluminum composite with diff...
Microstructure analysis of the carbon nano tubes aluminum composite with diff...Microstructure analysis of the carbon nano tubes aluminum composite with diff...
Microstructure analysis of the carbon nano tubes aluminum composite with diff...
 
Improving the life of lm13 using stainless spray ii coating for engine applic...
Improving the life of lm13 using stainless spray ii coating for engine applic...Improving the life of lm13 using stainless spray ii coating for engine applic...
Improving the life of lm13 using stainless spray ii coating for engine applic...
 
An overview on development of aluminium metal matrix composites with hybrid r...
An overview on development of aluminium metal matrix composites with hybrid r...An overview on development of aluminium metal matrix composites with hybrid r...
An overview on development of aluminium metal matrix composites with hybrid r...
 
Pesticide mineralization in water using silver nanoparticles incorporated on ...
Pesticide mineralization in water using silver nanoparticles incorporated on ...Pesticide mineralization in water using silver nanoparticles incorporated on ...
Pesticide mineralization in water using silver nanoparticles incorporated on ...
 
Comparative study on computers operated by eyes and brain
Comparative study on computers operated by eyes and brainComparative study on computers operated by eyes and brain
Comparative study on computers operated by eyes and brain
 
T s eliot and the concept of literary tradition and the importance of allusions
T s eliot and the concept of literary tradition and the importance of allusionsT s eliot and the concept of literary tradition and the importance of allusions
T s eliot and the concept of literary tradition and the importance of allusions
 
Effect of select yogasanas and pranayama practices on selected physiological ...
Effect of select yogasanas and pranayama practices on selected physiological ...Effect of select yogasanas and pranayama practices on selected physiological ...
Effect of select yogasanas and pranayama practices on selected physiological ...
 
Grid computing for load balancing strategies
Grid computing for load balancing strategiesGrid computing for load balancing strategies
Grid computing for load balancing strategies
 
A new algorithm to improve the sharing of bandwidth
A new algorithm to improve the sharing of bandwidthA new algorithm to improve the sharing of bandwidth
A new algorithm to improve the sharing of bandwidth
 
Main physical causes of climate change and global warming a general overview
Main physical causes of climate change and global warming   a general overviewMain physical causes of climate change and global warming   a general overview
Main physical causes of climate change and global warming a general overview
 
Performance assessment of control loops
Performance assessment of control loopsPerformance assessment of control loops
Performance assessment of control loops
 
Capital market in bangladesh an overview
Capital market in bangladesh an overviewCapital market in bangladesh an overview
Capital market in bangladesh an overview
 
Faster and resourceful multi core web crawling
Faster and resourceful multi core web crawlingFaster and resourceful multi core web crawling
Faster and resourceful multi core web crawling
 
Extended fuzzy c means clustering algorithm in segmentation of noisy images
Extended fuzzy c means clustering algorithm in segmentation of noisy imagesExtended fuzzy c means clustering algorithm in segmentation of noisy images
Extended fuzzy c means clustering algorithm in segmentation of noisy images
 
Parallel generators of pseudo random numbers with control of calculation errors
Parallel generators of pseudo random numbers with control of calculation errorsParallel generators of pseudo random numbers with control of calculation errors
Parallel generators of pseudo random numbers with control of calculation errors
 

Último

Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesFatimaKhan178732
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991RKavithamani
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application ) Sakshi Ghasle
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdfQucHHunhnh
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactPECB
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...EduSkills OECD
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 

Último (20)

Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Beyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global ImpactBeyond the EU: DORA and NIS 2 Directive's Global Impact
Beyond the EU: DORA and NIS 2 Directive's Global Impact
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 

Applying Rule-Based Maximum Matching Approach for Verb Phrase Identification and Translation (Myanmar to English)

  • 1. International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064 Volume 2 Issue 9, September 2013 www.ijsr.net Applying Rule-Based Maximum Matching Approach for Verb Phrase Identification and Translation (Myanmar to English) Soe, Thae Thae1 , Thida, Aye2 1 University of Computer Studies, Mandalay, Myanmar 2 University of Computer Studies, Mandalay, Myanmar Abstract: Phrase Identification is one of the most critical and widely studied in Natural Language processing (NLP) tasks. Verb Phrase Identification within a sentence is very useful for a variety of application on NLP. One of the core enabling technologies required in NLP applications is a Morphological Analysis. This paper presents the Myanmar Verb Phrase Identification and Translation Algorithm and develops a Markov Model with Morphological Analysis. The system is based on Rule-Based Maximum Matching Approach. In Machine Translation, Large amount of information is needed to guide the translation process. Myanmar Language is inflected language and there are very few creations and researches of Lexicon in Myanmar, comparing to other language such as English, French and Czech etc. Therefore, this system is proposed Myanmar Verb Phrase identification and translation model based on Syntactic Structure and Morphology of Myanmar Language by using Myanmar- English bilingual lexicon. Markov Model is also used to reformulate the translation probability of Phrase pairs. Experiment results showed that proposed system can improve translation quality by applying morphological analysis on Myanmar Language. Keywords: Myanmar verb phrase identification and translation, morphological analysis, Rule-Based Maximum Matching 1. Introduction Language plays an important role in human communication because it is used as a channel not only for expressing thoughts but also for exchanging information. In the age of Information Technology, The Internet has become a primary source for people to exchange their thoughts and information .It is simply and convenience for all people around the word. However, they have difficult to communicate among them because of different their native languages. Some people are familiar with two or more kind of Languages, spoken and written languages but most are not. Due to these difficulties and increased use of network, there is an increased need for language translation to facilitate among people in communication, publication and learning subjects. Attempts of language translation are almost as old as computer themselves. Machine Translation (MT) is the attempt to automate all or part of the process of translation between human languages and is one of the oldest large-scale applications of computer science. Developing a system that accurately produces a good translation between human languages is the goal of MT system. Human Language translation is a difficult task for natural language because there has language ambiguity and it varies according to their features and nature. Myanmar word transformations are similar to other Asian Language including Indian, Japanese, Thai and Chinese Language. The problem of Machine Translation can be view as consisting of three phrases (i) analysis of the source language to choose appropriate target language lexical item (words or phrases) , (ii) reordering phrase where the chosen target language lexical items are reordering to produce a meaningful target language sentence and (iii) disambiguation of words senses where the correct meaning of words is chosen for translation. The Myanmar-English MT system is developed by composing two main modules which are identification and translation. First, module, identify the Myanmar Verb Phrase from input of Myanmar Sentence. And then, second module, translate the Myanmar Verb Phrase into English Verb Phrase using Myanmar English Bilingual Lexicon. Each step in Machine Translation process is hard technical problem, to which the best known solutions are either not adequate, or good enough only in narrow application domains, falling when applied to other domains. The proposed system is concentrated on improving one of these two steps, namely identification and translation, while having in mind that some of the core techniques can be applied to other parts of a Machine Translation (MT). There are many research fields in Natural Language Processing system and Machine Translation System. There is no one who has developed complete Machine Translation System for Myanmar to English language. Therefore, this research aims to emphasize and develop the identification and translation of Myanmar verb phrase which is a part of Myanmar-English Machine Translation System. Paper ID: 06091303 90
  • 2. International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064 Volume 2 Issue 9, September 2013 www.ijsr.net The rest of this paper is organized as follows: In section 2, previous works in phrase identification for machine translation is presented. Section 3, presents Nature of Myanmar Language and Myanmar sentence structure. The proposed system is presented in section 4. Section 5 presented types of Myanmar Verb Phrase and section 6 described morphological analysis for Myanmar Verb Phrase. Finally section 7, 8, 9 and .10 discusses about the results and error analysis of proposed system and conclusion. 2. Related Work In this section, previous works in the structure of verb Phrase identification and machine translation on different language are reviewed. Various researchers have improved the quality of machine translation system by using different methods on different language. Wajid Ali et al proposed the structure of Urdu verb phrases, and detail a series of experiment to automatically tag them. A 100,000 words Urdu corpus is manually tagged with VP chunk tags. The corpus is then used to develop a hybrid approach using HMM based statistical chunking and correction rules [11]. The technique is enhanced by changing chunking direction and merging chunk and POS tags. . Kim, Changhyun et.al described [5] Korean-Chinese machine translation system. This system includes source language pattern part for analysis and a target language pattern part for generation. Basically used Pattern-based knowledge and translates Korean Verb Phrase into Chinese Verb Phrase. M. Selvame et al presented an improvement of Rule Based morphological Analysis and POS Tagging in Tamil Language [8] via Projection and Induction Techniques. Rule based approach is applicable to the languages which have well defined set of rules to accommodate most of the words with inflectional and derivational morphology. Fridah Katushemererwr [1] demonstrated the application of finite state approach in the analysis of Runyakitara verb morphology. Language specific knowledge and insight have been applied to classify and describe the morphological structure of the language, and quasi context-free and rewriting rules have been written to account for grammatical verbs of Runyakitara. In 2005 Goldwater and McClosky [2] used morphological analysis of Czech to improve a Czech-English statistical machine translation system. This system solve data sparse problem caused by the highly inflected nature of Czech. Their combine model achieved high BLEU score of development and test set. Nguyen and Shimazu, [9] proposed morphological transformational rules and Bayes’ formula based transformational model to translate English to Vietnamese. The score of their system is better than baseline score. Kamaijeetkaur Batra and GS Lehal, [6] presented rule-based machine translation of Noun Phrase from Punjabi to English. The system use transfer approach. The system had analysis, translation and synthesis component. In 2004, Koehn [4] suggested using features of lexical weighting. In this year, the famous phrase-bassed decoder, Pharaoh, was released to be a free SMT toolkit by Philipp Koehn and further updated to Mosses by Koehn et al, 2007. In 2006, Narayan Kumar Choudhary [10] presented about the Developing a Computational Framework for the Verb Morphology of Great Andamanese. An ideal system for machine translation would take advantage of both empirical data and linguistic analysis. Different authors have different objectives that they attempt to achieve high translation precision on many languages. Our phrase identification and translation model aims to get correct translation phrases with very limited bilingual lexicon for Myanmar to English machine translation. 3. Nature of Myanmar Language The Myanmar Language is the official language if Myanmar. It is also the native language of the Myanmar and related sub- ethnic groups of the Myanmar, as well as that of some ethnic minorities in Myanmar like the Mon. Myanmar Language is spoken by 32 million as a first language and as a second language by 10 million, particularly ethnic minorities in Myanmar and those in neighboring countries. Myanmar Language is a tonal and pitch-register, largely monosyllabic and analytic language, with a Subject Object Verb (SOV) word order. The language uses the Myanmar script, derived from the Old Mon Script and ultimately from the Brahmi script. The language is classified into two categories. One is formal, used in literary works, official publications, radio broadcasts, and formal speeches. The other is colloquial, used in daily conversation and spoken. This is reflected in the Myanmar words for “languge”: စာ refers to written, literary language, and စကား refers to spoken language. Therefore, Myanmar language can mean either written Myanmar language or spoken Myanmar Language. စာအုပ္္ စားပြဲ ေပၚမွာ ရွိိတယ္။ (spoken language) စာအုပ္္သည္ စားပြဲ ေပၚတြင္ ရွိိသည္။ (formal language) 3.1 Myanmar Sentence Structure There are two kinds of sentences according to the syntactic structure of Myanmar language. They are simple sentence and complex sentence. Figure1: shows the syntactic structure of Myanmar language. Figure 1: Syntactic Structure Paper ID: 06091303 91
  • 3. International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064 Volume 2 Issue 9, September 2013 www.ijsr.net 3.1.1 Simple Sentence The simple sentences are declarative, negative, and interrogative. It contains only one clause. There are two basic phrases such as subject phrase and verb phrase in a simple sentence. For example: သူ (Subject phrase) အိပ္ေနသည္ (Verb phrase) However, a simple sentence can be constructed by only one phrase. This phrase may be verb phrase or noun phrase. For example: စားသြားသည္ (Verb phrase) Besides, a simple sentence can be constructed by two or three phrases. For example: ရန္ကုန္တြင္(Place phrase) ေနသည္ (Verb phrase) Myanmar phrases can be written in any order as long as the verb phrase is at the end of the sentence. For example: ဦးဘသည္ မနၱေလးမွ ၿပန္လာသည္ (Subject, Place, Verb) မနၱေလးမွ ဦးဘသည္ ၿပန္လာသည္ (Place, Subject, Verb) A simple sentence can be extended by placing many other phrases between subject phrase and verb phrase. All of the following are simple sentences, because each contains only one clause. It can be quite long. For example: ဦးဘသည္ မနၱေလးမွ ရန္ကုန္သို့ မီးရထားၿဖင့္ ၿပန္လာသည္။ (U Ba comes back from Mandalay to Yangon by train.) It is also constructed by adding noun phrases such as subject phrase, object phrase, time phrase and verb phrase. These added noun phrases are called emphatic phrases. For example: ပါေမာကၡ ဦးဘသည္ သား ေမာင္ေမာင္ႏွင့္အတူ အထက္ မႏၱေလးမွ ၿမိဳ႕ေတာ္ ရန္ကုန္သို႔ အျမန္ မီးရထားျဖင့္ မေန႔ နံနက္က ေခ်ာေခ်ာေမာေမာ ျပန္လာသည္။ Professor U Ba and his son Mg Mg came back safely from upper Mandalay to capital Yangon by express train in yesterday morning. 3.1.2 Complex Sentence A complex sentence consists of two or more independent clauses (or simple sentences) joined by postpositions, particles or conjunctions. There are at least two verbs or more than two verbs in a complex sentence. There are two kinds of clause in a complex sentence called independent clause(IC) and dependent clause (DC). DC is in front of IC. A complex sentence contains one independent clause and at least one dependent clause. DC is the same as IC but it must contain a clause marker (CM) in the end. A clause maker may be post positions, particles or conjunctions. There are three dependent clauses depending on the clause marker. (1)Noun DC (joined by postpositions such as မွာ၊က၊ကို) မမ ေစ်းသို႔ သြားသည္ ကို ကၽြန္မ ျမင္သည္။ I see that Ma Ma goes to the market. Noun DC : မမ ေစ်းသို႔ သြားသည္ ကို IC : ကၽြန္မ ျမင္သည္။ (2)Adjective DC (joined by particles such as ေသာ ၊ သည္ ့၊ မည့္) မမ ေပးေသာ စာအုပ္ ကို ကၽြန္မ ဖတ္သည္။ I read the book that is given by Ma Ma. Adjective DC :မမ ေပးေသာ (စာအုပ္) IC :စာအုပ္ ကို ကၽြန္မ ဖတ္သည္။ (3)Adverb DC (joined by conjunctions such as ေသာေၾကာင့့္ ၊ လ်က္ ၊ သျဖင့္) မိုးရြာေန ေသာေၾကာင့္ ကၽြန္မေစ်းသို႔ မသြားပါ။ I do not go to the market because it is raining. Adverb DC : မိုးရြာေန ေသာေၾကာင့္ IC:ကၽြန္မေစ်းသို႔မသြားပါ။ 3.1.3 Negative Sentence Generally the negative sentence is ending with “ပါ” and its roots word has prefix “မ” such as “မ…… ပါ”. It also depends on the tense type and modality. For example: (i) သူသည္ (Subject Phrase) ေက်ာင္းသို့(Noun Phrase) မသြားပါ။ (Verb Phrase) He doesn’ t go to school. (ii) လွလွသည္ (Noun Phrase) ဒီေန႔ လာလိမ့္မည္ မဟုတ္ပါ။ (Verb Phrase) Hla Hla will not come today? (iii) စာအုပ္သည္ (Noun Phrase) မထူပါ။ (Verb Phrase) This book is not thick. Normally, negative meaning of verb is adding prefix “မ” in front of the root verb word. But some verbs have non-linear structure such as “work”. This positive meaning is “အလုပ္လုပ္”, the negative meaning is “အလုပ္မလုပ္” . In this case “မ” is placed within the root words. Paper ID: 06091303 92
  • 4. International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064 Volume 2 Issue 9, September 2013 www.ijsr.net 3.1.4 Interrogative Sentence There are two types of questions, yes/no question. Yes/no questions area mentioned in auxiliary verb. In wh-questions, the WH feature identifies the class of Phrase which is signaled by words such as who, what, when, where, why and how (as in how many, how much, how careful). These words fall in several different categories, who, whom, and what can appear as pronouns and can be used to specify simple NPs, what and which appear as determines in NPs, where and when appear as prepositional phrases, how acts as an adverbial modifier to adjective and adverbial phrases and whose acts as a possessive pronoun. The wh-words also can act in different roles such as relative clause. In Myanmar Language question ending format is fixed. The suffix if the yes/no question is “လား” and wh-question is “လဲ” “နည္း”. For example: (i) မင္းဘုရားပြဲကို (Subject phrase) သြားမလား။ (Verb Phrase) Will you go to Pagoda festival? (ii) မင္းစာေမးပြဲ (Subject Phrase) ေအာင္သလား။ (Verb Phrase) Do you pass the exam? (iii) ဤခရီး (Subject Phrase) နီးသလား။ (Verb Phrase) Is this trip is near? 4. The Proposed System In Natural Language Processing, some results have already been obtained, however, a number of important research problems have not been solved yet. This section explains the details of Myanmar Verb Phrase identification and translation process by using Rule-Based Maximum Matching Approach. This process accepts the segmented Myanmar words with parts of speech to the system (example ေမာင္ေမာင္ / NPR /ေက်ာင္း /NCCS/ သို႔ / PODIR /လ်င္ၿမန္စြာ/ ADVM /ခပ္သုပ္သုပ္ /ADVM/ သြား/ HV/ ေန / PAVPC/ သည္/ POVP). The longest maximum matching method scans the input text by sequentially reading each word from the input text and match the predefined Myanmar Grammar Rule. To identify the Myanmar Verb Phrase, firstly extract the root verb in a given sentences and then, consider the morphological analysis of prefix, suffix and tense particle of the root verb. Finally, the system translates the Myanmar Verb Phrase into English words by using Myanmar-English Bilingual Lexicon. The system’s output is လ်င္ၿမန္စြာ ခပ္သုတ္သုတ္သြားေနသည္။ Figure 2: Overview of Proposed System 5. Verb Phrase in Myanmar Language Verb Phrase consists of some adverbial modifiers followed by the head verb or root verb and its complements. Every verb must appear in one of the five possible forms: base, simple present, simple past tense, present participle and past participle. The auxiliary and modal verbs usually take a verb phrase as a complement, which produces a sequence of verbs to form a tense system. The root form of be and have and the modal auxiliary such as present and past forms of do(did), can(coruld), may(might), shall(should),will(would), must, need and dare are the auxiliary verbs. In this case, “be” and “have” can be either auxiliary or main verb. These two forms are separate properties. The auxiliary be requires a present –participle form or in the case of passive form (past-participle form) of verb phrase to follow it, whereas the verb be requires a noun phrase complement or preposition phrase or adjective phrase or adverb phrase. The auxiliary have requires a noun phrase complement. English sentences typically contain a sequence of auxiliary verbs followed by a main verb. Auxiliary verbs can be used in declarative sentence, negative sentences and yes/no questions. The structure of Myanmar verb phrase is: ဦးေဆာင္ၾကိယာ (Root Verb) + ၾကိယာဝိဘတ္ (Verb Preposition) [3]. Example: သြား သည္။ (go) Myanmar Verb Phrase can be divided into two types: I. အေၿခခံၾကိယာပုဒ္ (Basic MyanmarVerb Phrase) II. တိုးခ်ဲံ့ၾကိယာပုဒ္(Extended Myanmar Verb Phrase) Paper ID: 06091303 93
  • 5. International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064 Volume 2 Issue 9, September 2013 www.ijsr.net 5.1 Basic Myanmar Verb Phrase The basic Verb Phrase consists of a Root Verb and Verb Preposition. The Root Verb may be either Action or State or Compound Verb. Example: ေမာင္ေမာင္ေက်ာင္းသြားသည္။ အေၿခခံၾကိယာပုဒ္ ဦးေဆာင္ၾကိယာ+ၾကိယာ၀ိဘတ္ BV HV +POVP သြား သည္။ (go) 5.2 Extended Myanmar Verb Phrase Extended Verb Phrase is based on basic Verb Phrase and it is extended with verb modifiers. There are four types of extended Verb Phrase. တိုးခ်ဲ့ၾကိယာပုဒ္ ၾကိယာအထူးၿပဳ+/- ေလးအနက္ၿ႔ပဳ+/- အၿငင္း၀ိဘတ္+ ဦးေဆာင္ၾကိယာ+/- ၾကိယာေထာက္ တစ္ခု/တစ္ခုထက္ပုိ + ၾကိယာ၀ိဘတ္ တစ္ခု/တစ္ခုထက္ပို Extended Myanmar Verb Phrase type (1) In extended verb phrase type (1), one or more adverbs are before the head verb and one or more verb prepositions are after the head verb. တုိးခ်ဲ့ၾကိယာပုဒ္ ၾကိယာအထူးၿပဳ + ဦးေဆာင္ၾကိယာ+ ၾကိယာ၀ိဘတ္ EVP ADV + HV + POVP ေခ်ာေခ်ာေမာေမာ ေရာက္ သည္။(arrive at safely) Extended Myanmar Verb Phrase type (2) In extended verb phrase type (2), one or more verb particles and verb propositions are after the head verb. တုိးခ်ဲ့ၾကိယာပုဒ္ ဦးေဆာင္ၾကိယာ+ၾကိယာေထာက္ပစၥည္း+ ၾကိယာ၀ိဘတ္ EVP HV + PAVP+POVP စားေနသည္။ ( is eating) Extended Myanmar Verb Phrase type (3) According to the extended verb phrase type (3), the negative particle can be included before the head verb. If verb particle is after the head verb, the negative particles may be between the head verb and verb particle. Then, one or more verb prepositions can be following. တုိးခ်ဲ့ၾကိယာပုဒ္ အၿငင္း၀ိဘတ္+ ဦးေဆာင္ၾကိယာ+/- ၾကိယာေထာက္ တစ္ခု/တစ္ခုထက္ပုိ + ၾကိယာ၀ိဘတ္ တစ္ခု/တစ္ခုထက္ပို EVP PANEG +HV+ PAVPS +/-POVP မ အိပ္ ခ်င္ေသး ဘူး။ (don’t want to sleep) Extended Myanmar Verb Phrase type (4) In extended verb phrase type (4), one or more verb modifiers are before the head verb and one or more verb preposition are after the head verb. တိုးခ်ဲ့ၾကိယာပုဒ္ အေလးအနက္ၿပဳ+ ဦးေဆာင္ၾကိယာ+ ၾကိယာ၀ိဘတ္ တစ္ခု/တစ္ခုထက္ပို EVP ADVM + HV+ POVP အရမ္း ေကာင္းသည္။ (is very good) 6. Translation with Morphological Analysis for Myanmar Verb Phrase In Myanmar, Verb does not change its form based o the gender of the subject/object rather it changes with respect to tense, aspect, modality and number only. Including different spelling, there are 38 inflected forms of the root verb in Myanmar. Table 4.1 list the tense suffixes for these different forms. As stated before, Myanmar Verb morphology has some non-linear characteristics. Often, the root changes its form when certain suffixes are added to it based on tenses and on many occasion, it varies non-linear. For example, the verb စား (eat) when followed by suffix “ေန သည္” (present continuous), become “စားေနသည္” , whereas when followed by the suffix “ခဲ့ သည္” (simple past tense) becomes “ စားခဲ့ သည္” , the suffixes “ၿပီးၿပီ” ( future tense ) becomes စားၿပီးၿပီ . The negative meaning of prefix “မ” becomes (does/do not work) “မစားးပါ”, suiffix “ၾက” (plural of subject) becomes “စားၾကသည္” respectively. Similarly, the verb “အလုပ္လုပ္” (work) when followed by suffix nay (present continuous) becomes “အလုပ္လုပ္ေနသည္”. But, the negative meaning of prefix ma becomes (does/do not work) “အလုပ္မလုပ္ပါ”. Thus, the addition of the prefix “မ” changes the root forms of “ အလုပ္လုပ”္ to “အလုပ္မလုပ္” , which is an indication of non linearity. Myanmar verb can be divided into three main categories: Individual Verb, Compound Verb and Adjective Verb. For example: individual verb: စားသည္ ‘eat’; compound verb: ေျပးဖက္သည္ ‘run and hug’; Adjective Verb: ေပ်ာ္သည္ pw-ti ‘is happy’. Some verbs can be used to support other verbs. For example: ေျပာသည္ ‘tell’ and ေပးသည္ ‘give’ are individual verbs and can be used as main verbs in sentences. But in this verb ေျပာေပးသည္ ‘tell’, ေပး ‘give’ is not the main Verb Phrase Basic Verb Phrase Extended Verb Phrase Extended verb Phrase type (1) Extended verb Phrase type (2) Extended verb Phrase type (3) Extended verb Phrase type (4)… Paper ID: 06091303 94
  • 6. International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064 Volume 2 Issue 9, September 2013 www.ijsr.net verb. It behaves particle to support the main verb ေျပာ ‘tell’. More than two individual English verbs can include in Myanmar compound verb. For example: three individual verbs: ၾကြေရာက္ ‘come’, အားေပး ‘encourage’, ခ်ီးျမွင္ ‘award’ include in compound verb : ၾကြေရာက္အားေပးခ်ီးျမွင့္သည္ ‘come and encourage and award’. ‘ၾကြေရာက္အားေပးခ်ီးျမွင့္သည္’ is Myanmar Compound verb. It has three English individual verbs “come, encourage and award”. Verb particle ၾက t can be omitted in the sentence. For example: ေက်ာင္းသားမ်ား ကစားေန ၾကသည္။ ‘Students are playing.’ And ေက်ာင္းသားမ်ား ကစားေန သည္။ ‘Students are playing’. Compound Verbs pose special problems to the robustness of a translation method, because the word itself must be represented in the training data: the occurrence of each of the components is not enough. 7. Markov Model Markov Model has been widely used in several of Natural Language Processing tasks. (such as POS tagging, Spell Checking, Machine Translation, Automatic Text Summarization, Information Retrieval (IR), Automatic Text Extraction and so on. This system developed a Markov Model to identify Myanmar Verb Phrases based on predefined Myanmar Grammar Rule-Based Maximum Matching Approach of totally 200 rules. This model constructed both Simple and complex sentences of nearly 2000 sentences. Figure 3: Markov Model for MVP PANG={မ} V= {သြား, စား ,အိပ္ ,ခ်က္ၿပဳတ္, ကန္ ္, ခြဲ,..} C= { ၍,နွင့္..} POREP= {တြင္, မွာ , နဲ႔,…} ADV= {လ်င္ၿမန္စြာ, ခင္ခင္မင္မင္,အေၿပးအလြား,… } PAVP= {ခဲ့ ,လိမ့္, ေန,…} POVP= {သည္ ,မည္ ,ဘူးလား ,ဘူး ,ပါ ,ဧ။္,..} 8. Algorithm for Myanmar Verb Phrase Identification Input: A= {word1, word2… wordn}// Set Segmented words with Part of Speech (Myanmar Sentence.) Output: Myanmar Verb Phrase into English Proper English Verb Phrase //Translate Verb Phrase using Myanmar- English Bilingual Lexicon. Begin Steps: 1. Read input sentence A. 2. Set i =0; 3. Input [i] =A. next token ();//Read input sentence A and tokenized the words by “/” and set to array [i]. 4. For(s=0; s<=i: s++) 4.1 Find VAC, VST or VCP from Input [i].// where VAC is Act On verb, VST is State Verb and VCP is Compound Verb. 4.2 If (input[i] = = “VAC” ||input [i] ==“VST” ||input[i] = =“VCP”) then k=s; // set k to s EndIf End//for if (input[k+2] == “ POVP” || input[k+4] = =“POVP” )then {Input[k] = “HV”; Identify myanmar verb phrase; } EndIf ENDIF 5. DISPLAY MYANMAR VERB PHRASE END. 9. Experimental Results The proposed system, there are nearly 2000 training sentences and 1500 testing sentences. Myanmar 3 font is used for Myanmar Language. The sentences consist of 5 to 35 words. We divided sentences into simple sentences and complex sentences. The simple sentences are declarative, negative and interrogative. Three types of complex sentences are joined with particles, adjective and adverb respectively. The accuracy of verb phrase identification is calculated by using well-known measure precision; recall and F-measure in equation (1), (2) and (3).This system ignore the words order. We have a little limitation in some simple and complex sentences. POREP C PANG PAVP POVP ADV V Paper ID: 06091303 95
  • 7. International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064 Volume 2 Issue 9, September 2013 www.ijsr.net Table 1: Evaluation Results for Verb Phrase Identification 0 0.2 0.4 0.6 0.8 1 1.2 % for Simple and complex  sentences Simple sentence Complex sentence Table 2: Evaluation results for Verb Phrase identification Type of sentences No of Sentences Precisio n Recal l F- measured Simple Sentences 1000 0.97 0.83 0.89 Complex Sentences 500 0.94 0.66 0.77 9.1 Error Analysis Errors in proposed system are as follow. Compound verb has two meaning. သြားခဲ့ and စားခဲ့သည္ and meaning of သြားစားခဲ့သည္ဲ့ is (went and ate). Although our system can translate it as သြားသည္: go) and (စားခဲ့သည္ ate), we have difficulty to translate (သြားစားခဲ့သည္) : went and ate) to get correct translation. Some verb support to previous verb: ေၿပာေပးသည္ give), correct translation is “talk”. Beside then in the negative inflection of verb has more error because negative particle of Myanmar “မ” can take as prefix or middle of stem verb such as (“မေၿပာဘူး”:not tell) and (“”နားမေထာင္ဘူး:not listen). In the latter case (“နားမေထာင္ဘူး”) is analyzed as “နား” and “မေထာင္ဘူး” which as (ear and not stand). In adjective, we have same error like negative verb inflection like (ရိုေသ respectful) of negative form as (“မရိုေသ”: not respectful) or (“မရိုမေသ”: not respectful). Although the word of “မရုိေသ” is not problem in analyzer, the word “မရိုမေသ” has error occurs. 10. Conclusion In Natural Language Processing, Phrase identification is one of the most critical and widely used as research area. Verb Phrase identification within a sentence is very useful for a variety of application in Natural Language Processing (NLP). In this paper, Myanmar Verb Phrase identification Algorithm is proposed by developing a Markov Model to show statistical results. In experimental result, a proposed algorithm shows the efficient results with precision, recall and F-measure in simple sentences and complex sentences. As a future work, after identifying the Myanmar Verb Phrase, translates to English Verb Phrase by using Myanmar-English Bilingual Lexicon. The design and algorithm of the Myanmar Verb Phrase identification and translation system developed in this research can be extended in further research directions in the fields of NLP and IR such as text categorization, document summarization, question answering, query processing and document ranking in search engine development etc. References [1] Fridah Katushemererwr et al, “Finite State Methods in Morphological Analysis of Runyakitara Verbs” Nordic Journal of African Studies. [2] Goldwater Sharon and McClosky David, 2005, Improving Statistical MT through Morphological Analysis. Proceedings of Human Language Technology Conference and Conference on Empirical Methods in Natural Language Processing, pages 676-683, Vancouver , October [3] J.Okeli, a.Allot, “Burmesse/Myanmar dictionary of grammatical forms”. [4] Koehn , P.F.J. Och et al, “ Statistical Phrase-Based Translation”, Processing of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, PP868- 876, Pragun. [5] Kim, Changhyun et al, “Verb Pattern Based Korean- Chinese Machine Translation System” [6] KamaijeeKaur Batra and GS Lehal, “Rule-Based Machine Translation of Noun Phrase from Punjabi to English”, International Journal of Computer Sciences Issue, 2010 [7] Md. Musfique Anwaar, Mohammad Aabed Anwar et al “Syntax Analysis and Machine Translation of Bangla Sentences”, Dept. of Computer Science & Engineering, Jahangirnagar University, Bangladesh [8] M. Selvam et al “ Improvement of Rule-Based Morphological Analysis and POS Tagging in Tamil Language via Projection and Induction Techniques." INTERNATIONAL JOURNAL OF COMPUTERS, Issue 4, Volume 3, 2009 [9] Nguyen et al, “Improving Phrase-Based SMT with Morpho-Syntactic Analysis and Transformation. Proceeding of the conference on Empirical Method in Natural Language Processing and Very Large Corpora, University of Maryland, College Park, MD, pp 20-28 [10] N. K. Choudhary, “Developing a Computational Framework for the Verb Morphology of Great Andamanese”, Centre for Linguistics in India, JNU, 2006. [11] Wajid Ali et al, “A hybrid approach to Urdu Verb Phrase Chunking.”, Department of the Myanmar Language Commission, Ministry of Education, Union of Myanmar 2005 Author Profile Thae Thae Soe received the B.C.Sc. and M.C.Sc. degrees in University of Computer Studies, Mandalay, Myanmar in 2004 and 2008, respectively. I am also an assistance lecturer and Paper ID: 06091303 96
  • 8. International Journal of Science and Research (IJSR), India Online ISSN: 2319-7064 Volume 2 Issue 9, September 2013 www.ijsr.net a Ph.D candidate of University of Computer Studies, Mandalay. My research field is Natural Language Processing (NLP). I am very interested in NLP. Paper ID: 06091303 97