SlideShare una empresa de Scribd logo
1 de 33
Machine Translation withStatisticalApproach 1
Whatis the machine translation??? Machine translation is the study of designingsystemsthat translate from one humanlanguage in to another. Machine translation system essentiallytakes a text in one language (called the source language), and translate itintoanotherlanguage(calledtargetlanguage). The source  and targetlanguage are naturallanguagessuch as english and hindi. 2
Contd……..	 This is the hard problem, sinceprocessingnaturallanguagerequiresworkatseverallevles, and complexities and ambiguitiesariesateach of thoselevles. Hence an MT system canbesaid to bedoingnaturallanguageprocessing(NLP).In fact,most machine translation application requiressomedegree of naturallanguageunderstanding to do the translation. 3
History of Machine Translation  Machine translation as a discipline dates back to the earlynineteen-fifties. The complexity of the problemwasoriginallyunderestimated, and someearlysuccessfuldemonstrations of experimental system lead to unrealisticexpectionswhichwere hard to fulfil. In the early eighties, the JapaneseFifthGenerationComputing Project revivedinterest in thiswork. The currentapproach to MT is more pragmatic and realistic. 4
Contd…. It isnowwidelyacceptedthatfullyautomatic, general-purpose , highquality machine translation is a verydifficultproblem, but veryuseful and pratical system canneverthelessbedeveloped by realxing one or more of thesecriteria,andseveralusefulsystems have been built by doingso,and are in use today. Suchsystems are beingused to translate public announcements,weather bulletins, technical documents, and web pages. 5
Contd.. Some machine translation services are starting to becomeavailable on the world wide web. For example,the web page of the Google searchenginealsoprovides a translation service thatcan translate simple sentences among a handful of languages. 6
Translation telephonetechnology(speech to speech translation) The ‘Janus’ projectat the Interactive System lab, Carnegie Mellon University, Is working on set of translation project. You dial yourcolleague in tokyo. You do not speakJapanese, and hedoes not speakenglish.Soyouneed system suchthatyouspeakinto the phone in english, whichautomaticallygets translate intojapanese for him, he replies in japanese, and youhearit in english. 7
Research MT System Example:thejanustranlsating Phone project This prototype system allowstwousers to communicate in a givendomain via a videoconferencingconnection. Each party sees the other conversant, hearshis/herorginalvoicesees/hears translation of whathe/shesays as subtitles, caption and synthetic speech. The situation iscooperative, That isbothuserswant to understandeachother and collaborate via the system to achieveunderstanding. 8
Contd…. After the record buttonisactivated, the station acceptsspoken input and produces a paraphrase of the input sentence first. Once the user has verifiedthat the system properlyunderstood the intendedmeaning, he/sheactivate the sendbutton to send a translation of thisintendedmeaning to the otherside in the desiredlanguage. Various interactive correction mechanismsfacilitate quick recovery, should possible processingerros and miscommunication have altered the intendedmeaning. 9
Machine Translation & Artificial Intelligence MT is an important sub-discipline of the widerfield of Artificial Intelligence(AI). AI(amongotherthings)deals withgetting machine to exhibit intelligent behaviour. As wemightimagine,both AI and MT are interesting and challengingfields. 10
        Component of MT Wecandivide the machine translation taskintothree main phases:- The system has to first analyse the source language input to createsomeinternalrepresentetion. It thentypicallymanipulatesthisinternalrepresentationtotransferit to a formsuitable for a targetlanguage. Finally,itgenerates the output in the targetlanguage. 11
Analysis Transfer Generation Source Language  Target Language  Intermediate Representation based on source language Intermediate Representation based on target language 12
Contd… A typical MT system contains components for analysis ,transfer and generation as shown in diagram. These components incorporate a lot of knowledge about words(Lexical Knowledge), and about the language (LinguisticKnowledge). Suchknowledgeisstored in one or more lexicons ,and possiblyother sources of linguisticknowledge ,such as grammar.   13
Contd… The user interface isinvariably a crucial part of most MT system. The interface allows user to verify,disambiguate and if necessary correct the output of the system. Anothercommonfeature of NLP workis use of large ‘corpora’. A corpus is a large collection of textwhichisused for acquiring the required lexical and linguisticknowledge.  14
Contd… Somesystemsprefer to split the lexiconinto a source lexicon, a targetlexicon,and a transferlexiconthatmapsbetween the two. An MT lexicontypicallyneeds to bemuch more formal,precise and elaboratethan a typicalhumandictionary,sinceitismeant for mechanicalprocessing,and not for reading by humans. The lexiconplays a central role in modern MT system. 15
Lexicon The lexiconis an important component of MT system. A lexiconcontains all the relevant information about words and phrases thatisrequired for the variouslevels of analysis and generation. A typicallexicon entry for a wordwouldcontain the following information about the word:the part of speech,information about the equivalentword in the targetlanguage.  16
Approaches to MT Based on how closely the internalrepresentationdepends on the source and targetlanguages,approaches to MT canbedividedintothree major classes-  Direct. Transfer-based. Inter-lingual.   17
A direct MT system tries to directlymap the source language to the targetlanguage , and isthereforehighlydependent on both the source and targetlanguages. A transfer-basedapproach first converts the source languageinto an internalrepresentation (IRs)whichisdependent on the source but not the targetlanguage.The system thentransformIRsinto a formIRtwhichisindependent of the source language and dependsonly on the targetlanguage and finallygenerates the targetlanguage output fromIRt.     18
… The Inter-lingualapproachconverts the input into a single internalrepresentation(IR) thatisindependent of both source and targetlanguages,andthenconvertsfromthisinto the output. 19
Levels of Natural LanguageProcessing Dealingwithnaturallanguagetypicallyrequiresprocessingatvariouslevels.Inincreasingorder of difficulty,they are:- The Lexical Level(or the Word Level) The SyntacticLevel(or the Sentence Level) The SemanticLevel(or the MeaningLevel) The Discourse and PragmaticLevel(or the Conversation ContextLevel). 20
The Lexical Level This level deals withlookingat the input string of characters and seperatingthemintotokens,whichmaybewords,space or punctuation. This levelalso deal with issues likehyphenatedwords,andmisspeltwords. It is the lexical levelwhich tells us that the input ‘’hejoined the parti’’consist of four words of which the last is incorrect. This levelissometimescalled ‘tokenisation’or ‘lexical analysis’. 21
The SyntacticLevel This level deals withidentifying the structure of a sentence,andverifyingwhether a sentence isgrammatically correct. This leveltypicallyconsist of a ‘parser’ which looks at the grammar of the language,and the input sentence,and tries to form a ‘parseTree’. If itcanform a parsetree ,the sentence issyntactically correct and the parsetreegives us the structure and the function of various components. 22
For ex., a typical English sentence wouldconsist of a subject and predicate.Thesubjectisnormally a noun phrase and the predicateis a verbphrase,andso on. The syntacticlevel tells us the sentence ‘’He the party joined’’ is (syntactically) incorrect, eventhougheachword in itis (lexically) correct. 23
           The SemanticLevel This level deals with the meaning of the input and its components. It is the semanticlevelwhich tells us that the sentence ‘’He ate the Party’’ issemanticallyincorrect,thoughitislexically and syntacticallywellformed. In general, semanticanalysisinvolvesknowledge about the world,orat least the relevant aspect of world.  24
The Conversation ContextLevel This level deals with the information carriedacross multiple sentences, and with information thatis not explicit in the input, but isimplicit in the socio-cultural context of the input pessage or conversation. For ex., the expectedanswer to the question ‘’Do you know what the time is?’’issomethinglike ‘’4p.m.’’ , and not just ‘’Yes’’though the latter islexically,syntactically and symanticallyaccurate. 25
Issues in Machine Translation Machine Translation(and Natural LanguageProcessing) is a difficultproblem. There are two mains reasons, which are related to it. The first reasonisthatnaturallanguageishighlyambiguous.Theambiguityoccurat  all levels-lexical,syntactic,semantic and pragmatic.Agivenword or sentence can have more than one meaning.Forex,theword ‘’party’’ couldmean a polyticalparty,or a social event,anddeciding the suitable one in perticular case is crucial to getting right analysis and therefore right translation  26
The second reasonisthatwhenhuman use naturallanguage , they use an enormousamount of commonsense, and knowledge about the world, whichhelps to resolve the ambiguity. For ex., in ‘’He went to the bank,butitwasclosed for lunch’’,wecaninferthat ‘bank’ refers to a financial institution, and not a river bank,becausewe know fromourknowledge of the world thatonly the former type of bankcanbeclosed for lunch.   27
The StatisticalApproach          (Warren Weaver,1949) Theyconsideronly the translation of indivisual sentences. Usually, there are many acceptable translation of a perticular sentence the choiceamongthembeinglargely a matter of taste. Theytake the viewthatevery sentence in one languageis a possible translation of any sentence in the other. 28
Theyassign to every pair of sentences (S,T) a probability P(S/T) ie. Probabilitythat a translator willproduce T in the targetlanguagewhenpresentedwith S in the source language. Given a sentence T in the targetlanguage,theytry to seek the sentences S fromwhich the translator produces T. The chance of errorisminimized by choosingthat sentence S thatismost probable given T. Thus,theywish to choose S so as to maximize P(S/T). 29
UsingBayse’ theorm                                                     P(S/T)  =  P(S).P(T/S) / P(T) The denominator on the right of thisequationdoes not depend on S, and soitsuffices to choose the S thatmaximizes the product P(S)P(T/S) .                                                                      where,                                                                              P(S) is the language model probability of S , and     P(T/S) is the translation probability of T given S. 30
Conclusion Twophenomena have given a new impetus to machine translation work-the globalisation of the world economy, and the explosion of the internet and World Wide Web. Boththesedevelopmentsmeanthatthereis a need for making an immense collection of naturallanguage documents available to multilingual global audience, and translation tools  and system can go a long way in meeting thatneed. 31
The global translation marketisestimated to beat least 12 billion dollars. System thatautomatically translates Kalidasa and Shakespeare maystillbe  a distant dream, but system that translate stock marketreport,weather bulletins and technicalmeasures are a reality today, and will continue to play an increasingly important role in the society of the next millenium. 32
THANK YOU 33

Más contenido relacionado

La actualidad más candente

The translation of metaphor
The translation of metaphorThe translation of metaphor
The translation of metaphor
Amer Minhas
 
Transformational generative grammar
Transformational  generative grammarTransformational  generative grammar
Transformational generative grammar
Baishakhi Amin
 

La actualidad más candente (20)

Phrase structure rules
Phrase structure rulesPhrase structure rules
Phrase structure rules
 
Introduction To Translation Technologies
Introduction To Translation TechnologiesIntroduction To Translation Technologies
Introduction To Translation Technologies
 
Syntactic analysis.pptx
Syntactic analysis.pptxSyntactic analysis.pptx
Syntactic analysis.pptx
 
The translation of metaphor
The translation of metaphorThe translation of metaphor
The translation of metaphor
 
Translation Studies
Translation StudiesTranslation Studies
Translation Studies
 
Generative grammar
Generative grammarGenerative grammar
Generative grammar
 
Translation Technology - a brief but useful introduction
Translation Technology -  a brief but useful introductionTranslation Technology -  a brief but useful introduction
Translation Technology - a brief but useful introduction
 
Development of translation theory (ling)
Development of translation theory (ling)Development of translation theory (ling)
Development of translation theory (ling)
 
Translation Strategies, by Dr. Shadia Y. Banjar
Translation Strategies, by Dr. Shadia Y. BanjarTranslation Strategies, by Dr. Shadia Y. Banjar
Translation Strategies, by Dr. Shadia Y. Banjar
 
Phonetic Form
Phonetic FormPhonetic Form
Phonetic Form
 
Transformational generative grammar
Transformational  generative grammarTransformational  generative grammar
Transformational generative grammar
 
Statistical machine translation
Statistical machine translationStatistical machine translation
Statistical machine translation
 
Machine Translation Introduction
Machine Translation IntroductionMachine Translation Introduction
Machine Translation Introduction
 
Inflection in Morphology (Linguistics)
Inflection in Morphology (Linguistics)Inflection in Morphology (Linguistics)
Inflection in Morphology (Linguistics)
 
Phrase Structure Rules
Phrase Structure RulesPhrase Structure Rules
Phrase Structure Rules
 
Translation
TranslationTranslation
Translation
 
Historical linguistics introduction
Historical linguistics introductionHistorical linguistics introduction
Historical linguistics introduction
 
Language & mind
Language &  mindLanguage &  mind
Language & mind
 
What is pragmatics ppt final
What is pragmatics ppt finalWhat is pragmatics ppt final
What is pragmatics ppt final
 
Transformational Grammar
Transformational GrammarTransformational Grammar
Transformational Grammar
 

Destacado

Artificial Intelligence
Artificial IntelligenceArtificial Intelligence
Artificial Intelligence
vini89
 
NLP and its applications
NLP and its applicationsNLP and its applications
NLP and its applications
Utphala P
 
Natural Language Processing: Definition and Application
Natural Language Processing: Definition and ApplicationNatural Language Processing: Definition and Application
Natural Language Processing: Definition and Application
Stephen Shellman
 
Computer Aided Translation
Computer Aided TranslationComputer Aided Translation
Computer Aided Translation
Philipp Koehn
 
Natural language processing 2
Natural language processing 2Natural language processing 2
Natural language processing 2
Tony Vo
 
Similarity based methods for word sense disambiguation
Similarity based methods for word sense disambiguationSimilarity based methods for word sense disambiguation
Similarity based methods for word sense disambiguation
vini89
 
Machine Translation And Computer Assisted Translation
Machine Translation And Computer Assisted TranslationMachine Translation And Computer Assisted Translation
Machine Translation And Computer Assisted Translation
Teritaa
 

Destacado (20)

Statistical machine translation in a few slides
Statistical machine translation in a few slidesStatistical machine translation in a few slides
Statistical machine translation in a few slides
 
Machine Translation: What it is?
Machine Translation: What it is?Machine Translation: What it is?
Machine Translation: What it is?
 
A statistical approach to machine translation
A statistical approach to machine translationA statistical approach to machine translation
A statistical approach to machine translation
 
Towards OpenLogos Hybrid Machine Translation - Anabela Barreiro
Towards OpenLogos Hybrid Machine Translation - Anabela BarreiroTowards OpenLogos Hybrid Machine Translation - Anabela Barreiro
Towards OpenLogos Hybrid Machine Translation - Anabela Barreiro
 
Google services
Google servicesGoogle services
Google services
 
Summary of Rule-based Reordering Space in Statistical Machine Translation
Summary of Rule-based Reordering Space in Statistical Machine TranslationSummary of Rule-based Reordering Space in Statistical Machine Translation
Summary of Rule-based Reordering Space in Statistical Machine Translation
 
7. ebmt based on st sm
7. ebmt based on st sm7. ebmt based on st sm
7. ebmt based on st sm
 
WEBINAR: TAUS Outlook 2013
WEBINAR: TAUS Outlook 2013WEBINAR: TAUS Outlook 2013
WEBINAR: TAUS Outlook 2013
 
TAUS OPEN SOURCE MACHINE TRANSLATION SHOWCASE, Paris, Manuel Herranz, Pangean...
TAUS OPEN SOURCE MACHINE TRANSLATION SHOWCASE, Paris, Manuel Herranz, Pangean...TAUS OPEN SOURCE MACHINE TRANSLATION SHOWCASE, Paris, Manuel Herranz, Pangean...
TAUS OPEN SOURCE MACHINE TRANSLATION SHOWCASE, Paris, Manuel Herranz, Pangean...
 
TAUS USER CONFERENCE 2010, The Deep Hybrid machine translation engine
TAUS USER CONFERENCE 2010, The Deep Hybrid machine translation engineTAUS USER CONFERENCE 2010, The Deep Hybrid machine translation engine
TAUS USER CONFERENCE 2010, The Deep Hybrid machine translation engine
 
Tools for translators: some theory & background
Tools for translators: some theory & backgroundTools for translators: some theory & background
Tools for translators: some theory & background
 
Hcs
HcsHcs
Hcs
 
Artificial Intelligence
Artificial IntelligenceArtificial Intelligence
Artificial Intelligence
 
NLP and its applications
NLP and its applicationsNLP and its applications
NLP and its applications
 
Natural Language Processing: Definition and Application
Natural Language Processing: Definition and ApplicationNatural Language Processing: Definition and Application
Natural Language Processing: Definition and Application
 
Computer Aided Translation
Computer Aided TranslationComputer Aided Translation
Computer Aided Translation
 
Jeeves -natural language interface application
Jeeves -natural language interface applicationJeeves -natural language interface application
Jeeves -natural language interface application
 
Natural language processing 2
Natural language processing 2Natural language processing 2
Natural language processing 2
 
Similarity based methods for word sense disambiguation
Similarity based methods for word sense disambiguationSimilarity based methods for word sense disambiguation
Similarity based methods for word sense disambiguation
 
Machine Translation And Computer Assisted Translation
Machine Translation And Computer Assisted TranslationMachine Translation And Computer Assisted Translation
Machine Translation And Computer Assisted Translation
 

Similar a Machine translation with statistical approach

Prolog (present)
Prolog (present) Prolog (present)
Prolog (present)
Melody Joey
 
Vl3.culture plex presentation
Vl3.culture plex presentationVl3.culture plex presentation
Vl3.culture plex presentation
CameliaN
 
Vl3.cultureplex presentation
Vl3.cultureplex presentationVl3.cultureplex presentation
Vl3.cultureplex presentation
CameliaN
 
Vl3.culture plex presentation
Vl3.culture plex presentationVl3.culture plex presentation
Vl3.culture plex presentation
CameliaN
 
Vl3.lab presentation
Vl3.lab presentationVl3.lab presentation
Vl3.lab presentation
CameliaN
 
Jawaharlal Nehru Technological University Natural Language Processing Capston...
Jawaharlal Nehru Technological University Natural Language Processing Capston...Jawaharlal Nehru Technological University Natural Language Processing Capston...
Jawaharlal Nehru Technological University Natural Language Processing Capston...
write4
 
Jawaharlal Nehru Technological University Natural Language Processing Capston...
Jawaharlal Nehru Technological University Natural Language Processing Capston...Jawaharlal Nehru Technological University Natural Language Processing Capston...
Jawaharlal Nehru Technological University Natural Language Processing Capston...
write5
 

Similar a Machine translation with statistical approach (20)

Computational linguistics
Computational linguisticsComputational linguistics
Computational linguistics
 
A DECADE OF USING HYBRID INFERENCE SYSTEMS IN NLP (2005 – 2015): A SURVEY
A DECADE OF USING HYBRID INFERENCE SYSTEMS IN NLP (2005 – 2015): A SURVEYA DECADE OF USING HYBRID INFERENCE SYSTEMS IN NLP (2005 – 2015): A SURVEY
A DECADE OF USING HYBRID INFERENCE SYSTEMS IN NLP (2005 – 2015): A SURVEY
 
Natural Language Processing Theory, Applications and Difficulties
Natural Language Processing Theory, Applications and DifficultiesNatural Language Processing Theory, Applications and Difficulties
Natural Language Processing Theory, Applications and Difficulties
 
NLPinAAC
NLPinAACNLPinAAC
NLPinAAC
 
Natural language processing
Natural language processingNatural language processing
Natural language processing
 
Natural Language Processing: A comprehensive overview
Natural Language Processing: A comprehensive overviewNatural Language Processing: A comprehensive overview
Natural Language Processing: A comprehensive overview
 
Machine Translation Approaches and Design Aspects
Machine Translation Approaches and Design AspectsMachine Translation Approaches and Design Aspects
Machine Translation Approaches and Design Aspects
 
A Short Introduction To Text-To-Speech Synthesis
A Short Introduction To Text-To-Speech SynthesisA Short Introduction To Text-To-Speech Synthesis
A Short Introduction To Text-To-Speech Synthesis
 
Prolog (present)
Prolog (present) Prolog (present)
Prolog (present)
 
Vl3.culture plex presentation
Vl3.culture plex presentationVl3.culture plex presentation
Vl3.culture plex presentation
 
Vl3.cultureplex presentation
Vl3.cultureplex presentationVl3.cultureplex presentation
Vl3.cultureplex presentation
 
Vl3.culture plex presentation
Vl3.culture plex presentationVl3.culture plex presentation
Vl3.culture plex presentation
 
An Overview Of Natural Language Processing
An Overview Of Natural Language ProcessingAn Overview Of Natural Language Processing
An Overview Of Natural Language Processing
 
Vl3.lab presentation
Vl3.lab presentationVl3.lab presentation
Vl3.lab presentation
 
NL Context Understanding 23(6)
NL Context Understanding 23(6)NL Context Understanding 23(6)
NL Context Understanding 23(6)
 
Jawaharlal Nehru Technological University Natural Language Processing Capston...
Jawaharlal Nehru Technological University Natural Language Processing Capston...Jawaharlal Nehru Technological University Natural Language Processing Capston...
Jawaharlal Nehru Technological University Natural Language Processing Capston...
 
Jawaharlal Nehru Technological University Natural Language Processing Capston...
Jawaharlal Nehru Technological University Natural Language Processing Capston...Jawaharlal Nehru Technological University Natural Language Processing Capston...
Jawaharlal Nehru Technological University Natural Language Processing Capston...
 
Untitled presentation.pdf
Untitled presentation.pdfUntitled presentation.pdf
Untitled presentation.pdf
 
AN INTERSEMIOTIC TRANSLATION OF NORMATIVE UTTERANCES TO MACHINE LANGUAGE
AN INTERSEMIOTIC TRANSLATION OF NORMATIVE UTTERANCES TO MACHINE LANGUAGEAN INTERSEMIOTIC TRANSLATION OF NORMATIVE UTTERANCES TO MACHINE LANGUAGE
AN INTERSEMIOTIC TRANSLATION OF NORMATIVE UTTERANCES TO MACHINE LANGUAGE
 
An Intersemiotic Translation of Normative Utterances to Machine Language
An Intersemiotic Translation of Normative Utterances to Machine LanguageAn Intersemiotic Translation of Normative Utterances to Machine Language
An Intersemiotic Translation of Normative Utterances to Machine Language
 

Más de vini89

Fuzzy logic
Fuzzy logicFuzzy logic
Fuzzy logic
vini89
 
Artificial Intelligence
Artificial IntelligenceArtificial Intelligence
Artificial Intelligence
vini89
 
Ai presentation
Ai presentationAi presentation
Ai presentation
vini89
 
Similarity based methods for word sense disambiguation
Similarity based methods for word sense disambiguationSimilarity based methods for word sense disambiguation
Similarity based methods for word sense disambiguation
vini89
 

Más de vini89 (7)

Fuzzy logic
Fuzzy logicFuzzy logic
Fuzzy logic
 
Ann
Ann Ann
Ann
 
Artificial Intelligence
Artificial IntelligenceArtificial Intelligence
Artificial Intelligence
 
Ai
Ai Ai
Ai
 
Ai presentation
Ai presentationAi presentation
Ai presentation
 
Similarity based methods for word sense disambiguation
Similarity based methods for word sense disambiguationSimilarity based methods for word sense disambiguation
Similarity based methods for word sense disambiguation
 
Mycin
MycinMycin
Mycin
 

Último

1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
QucHHunhnh
 
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdf
Chris Hunter
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch Letter
MateoGardella
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
heathfieldcps1
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
ciinovamais
 

Último (20)

Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
Mattingly "AI & Prompt Design: Structured Data, Assistants, & RAG"
 
1029-Danh muc Sach Giao Khoa khoi 6.pdf
1029-Danh muc Sach Giao Khoa khoi  6.pdf1029-Danh muc Sach Giao Khoa khoi  6.pdf
1029-Danh muc Sach Giao Khoa khoi 6.pdf
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
Presentation by Andreas Schleicher Tackling the School Absenteeism Crisis 30 ...
 
Making and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdfMaking and Justifying Mathematical Decisions.pdf
Making and Justifying Mathematical Decisions.pdf
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
fourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writingfourth grading exam for kindergarten in writing
fourth grading exam for kindergarten in writing
 
psychiatric nursing HISTORY COLLECTION .docx
psychiatric  nursing HISTORY  COLLECTION  .docxpsychiatric  nursing HISTORY  COLLECTION  .docx
psychiatric nursing HISTORY COLLECTION .docx
 
Gardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch LetterGardella_PRCampaignConclusion Pitch Letter
Gardella_PRCampaignConclusion Pitch Letter
 
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
Explore beautiful and ugly buildings. Mathematics helps us create beautiful d...
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Class 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdfClass 11th Physics NEET formula sheet pdf
Class 11th Physics NEET formula sheet pdf
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Unit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptxUnit-IV; Professional Sales Representative (PSR).pptx
Unit-IV; Professional Sales Representative (PSR).pptx
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 

Machine translation with statistical approach

  • 2. Whatis the machine translation??? Machine translation is the study of designingsystemsthat translate from one humanlanguage in to another. Machine translation system essentiallytakes a text in one language (called the source language), and translate itintoanotherlanguage(calledtargetlanguage). The source and targetlanguage are naturallanguagessuch as english and hindi. 2
  • 3. Contd…….. This is the hard problem, sinceprocessingnaturallanguagerequiresworkatseverallevles, and complexities and ambiguitiesariesateach of thoselevles. Hence an MT system canbesaid to bedoingnaturallanguageprocessing(NLP).In fact,most machine translation application requiressomedegree of naturallanguageunderstanding to do the translation. 3
  • 4. History of Machine Translation Machine translation as a discipline dates back to the earlynineteen-fifties. The complexity of the problemwasoriginallyunderestimated, and someearlysuccessfuldemonstrations of experimental system lead to unrealisticexpectionswhichwere hard to fulfil. In the early eighties, the JapaneseFifthGenerationComputing Project revivedinterest in thiswork. The currentapproach to MT is more pragmatic and realistic. 4
  • 5. Contd…. It isnowwidelyacceptedthatfullyautomatic, general-purpose , highquality machine translation is a verydifficultproblem, but veryuseful and pratical system canneverthelessbedeveloped by realxing one or more of thesecriteria,andseveralusefulsystems have been built by doingso,and are in use today. Suchsystems are beingused to translate public announcements,weather bulletins, technical documents, and web pages. 5
  • 6. Contd.. Some machine translation services are starting to becomeavailable on the world wide web. For example,the web page of the Google searchenginealsoprovides a translation service thatcan translate simple sentences among a handful of languages. 6
  • 7. Translation telephonetechnology(speech to speech translation) The ‘Janus’ projectat the Interactive System lab, Carnegie Mellon University, Is working on set of translation project. You dial yourcolleague in tokyo. You do not speakJapanese, and hedoes not speakenglish.Soyouneed system suchthatyouspeakinto the phone in english, whichautomaticallygets translate intojapanese for him, he replies in japanese, and youhearit in english. 7
  • 8. Research MT System Example:thejanustranlsating Phone project This prototype system allowstwousers to communicate in a givendomain via a videoconferencingconnection. Each party sees the other conversant, hearshis/herorginalvoicesees/hears translation of whathe/shesays as subtitles, caption and synthetic speech. The situation iscooperative, That isbothuserswant to understandeachother and collaborate via the system to achieveunderstanding. 8
  • 9. Contd…. After the record buttonisactivated, the station acceptsspoken input and produces a paraphrase of the input sentence first. Once the user has verifiedthat the system properlyunderstood the intendedmeaning, he/sheactivate the sendbutton to send a translation of thisintendedmeaning to the otherside in the desiredlanguage. Various interactive correction mechanismsfacilitate quick recovery, should possible processingerros and miscommunication have altered the intendedmeaning. 9
  • 10. Machine Translation & Artificial Intelligence MT is an important sub-discipline of the widerfield of Artificial Intelligence(AI). AI(amongotherthings)deals withgetting machine to exhibit intelligent behaviour. As wemightimagine,both AI and MT are interesting and challengingfields. 10
  • 11. Component of MT Wecandivide the machine translation taskintothree main phases:- The system has to first analyse the source language input to createsomeinternalrepresentetion. It thentypicallymanipulatesthisinternalrepresentationtotransferit to a formsuitable for a targetlanguage. Finally,itgenerates the output in the targetlanguage. 11
  • 12. Analysis Transfer Generation Source Language Target Language Intermediate Representation based on source language Intermediate Representation based on target language 12
  • 13. Contd… A typical MT system contains components for analysis ,transfer and generation as shown in diagram. These components incorporate a lot of knowledge about words(Lexical Knowledge), and about the language (LinguisticKnowledge). Suchknowledgeisstored in one or more lexicons ,and possiblyother sources of linguisticknowledge ,such as grammar. 13
  • 14. Contd… The user interface isinvariably a crucial part of most MT system. The interface allows user to verify,disambiguate and if necessary correct the output of the system. Anothercommonfeature of NLP workis use of large ‘corpora’. A corpus is a large collection of textwhichisused for acquiring the required lexical and linguisticknowledge. 14
  • 15. Contd… Somesystemsprefer to split the lexiconinto a source lexicon, a targetlexicon,and a transferlexiconthatmapsbetween the two. An MT lexicontypicallyneeds to bemuch more formal,precise and elaboratethan a typicalhumandictionary,sinceitismeant for mechanicalprocessing,and not for reading by humans. The lexiconplays a central role in modern MT system. 15
  • 16. Lexicon The lexiconis an important component of MT system. A lexiconcontains all the relevant information about words and phrases thatisrequired for the variouslevels of analysis and generation. A typicallexicon entry for a wordwouldcontain the following information about the word:the part of speech,information about the equivalentword in the targetlanguage. 16
  • 17. Approaches to MT Based on how closely the internalrepresentationdepends on the source and targetlanguages,approaches to MT canbedividedintothree major classes- Direct. Transfer-based. Inter-lingual. 17
  • 18. A direct MT system tries to directlymap the source language to the targetlanguage , and isthereforehighlydependent on both the source and targetlanguages. A transfer-basedapproach first converts the source languageinto an internalrepresentation (IRs)whichisdependent on the source but not the targetlanguage.The system thentransformIRsinto a formIRtwhichisindependent of the source language and dependsonly on the targetlanguage and finallygenerates the targetlanguage output fromIRt. 18
  • 19. … The Inter-lingualapproachconverts the input into a single internalrepresentation(IR) thatisindependent of both source and targetlanguages,andthenconvertsfromthisinto the output. 19
  • 20. Levels of Natural LanguageProcessing Dealingwithnaturallanguagetypicallyrequiresprocessingatvariouslevels.Inincreasingorder of difficulty,they are:- The Lexical Level(or the Word Level) The SyntacticLevel(or the Sentence Level) The SemanticLevel(or the MeaningLevel) The Discourse and PragmaticLevel(or the Conversation ContextLevel). 20
  • 21. The Lexical Level This level deals withlookingat the input string of characters and seperatingthemintotokens,whichmaybewords,space or punctuation. This levelalso deal with issues likehyphenatedwords,andmisspeltwords. It is the lexical levelwhich tells us that the input ‘’hejoined the parti’’consist of four words of which the last is incorrect. This levelissometimescalled ‘tokenisation’or ‘lexical analysis’. 21
  • 22. The SyntacticLevel This level deals withidentifying the structure of a sentence,andverifyingwhether a sentence isgrammatically correct. This leveltypicallyconsist of a ‘parser’ which looks at the grammar of the language,and the input sentence,and tries to form a ‘parseTree’. If itcanform a parsetree ,the sentence issyntactically correct and the parsetreegives us the structure and the function of various components. 22
  • 23. For ex., a typical English sentence wouldconsist of a subject and predicate.Thesubjectisnormally a noun phrase and the predicateis a verbphrase,andso on. The syntacticlevel tells us the sentence ‘’He the party joined’’ is (syntactically) incorrect, eventhougheachword in itis (lexically) correct. 23
  • 24. The SemanticLevel This level deals with the meaning of the input and its components. It is the semanticlevelwhich tells us that the sentence ‘’He ate the Party’’ issemanticallyincorrect,thoughitislexically and syntacticallywellformed. In general, semanticanalysisinvolvesknowledge about the world,orat least the relevant aspect of world. 24
  • 25. The Conversation ContextLevel This level deals with the information carriedacross multiple sentences, and with information thatis not explicit in the input, but isimplicit in the socio-cultural context of the input pessage or conversation. For ex., the expectedanswer to the question ‘’Do you know what the time is?’’issomethinglike ‘’4p.m.’’ , and not just ‘’Yes’’though the latter islexically,syntactically and symanticallyaccurate. 25
  • 26. Issues in Machine Translation Machine Translation(and Natural LanguageProcessing) is a difficultproblem. There are two mains reasons, which are related to it. The first reasonisthatnaturallanguageishighlyambiguous.Theambiguityoccurat all levels-lexical,syntactic,semantic and pragmatic.Agivenword or sentence can have more than one meaning.Forex,theword ‘’party’’ couldmean a polyticalparty,or a social event,anddeciding the suitable one in perticular case is crucial to getting right analysis and therefore right translation 26
  • 27. The second reasonisthatwhenhuman use naturallanguage , they use an enormousamount of commonsense, and knowledge about the world, whichhelps to resolve the ambiguity. For ex., in ‘’He went to the bank,butitwasclosed for lunch’’,wecaninferthat ‘bank’ refers to a financial institution, and not a river bank,becausewe know fromourknowledge of the world thatonly the former type of bankcanbeclosed for lunch. 27
  • 28. The StatisticalApproach (Warren Weaver,1949) Theyconsideronly the translation of indivisual sentences. Usually, there are many acceptable translation of a perticular sentence the choiceamongthembeinglargely a matter of taste. Theytake the viewthatevery sentence in one languageis a possible translation of any sentence in the other. 28
  • 29. Theyassign to every pair of sentences (S,T) a probability P(S/T) ie. Probabilitythat a translator willproduce T in the targetlanguagewhenpresentedwith S in the source language. Given a sentence T in the targetlanguage,theytry to seek the sentences S fromwhich the translator produces T. The chance of errorisminimized by choosingthat sentence S thatismost probable given T. Thus,theywish to choose S so as to maximize P(S/T). 29
  • 30. UsingBayse’ theorm P(S/T) = P(S).P(T/S) / P(T) The denominator on the right of thisequationdoes not depend on S, and soitsuffices to choose the S thatmaximizes the product P(S)P(T/S) . where, P(S) is the language model probability of S , and P(T/S) is the translation probability of T given S. 30
  • 31. Conclusion Twophenomena have given a new impetus to machine translation work-the globalisation of the world economy, and the explosion of the internet and World Wide Web. Boththesedevelopmentsmeanthatthereis a need for making an immense collection of naturallanguage documents available to multilingual global audience, and translation tools and system can go a long way in meeting thatneed. 31
  • 32. The global translation marketisestimated to beat least 12 billion dollars. System thatautomatically translates Kalidasa and Shakespeare maystillbe a distant dream, but system that translate stock marketreport,weather bulletins and technicalmeasures are a reality today, and will continue to play an increasingly important role in the society of the next millenium. 32