solutions




     Dispelling the myths of
     machine translation
     It is not surprising that myths, half-truths, and misunderstandings abound regarding machine translation: It seems
     as if the experience most players in the translation field have with this technology does not go beyond toying a little
     with one of the free online translation tools. Almost every week, I come across an article informing its readers either
     that machine translation is and always will be a complete waste of time or that machine translation, while being
     a waste of time today, might actually be useful some time in the distant future. In the hope of setting the record
     straight, here is a closer look at some of the most common myths about machine translation.




     Photo: Vasiliy Koval



     22                                                                                                                        AUGUST 2008







By Uwe Muegge

Myth: Machine translation simply does not work

With free online translation services available all over the web, anyone can run a text through a machine translation (MT) engine and then share the results with the public as proof of the fact that machine translation is capable of little more than the most rudimentary rough translations (gisting), and, of course, providing nearly endless entertainment.

The main problem with these ‘tests’ is that using any of the free online translation environments gives only a glimpse of the true power of a full-fledged professional machine translation system. For example, the typical online translation service does not allow users to select a subject field or provide user terminology, let alone set stylistic preferences. In fact, many - if not most - of the free text translation tools support no translation parameters other than the specification of the language pair and the source text. No wonder that the translations these machine translation websites produce can be so ridiculously off target.

Fact: Machine translation improves the productivity and consistency of human translators

Whenever new source text for a project is created, that text will have to be translated at some point. Even when you work in what is considered a state-of-the-art globalization environment, i.e. an integrated content management/translation workflow system, you will end up with a certain percentage of low match/no match sentences. In a well-planned and well-managed globalization project where writers, as well as software developers, use a comprehensive project glossary, as well as a style guide aimed at easy readability/comprehensibility, the low/no match sentences can be pre-translated in a machine translation system before being edited by human translators.

Benefits of machine-generated pre-translation:
• Translators always have a proposal to work with instead of starting each new translation from scratch. A representative case study recently conducted at Symantec indicates that the productivity of human translators can double when unknown sentences are pre-translated in a machine translation system. [1]
• While most translations will require some editing and many even rewriting, it is fair to expect that a considerable percentage of machine-generated translations turn out to be perfect (this is especially true for short instructions, headings, legends, and the like).
• At a minimum, key terms will be translated correctly and consistently. And not only that, in most cases these terms will also be inflected correctly and appear in the correct singular or plural form (try to do that with your translation memory!)

[Figure: Example of a German>English machine translation from the author’s website www.muegge.cc]

Fact: Machine translation enables the translation of material that would otherwise not be translated

Very few organizations, if any, currently translate all materials that would benefit from translation into all the languages spoken by all of their current or future customers. The primary reason for this is that for many types of documents, especially in the after-sales domain, the budget is simply not available for large-scale human translation.

A number of organizations are using machine translation solutions for making large volumes of text available to their global customers in their local language without involving any human translators in the process. The Microsoft Knowledge Base, which contains more than 200,000 documents in English, is a well-known example of a text repository where the number of machine-translated documents by far exceeds the number of those translated by humans.

[Figure: German search page for the Microsoft Knowledge Base with machine translation option enabled]

Myth: Machine translation systems can only handle word-for-word translation

The belief that machine translation is basically limited to the sequential substitution of words in the source language with words in the target language is as widely-held as it is wrong. All popular machine translation systems, including the free online translation services such as systransoft.com, translate.google.com, and windowslivetranslator.com employ highly sophisticated algorithms that are the result of years of research and development.

Fact: There is not one but many very different machine translation technologies that are all capable of producing excellent translation results in the right environment

Machine translation has been around for more than 50 years, and during this half century a wide range of MT technologies have evolved, e.g. dictionary-based, rules-based, example-based, statistical - plus countless hybrid forms. Here is a brief discussion of the three machine translation technologies that are most relevant for commercial applications today.





Rules-based Machine Translation

Rules-based machine translation, also known as transfer machine translation, is the dominant MT paradigm today. Systran, Babelfish, promt, to name just a few, are all rules-based systems. Rules-based MT systems use a three-stage translation process:
1. Analysis: Parses the source sentence to create a tree of the syntactic structure of that sentence.
2. Transfer: Converts the syntactic tree for the source language into the corresponding tree for the target language.
3. Generation: Populates the target tree with corresponding words to create a sentence in the target language.

Benefits of rules-based machine translation include:
• Mature, proven technology that can be implemented quickly and at relatively low cost.
• Many commercial systems available covering many language combinations.
• Highly customizable through dictionary and style settings (some systems also support the customization of the rules base).

Rules-based machine translation systems have been in use in commercial settings for many years, e.g. at Autodesk, Daimler, and the European Commission’s Translation Service. The two primary challenges for rules-based MT are first, that the rules base of any system is by necessity limited, meaning that for best results, authors need to adjust their writing style, and second, while commercial rules-based machine translation packages are available for dozens of language combinations, many languages are still not covered.

Statistical Machine Translation

Statistical machine translation (SMT) is getting a lot of media attention these days, especially after Microsoft announced that it is using a proprietary SMT system to translate its huge Knowledge Base document repository [2] and Google won a large-scale machine translation evaluation contest with its statistical machine translation engine. [3]

Statistical machine translation systems typically consist of two major components:
• Translation Model: Generates translation proposals based on corresponding word sequences in aligned source and target training data.
• Language Model: Selects the best translation proposal based on training data in the target language only.

The good news about statistical machine translation is that once an SMT system has been trained on customer-specific data, this is the MT technology that typically produces the highest translation quality. On the flip side, that training effort requires a substantial body of existing translations: Language Weaver, the leading vendor of statistical machine translation systems, recommends a bilingual corpus of two million words or more per language pair. Because of the demanding training requirements, combined with the fact that statistical machine translation systems tend to have a higher sticker price than some of the rules-based systems, this MT technology is primarily used by government agencies – the intelligence community in particular – and large corporations.

Direct Machine Translation

In its most primitive form, the only thing a direct machine translation system does is to replace the words in the source language with words in the target language – in the same sequence and without any linguistic analysis or processing. The only resource direct machine translation uses is a bilingual dictionary, which is why this MT technology is also known as dictionary-driven machine translation.

Due to this rather unsophisticated technology, direct machine translation has been considered obsolete for many years, and there are hardly any commercial products available that use direct MT. Despite its limited capabilities, I strongly believe that direct machine translation still has a place in today’s arsenal of automated translation tools. For a number of common real-world applications, word-for-word or phrase-for-phrase substitution is all that is required for successful translation. Think of domains where both vocabulary and syntax are standardized, as is the case with weather reports, financial profiles, and many e-commerce applications.

In one recent implementation, Medtronic, a large medical device manufacturer, used direct machine translation to translate a large product database into multiple languages. [4] Human translation was not an option for this project because






of cost and, yes, quality concerns (an analysis of previous human translation projects indicated an unacceptably high error rate among numeric values such as product numbers and dimensions). Also, initial tests had shown that both translation memories and rules-based machine translation systems produced poor results with text that has the following characteristics:
– little or no repetition on the sentence level;
– high repetition on the word/phrase level;
– telegraphic/elliptic style, e.g. ‘winds from southerly direction, speed reaching 55 km/h’, ‘American Technology Associates (AMTA) strong buy, Avion (AVIO) market outperform’, or ‘plate 2456dr15 right-angled, slotted, 15 ea’.

This type of translation project is most definitely among those that any self-respecting human translator could easily do without. And since direct machine translation does not require human post-editing in a best case scenario, using MT in this kind of environment might for once be welcomed by translators (who would hate to do these translations themselves) and translation buyers (who would love the idea of almost instant, almost free translations).

Myth: Machine translation is only for large organizations

Yes, it is true: If you read any success stories about machine translation, they typically come from the Caterpillars, Microsofts, and Symantecs of this world. But that is true for many - if not most - emerging technologies. It is also true that some of the most powerful machine translation systems in use today are the result of the multi-million dollar research and development programs only corporate giants can afford. But that does not mean you have to spend big bucks to deploy a machine translation solution.

Fact: Being both affordable and user-friendly, many machine translation packages are available for even the smallest of businesses, including freelancers

If you do a little research, you will find that many commercial machine translation packages are in the same price range as their translation memory counterparts, and that is mostly true for both workstation solutions for single users and client-server solutions for many users. And the secret is out that while corporate and small-business versions may differ in many ways, the core translation engine is typically the same in both products. In other words: In terms of out-of-the-box translation quality, there is generally little if any difference between the 1,000 dollar professional version and the 50,000 dollar corporate version of a given machine translation product.

In addition, the developers of commercial machine translation systems have invested heavily in making their products as intuitive to use as possible. In fact, I would even say that it is easier – and certainly faster – to produce your first translation with a typical MT product than it is with the typical translation memory tool.

A few more facts to consider:
• Many low-priced machine translation products feature a built-in translation memory (TM) module to improve the efficiency of the post-editing process (‘never correct the same mistake twice’), and a few MT systems like promt Expert offer seamless integration with the SDL Trados translation memory system.
• A number of translation tools vendors, such as Across, that cater to small and mid-sized companies, offer TM-MT system bundles and/or MT integration via API.
• User education and MT system customization (e.g. building dictionaries), which are major factors in achieving the best possible translation results, are often easier to accomplish in smaller organizations than in larger ones.

The bottom line

Since its inception, machine translation has been a highly controversial technology, and it will probably continue to be so for some time. Much of this controversy is based on false assumptions about what machine translation can do and who might benefit from using this type of technology. Let me say it loud and clear: In general, the commercial machine translation systems available today cannot replace human translators, especially when those MT systems are operated by users who have no linguistic background. However, when the goal is to improve the efficiency of the human translation process or to create comprehensible translations in environments where human translation is not an option, and when these systems are operated by trained and motivated translation professionals, then machine translation is and has been a very powerful solution.

Sources:
[1] Systran Software Inc. 2007. Systran Case Study: Symantec. Systran Software Inc. Web site. [Online] 2007. [Cited: June 6, 2008.] www.systransoft.com/download/case-studies/2007.12.Symantec.pdf.
[2] Microsoft Corporation. 2008. Machine Translation - Home. Microsoft Corporation Web site. [Online] 2008. [Cited: June 6, 2008.] http://research.microsoft.com/nlp/projects/mtproj.aspx.
[3] Institute of Standards and Technology. 2006. NIST 2006 Machine Translation Evaluation Official Results. National Institute of Standards and Technology Web site. [Online] November 1, 2006. [Cited: June 6, 2008.] http://www.nist.gov/speech/tests/mt/2006/doc/mt06eval_official_results.html.
[4] Muegge, Uwe. 2006. Fully Automatic High Quality Machine Translation of Restricted Text: A Case Study. London: The Association of Information Management (Aslib), 2006. Proceedings of the Twenty-eighth International Conference on Translating and the Computer. ISBN 978-0-85142-5.

contact

Uwe Muegge is the corporate terminologist at Medtronic, a manufacturer of medical technology. He serves in ISO Technical Committee 37 SC3 Computer Applications in Terminology and teaches Terminology Management and Computer-Assisted Translation at the Monterey Institute of International Studies in Monterey, California.

info@muegge.cc
www.muegge.cc
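
To make the idea of direct (dictionary-driven) machine translation more concrete, here is a minimal sketch in Python of phrase-for-phrase substitution applied to the weather-report example quoted in the article. The phrase dictionary, its German entries, and the function name `build_translator` are assumptions made purely for illustration; they are not taken from the Medtronic project or from any commercial product.

```python
import re

# Hypothetical English->German phrase dictionary (illustrative entries only).
# Keys are lowercase source phrases; values are target phrases.
PHRASES = {
    "winds from": "Wind aus",
    "southerly direction": "südlicher Richtung",
    "speed reaching": "Geschwindigkeit bis",
}


def build_translator(phrases):
    """Return a function that substitutes known phrases and copies the rest.

    Longer phrases are listed first so that they win over shorter ones
    starting at the same position. Word order is never changed, and no
    linguistic analysis takes place.
    """
    ordered = sorted(phrases, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(p) for p in ordered) + r")\b",
        re.IGNORECASE,
    )

    def translate(text):
        # Anything not in the dictionary (numbers, units, part numbers)
        # passes through unchanged.
        return pattern.sub(lambda m: phrases[m.group(0).lower()], text)

    return translate


translate = build_translator(PHRASES)
print(translate("winds from southerly direction, speed reaching 55 km/h"))
```

Because numeric values and units are simply copied from source to target, a sketch like this cannot mistype them, which is in line with the article's remark about error rates among numeric values.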

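
The three-stage process described under 'Rules-based Machine Translation' (analysis, transfer, generation) can be sketched in a few lines of Python. This is a toy under stated assumptions: the grammar covers only English noun phrases of the form determiner + adjective + noun, the target language is French, and the tag table, lexicon, and function names are all hypothetical. Real rules-based systems use far larger rule bases and handle agreement, which this sketch sidesteps by listing only feminine nouns.

```python
# Hypothetical toy resources: part-of-speech tags and an English->French lexicon.
POS_TAGS = {"the": "Det", "red": "Adj", "green": "Adj", "car": "N", "house": "N"}
LEXICON = {"the": "la", "red": "rouge", "green": "verte",
           "car": "voiture", "house": "maison"}


def analyse(sentence):
    """Stage 1: build a (very flat) syntactic tree for the source sentence."""
    leaves = [(POS_TAGS.get(word, "?"), word) for word in sentence.lower().split()]
    if [tag for tag, _ in leaves] != ["Det", "Adj", "N"]:
        raise ValueError("toy grammar only covers: Det Adj N")
    return ("NP", leaves)


def transfer(tree):
    """Stage 2: convert the source tree into the target-language structure.

    The single rule here moves the adjective after the noun.
    """
    label, (det, adj, noun) = tree
    return (label, [det, noun, adj])


def generate(tree):
    """Stage 3: populate the target tree with target-language words."""
    _, leaves = tree
    return " ".join(LEXICON[word] for _, word in leaves)


def translate(sentence):
    return generate(transfer(analyse(sentence)))


print(translate("the red car"))
print(translate("the green house"))
```

Any input outside the tiny grammar raises an error, which is a small-scale picture of the limitation the article mentions: the rules base is by necessity limited, so authors get the best results when they write within it.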

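
The two components named under 'Statistical Machine Translation' can likewise be illustrated with a small Python sketch. It assumes a word-level phrase table with made-up probabilities as the translation model and an add-one-smoothed bigram model trained on three made-up English sentences as the language model; candidates are ranked by the sum of both log scores, which is one common way of combining the two. None of the numbers or names come from a real system.

```python
import math
from collections import Counter
from itertools import product

# Hypothetical translation model: P(English word | German word).
PHRASE_TABLE = {
    "das": {"the": 0.7, "that": 0.3},
    "haus": {"home": 0.55, "house": 0.45},
    "ist": {"is": 1.0},
    "klein": {"small": 0.6, "little": 0.4},
}

# Hypothetical target-language-only training data for the language model.
TARGET_CORPUS = ["the house is small", "the house is old", "that is the house"]


def train_bigram_lm(sentences):
    """Count bigrams and their left contexts; return counts plus vocabulary size."""
    bigrams, contexts, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        for prev, word in zip(tokens, tokens[1:]):
            bigrams[(prev, word)] += 1
            contexts[prev] += 1
    return bigrams, contexts, len(vocab)


def lm_logprob(words, lm):
    """Log probability of a candidate under the add-one-smoothed bigram model."""
    bigrams, contexts, vocab_size = lm
    tokens = ["<s>"] + list(words) + ["</s>"]
    return sum(
        math.log((bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size))
        for prev, word in zip(tokens, tokens[1:])
    )


def proposals(source_words):
    """Translation model: yield every candidate with its log probability."""
    options = [PHRASE_TABLE[word].items() for word in source_words]
    for combo in product(*options):
        words = tuple(target for target, _ in combo)
        yield words, sum(math.log(p) for _, p in combo)


def translate(source, lm):
    """Pick the candidate with the best combined score from both models."""
    scored = [
        (tm_score + lm_logprob(words, lm), words)
        for words, tm_score in proposals(source.lower().split())
    ]
    return " ".join(max(scored)[1])


lm = train_bigram_lm(TARGET_CORPUS)
print(translate("das Haus ist klein", lm))
```

In this toy setup the phrase table on its own slightly prefers "home" for "Haus", but the language model has seen "the house is" in its training data and pulls the final choice toward "house". That is the division of labour the article describes: proposals come from aligned bilingual data, and the choice among them draws on target-language data.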

Más contenido relacionado

La actualidad más candente

Computer Programming: Chapter 1
Computer Programming: Chapter 1Computer Programming: Chapter 1
Computer Programming: Chapter 1Atit Patumvan
 
Controlled Language
Controlled LanguageControlled Language
Controlled LanguageUwe Muegge
 
Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...SDL
 
Programming languages and paradigms
Programming languages and paradigmsProgramming languages and paradigms
Programming languages and paradigmsJohn Paul Hallasgo
 
Improving the role of language model in statistical machine translation (Indo...
Improving the role of language model in statistical machine translation (Indo...Improving the role of language model in statistical machine translation (Indo...
Improving the role of language model in statistical machine translation (Indo...IJECEIAES
 
Theory Psyco
Theory PsycoTheory Psyco
Theory Psycodidip
 
The Latest Advances in Patent Machine Translation
The Latest Advances in Patent Machine TranslationThe Latest Advances in Patent Machine Translation
The Latest Advances in Patent Machine TranslationIconic Translation Machines
 
D turner techreport
D turner techreportD turner techreport
D turner techreportdavid114811
 

La actualidad más candente (20)

SYSTEM DEVELOPMENT
SYSTEM DEVELOPMENTSYSTEM DEVELOPMENT
SYSTEM DEVELOPMENT
 
Richa garg itm
Richa garg itmRicha garg itm
Richa garg itm
 
Computer Programming: Chapter 1
Computer Programming: Chapter 1Computer Programming: Chapter 1
Computer Programming: Chapter 1
 
Controlled Language
Controlled LanguageControlled Language
Controlled Language
 
Fundamentals of Programming Chapter 2
Fundamentals of Programming Chapter 2Fundamentals of Programming Chapter 2
Fundamentals of Programming Chapter 2
 
Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...
 
Computer
ComputerComputer
Computer
 
Programming languages and paradigms
Programming languages and paradigmsProgramming languages and paradigms
Programming languages and paradigms
 
Rise of software
Rise of softwareRise of software
Rise of software
 
Computer Programming - Lecture 1
Computer Programming - Lecture 1Computer Programming - Lecture 1
Computer Programming - Lecture 1
 
How to Translate from English to Khmer using Moses
How to Translate from English to Khmer using MosesHow to Translate from English to Khmer using Moses
How to Translate from English to Khmer using Moses
 
Programming languages
Programming languagesProgramming languages
Programming languages
 
Improving the role of language model in statistical machine translation (Indo...
Improving the role of language model in statistical machine translation (Indo...Improving the role of language model in statistical machine translation (Indo...
Improving the role of language model in statistical machine translation (Indo...
 
IRJET- Vocal Code
IRJET- Vocal CodeIRJET- Vocal Code
IRJET- Vocal Code
 
Theory Psyco
Theory PsycoTheory Psyco
Theory Psyco
 
Languages in computer
Languages in computerLanguages in computer
Languages in computer
 
The Latest Advances in Patent Machine Translation
The Latest Advances in Patent Machine TranslationThe Latest Advances in Patent Machine Translation
The Latest Advances in Patent Machine Translation
 
Ppl 13 july2019
Ppl 13 july2019Ppl 13 july2019
Ppl 13 july2019
 
TermWiki
TermWikiTermWiki
TermWiki
 
D turner techreport
D turner techreportD turner techreport
D turner techreport
 

Similar a Machine Translation

Man vs. Machine: A Guide to Understanding Translation Technology in Modern Bu...
Man vs. Machine: A Guide to Understanding Translation Technology in Modern Bu...Man vs. Machine: A Guide to Understanding Translation Technology in Modern Bu...
Man vs. Machine: A Guide to Understanding Translation Technology in Modern Bu...Language Department
 
The Role Of Translators In MT: EU 2010
The Role Of Translators In MT:  EU 2010The Role Of Translators In MT:  EU 2010
The Role Of Translators In MT: EU 2010LoriThicke
 
Machine Translation: What it is?
Machine Translation: What it is?Machine Translation: What it is?
Machine Translation: What it is?Multilizer
 
Muegge_Do-it-yourself MT_Taking statistical machine translation to the next l...
Muegge_Do-it-yourself MT_Taking statistical machine translation to the next l...Muegge_Do-it-yourself MT_Taking statistical machine translation to the next l...
Muegge_Do-it-yourself MT_Taking statistical machine translation to the next l...Uwe Muegge
 
Lexcelera MT Breaking Compromises
Lexcelera MT Breaking CompromisesLexcelera MT Breaking Compromises
Lexcelera MT Breaking CompromisesLoriThicke
 
Creating a compiler for your own language
Creating a compiler for your own languageCreating a compiler for your own language
Creating a compiler for your own languageAndrea Tino
 
Zerfass trends in translation technologies
Zerfass trends in translation technologiesZerfass trends in translation technologies
Zerfass trends in translation technologiesascetlan
 
Instant speech translation 10BM60080 - VGSOM
Instant speech translation   10BM60080 - VGSOMInstant speech translation   10BM60080 - VGSOM
Instant speech translation 10BM60080 - VGSOMsathiyaseelanm
 
Savings of 83% thanks to CAT tools... [case study]
Savings of 83% thanks to CAT tools... [case study]Savings of 83% thanks to CAT tools... [case study]
Savings of 83% thanks to CAT tools... [case study]Tradas
 
Lingotek Translation Platform
Lingotek Translation PlatformLingotek Translation Platform
Lingotek Translation Platformjdfoote
 
Machine Translation Master Class at the EUATC Conference by Diego Bartolome
Machine Translation Master Class at the EUATC Conference by Diego BartolomeMachine Translation Master Class at the EUATC Conference by Diego Bartolome
Machine Translation Master Class at the EUATC Conference by Diego Bartolometauyou
 
MT(1).pdf
MT(1).pdfMT(1).pdf
MT(1).pdfs n
 
Trends In Technology: Worldware 2010
Trends In Technology:  Worldware 2010Trends In Technology:  Worldware 2010
Trends In Technology: Worldware 2010LoriThicke
 
Multi lingual corpus for machine aided translation
Multi lingual corpus for machine aided translationMulti lingual corpus for machine aided translation
Multi lingual corpus for machine aided translationAashna Phanda
 
Methods for Handling Terminology in Machine Translation
Methods for Handling Terminology in Machine TranslationMethods for Handling Terminology in Machine Translation
Methods for Handling Terminology in Machine TranslationKerstin Berns
 
Techniques in translation, computer assisted, machine translation, subtitling...
Techniques in translation, computer assisted, machine translation, subtitling...Techniques in translation, computer assisted, machine translation, subtitling...
Techniques in translation, computer assisted, machine translation, subtitling...Moses Altovar
 

Machine Translation

solutions

Dispelling the myths of machine translation

By Uwe Muegge

August 2008, pp. 22-25

It is not surprising that myths, half-truths, and misunderstandings abound regarding machine translation: It seems as if the experience most players in the translation field have with this technology does not go beyond toying a little with one of the free online translation tools. Almost every week, I come across an article informing its readers either that machine translation is and always will be a complete waste of time or that machine translation, while being a waste of time today, might actually be useful some time in the distant future. In the hope of setting the record straight, here is a closer look at some of the most common myths about machine translation.

Photo: Vasiliy Koval

Myth: Machine translation simply does not work

With free online translation services available all over the web, anyone can run a text through a machine translation (MT) engine and then share the results with the public as proof of the fact that machine translation is capable of little more than the most rudimentary rough translations (gisting), and, of course, providing nearly endless entertainment.

The main problem with these 'tests' is that using any of the free online translation environments gives only a glimpse of the true power of a full-fledged professional machine translation system. For example, the typical online translation service does not allow users to select a subject field or provide user terminology, let alone set stylistic preferences. In fact, many - if not most - of the free text translation tools support no translation parameters other than the specification of the language pair and the source text. No wonder that the translations these machine translation websites produce can be so ridiculously off target.

Figure: Example of a German>English machine translation from the author's website www.muegge.cc

Fact: Machine translation improves the productivity and consistency of human translators

Whenever new source text for a project is created, that text will have to be translated at some point. Even when you work in what is considered a state-of-the-art globalization environment, i.e. an integrated content management/translation workflow system, you will end up with a certain percentage of low match/no match sentences. In a well-planned and well-managed globalization project where writers, as well as software developers, use a comprehensive project glossary, as well as a style guide aimed at easy readability/comprehensibility, the low/no match sentences can be pre-translated in a machine translation system before being edited by human translators.

Benefits of machine-generated pre-translation:
  • Translators always have a proposal to work with instead of starting each new translation from scratch. A representative case study recently conducted at Symantec indicates that the productivity of human translators can double when unknown sentences are pre-translated in a machine translation system. [1]
  • While most translations will require some editing and many even rewriting, it is fair to expect that a considerable percentage of machine-generated translations turn out to be perfect (this is especially true for short instructions, headings, legends, and the like).
  • At a minimum, key terms will be translated correctly and consistently. And not only that, in most cases these terms will also be inflected correctly and appear in the correct singular or plural form (try to do that with your translation memory!).

Fact: Machine translation enables the translation of material that would otherwise not be translated

Very few organizations, if any, currently translate all materials that would benefit from translation into all the languages spoken by all of their current or future customers. The primary reason for this is that for many types of documents, especially in the after-sales domain, the budget is simply not available for large-scale human translation. A number of organizations are using machine translation solutions for making large volumes of text available to their global customers in their local language without involving any human translators in the process. The Microsoft Knowledge Base, which contains more than 200,000 documents in English, is a well-known example of a text repository where the number of machine-translated documents by far exceeds the number of those translated by humans.

Figure: German search page for the Microsoft Knowledge Base with machine translation option enabled

Myth: Machine translation systems can only handle word-for-word translation

The belief that machine translation is basically limited to the sequential substitution of words in the source language with words in the target language is as widely held as it is wrong. All popular machine translation systems, including the free online translation services such as systransoft.com, translate.google.com, and windowslivetranslator.com, employ highly sophisticated algorithms that are the result of years of research and development.

Fact: There is not one but many very different machine translation technologies that are all capable of producing excellent translation results in the right environment

Machine translation has been around for more than 50 years, and during this half century a wide range of MT technologies have evolved, e.g. dictionary-based, rules-based, example-based, statistical - plus countless hybrid forms. Here is a brief discussion of the three machine translation technologies that are most relevant for commercial applications today.

Rules-based Machine Translation

Rules-based machine translation, also known as transfer machine translation, is the dominant MT paradigm today. Systran, Babelfish, promt, to name just a few, are all rules-based systems. Rules-based MT systems use a three-stage translation process:
  1. Analysis: Parses the source sentence to create a tree of the syntactic structure of that sentence.
  2. Transfer: Converts the syntactic tree for the source language into the corresponding tree for the target language.
  3. Generation: Populates the target tree with corresponding words to create a sentence in the target language.

Benefits of rules-based machine translation include:
  • Mature, proven technology that can be implemented quickly and at relatively low cost.
  • Many commercial systems available covering many language combinations.
  • Highly customizable through dictionary and style settings (some systems also support the customization of the rules base).

Rules-based machine translation systems have been in use in commercial settings for many years, e.g. at Autodesk, Daimler, and the European Commission's Translation Service. The two primary challenges for rules-based MT are, first, that the rules base of any system is by necessity limited, meaning that for best results, authors need to adjust their writing style, and second, that while commercial rules-based machine translation packages are available for dozens of language combinations, many languages are still not covered.

Statistical Machine Translation

Statistical machine translation (SMT) is getting a lot of media attention these days, especially after Microsoft announced that it is using a proprietary SMT system to translate its huge Knowledge Base document repository [2] and Google won a large-scale machine translation evaluation contest with its statistical machine translation engine. [3] Statistical machine translation systems typically consist of two major components:
  • Translation Model: Generates translation proposals based on corresponding word sequences in aligned source and target training data.
  • Language Model: Selects the best translation proposal based on training data in the target language only.

The good news about statistical machine translation is that once an SMT system has been trained on customer-specific data, this is the MT technology that typically produces the highest translation quality. On the flip side, that training effort requires a substantial body of existing translations: Language Weaver, the leading vendor of statistical machine translation systems, recommends a bilingual corpus of two million words or more per language pair. Because of the demanding training requirements, combined with the fact that statistical machine translation systems tend to have a higher sticker price than some of the rules-based systems, this MT technology is primarily used by government agencies - the intelligence community in particular - and large corporations.

Direct Machine Translation

In its most primitive form, the only thing a direct machine translation system does is to replace the words in the source language with words in the target language - in the same sequence and without any linguistic analysis or processing. The only resource direct machine translation uses is a bilingual dictionary, which is why this MT technology is also known as dictionary-driven machine translation.

Due to this rather unsophisticated technology, direct machine translation has been considered obsolete for many years, and there are hardly any commercial products available that use direct MT. Despite its limited capabilities, I strongly believe that direct machine translation still has a place in today's arsenal of automated translation tools. For a number of common real-world applications, word-for-word or phrase-for-phrase substitution is all that is required for successful translation. Think of domains where both vocabulary and syntax are standardized, as is the case with weather reports, financial profiles, and many e-commerce applications.

In one recent implementation, Medtronic, a large medical device manufacturer, used direct machine translation to translate a large product database into multiple languages. [4] Human translation was not an option for this project because of cost and, yes, quality concerns (an analysis of previous human translation projects indicated an unacceptably high error rate among numeric values such as product numbers and dimensions). Also, initial tests had shown that both translation memories and rules-based machine translation systems produced poor results with text that has the following characteristics:
  – little or no repetition on the sentence level;
  – high repetition on the word/phrase level;
  – telegraphic/elliptic style, e.g. 'winds from southerly direction, speed reaching 55 km/h', 'American Technology Associates (AMTA) strong buy, Avion (AVIO) market outperform', or 'plate 2456dr15 right-angled, slotted, 15 ea'.

This type of translation project is most definitely among those that any self-respecting human translator could easily do without. And since direct machine translation does not require human post-editing in a best case scenario, using MT in this kind of environment might for once be welcomed by translators (who would hate to do these translations themselves) and translation buyers (who would love the idea of almost instant, almost free translations).

Myth: Machine translation is only for large organizations

Yes, it is true: If you read any success stories about machine translation, they typically come from the Caterpillars, Microsofts, and Symantecs of this world. But that is true for many - if not most - emerging technologies. It is also true that some of the most powerful machine translation systems in use today are the result of the multi-million dollar research and development programs only corporate giants can afford. But that does not mean you have to spend big bucks to deploy a machine translation solution.

Fact: Being both affordable and user-friendly, many machine translation packages are available for even the smallest of businesses, including freelancers

If you do a little research, you will find that many commercial machine translation packages are in the same price range as their translation memory counterparts, and that is mostly true for both workstation solutions for single users and client-server solutions for many users. And the secret is out that while corporate and small-business versions may differ in many ways, the core translation engine is typically the same in both products. In other words: In terms of out-of-the-box translation quality, there is generally little if any difference between the 1,000-dollar professional version and the 50,000-dollar corporate version of a given machine translation product.

In addition, the developers of commercial machine translation systems have invested heavily in making their products as intuitive to use as possible. In fact, I would even say that it is easier - and certainly faster - to produce your first translation with a typical MT product than it is with the typical translation memory tool.

A few more facts to consider:
  • Many low-priced machine translation products feature a built-in translation memory (TM) module to improve the efficiency of the post-editing process ('never correct the same mistake twice'), and a few MT systems like promt Expert offer seamless integration with the SDL Trados translation memory system.
  • A number of translation tools vendors, such as Across, that cater to small and mid-sized companies, offer TM-MT system bundles and/or MT integration via API.
  • User education and MT system customization (e.g. building dictionaries), which are major factors in achieving the best possible translation results, are often easier to accomplish in smaller organizations than in larger ones.

The bottom line

Since its inception, machine translation has been a highly controversial technology, and it will probably continue to be so for some time. Much of this controversy is based on false assumptions about what machine translation can do and who might benefit from using this type of technology.

Let me say it loud and clear: In general, the commercial machine translation systems available today cannot replace human translators, especially when those MT systems are operated by users who have no linguistic background. However, when the goal is to improve the efficiency of the human translation process or to create comprehensible translations in environments where human translation is not an option, and when these systems are operated by trained and motivated translation professionals, then machine translation is and has been a very powerful solution.

Sources:
  [1] Systran Software Inc. 2007. Systran Case Study: Symantec. Systran Software Inc. Web site. [Online] 2007. [Cited: June 6, 2008.] www.systransoft.com/download/case-studies/2007.12.Symantec.pdf.
  [2] Microsoft Corporation. 2008. Machine Translation - Home. Microsoft Corporation Web site. [Online] 2008. [Cited: June 6, 2008.] http://research.microsoft.com/nlp/projects/mtproj.aspx.
  [3] National Institute of Standards and Technology. 2006. NIST 2006 Machine Translation Evaluation Official Results. National Institute of Standards and Technology Web site. [Online] November 1, 2006. [Cited: June 6, 2008.] http://www.nist.gov/speech/tests/mt/2006/doc/mt06eval_official_results.html.
  [4] Muegge, Uwe. 2006. Fully Automatic High Quality Machine Translation of Restricted Text: A Case Study. London: The Association of Information Management (Aslib), 2006. Proceedings of the Twenty-eighth International Conference on Translating and the Computer. ISBN 978-0-85142-5.

contact

Uwe Muegge is the corporate terminologist at Medtronic, a manufacturer of medical technology. He serves in ISO Technical Committee 37 SC3 Computer Applications in Terminology and teaches Terminology Management and Computer-Assisted Translation at the Monterey Institute of International Studies in Monterey, California.

info@muegge.cc
www.muegge.cc