SDL Proprietary and Confidential
Machine Translation:
Latest Innovations and their
Impact on Commercial
Translation
SDL Customer Success Summit Montreal
Rodrigo Fuentes Corradi, MT Business Consultant
June, 2015
2
Agenda
○ Evolution of MT
○ Common MT Use-Cases
○ Engine Training
○ Introducing SDL XMT
○ How to Deploy MT
○ MT and the Post-Editor
Evolution of MT
4
Brief history of Machine Translation
[Timeline: 1950s, 2002, 2010, 2011, 2014, 2015]
○ War-time cryptography requirements, with subsequent experiments & investment in automated translation
○ SDL acquires RBMT engine … establishes MT group dedicated to improving quality for enterprise applications
○ SDL acquires Language Weaver / BeGlobal Statistical Machine Translation (SMT)
○ First SDL Post-Editing projects using SMT go into production
○ Post-Editing booms: 4-fold increase
○ SDL launches PE Certification Program
○ SDL launches XMT next-generation MT platform
5
Overview: The SDL MT Team
Who we are
First to commercialize Statistical Machine Translation
o 50+ Professionals
o Over 10 Nationalities
o Across 5 Time Zones
o 8 Locations
Widespread team of language lovers:
o Ex-translators
o Computational Linguists
o Project Managers
o Data Specialists
o Post-Editors
o Architects
…all gathered from the four corners of SDL!
What we do
Drive MT Adoption:
Educate, promote and support MT usage in existing SDL accounts & new opportunities
Custom Engine Builds:
Design, create, test, implement and monitor custom Statistical Machine Translation engines
Linguistic Projects:
Semantic annotation projects for US Government bodies & academic institutes
How we do it
Two Research Labs:
o Los Angeles, CA
o Cambridge, UK
o 100s of Scientific Publications
o Over 50 Patents Approved or Filed
We’re Evangelists … about Machine Translation, using automation to accelerate productivity
Common MT Use-Cases
7
○ Communication Channels
○ Consumer Preferences
○ Increased Global Competition
○ Export Market Growth
8
Right translation method, right price, right time
[Chart: content types plotted by Quality and Volume, with Human Translation, Post-Edit and Machine Translation as translation methods]
Content types shown: Blogs, User Forums, Reviews, Chat, Email, Support, FAQ, Websites, Wikis, Knowledge Base, Alerts/Notifications, Help, User Guides, Documentation, Newsletters, Advertising Content, Legal
9
Translator productivity
Description:
○ Direct access to machine translation from SDL Trados Studio
Benefits:
○ Improve the efficiency of translators by providing machine translation results for segments that do not match entries in the translation memory
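The fallback described on this slide (use the translation memory when it has a close enough match, otherwise offer machine translation) can be sketched as follows. This is a minimal illustration, not SDL Trados Studio's actual matching logic: the function names, the 0.75 threshold, the `difflib` similarity measure and the `fake_mt` stand-in are all assumptions made for the example.

```python
from difflib import SequenceMatcher


def best_tm_match(segment, tm):
    """Return (score, target) for the closest TM source, or (0.0, None) if the TM is empty."""
    best_score, best_target = 0.0, None
    for source, target in tm.items():
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score > best_score:
            best_score, best_target = score, target
    return best_score, best_target


def suggest(segment, tm, machine_translate, threshold=0.75):
    """Offer the TM match when it is close enough; otherwise fall back to MT.

    The threshold is a hypothetical value chosen for the example.
    """
    score, target = best_tm_match(segment, tm)
    if target is not None and score >= threshold:
        return {"origin": "TM", "score": round(score, 2), "text": target}
    return {"origin": "MT", "score": None, "text": machine_translate(segment)}


def fake_mt(segment):
    """Hypothetical stand-in for a call to an MT engine."""
    return f"[MT] {segment}"


tm = {
    "Save the file.": "Enregistrez le fichier.",
    "Open the file.": "Ouvrez le fichier.",
}
exact = suggest("Save the file.", tm, fake_mt)
unmatched = suggest("Restart the printer before installing the driver.", tm, fake_mt)
print(exact["origin"], unmatched["origin"])
```

In either case the translator still reviews the suggestion; the origin label only tells them whether they are looking at a stored human translation or raw MT output.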
10
Live chat translation
Description:
○ Real-time translation of web-based chat conversations
Benefits:
o Reduces the cost of staffing support/sales operations, as they do not need multilingual agents
o Customer acquisition rates and satisfaction are much higher if you engage the customer in chat
11
Community forum translation
Description:
○ Translation of user-generated content in web-based community forums
Benefits:
o Enable interactions between customers who speak different languages
o Leverage community expertise across languages instead of only within the language of community experts
12
Knowledge base content translation
Description:
○ Translation of knowledge base content for local language customers of technical solutions
Benefits:
o Reduces customer support costs and activity level by allowing remote language customers to directly access solutions
o Increases customer satisfaction by providing solutions in their native language
13
Case study: MT for online customer reviews
Requirements:
o Share customer reviews with
international audiences
o Automate the translation of customer
reviews into 13 languages
Results:
o Reduced bounce rate from 70% to 25%
o Increased user dwell times and page views
o Economically translate 2 billion words/month
14
Case study: MT for instant MS Office translation
[a large global
retail client]
Requirements:
o Improve communication among
geographically scattered company
employees
o Fast, low-cost translation of MS Outlook
emails & MS Office business documents
Results:
o BeGlobal Machine Translation integrated
via API with MS Office apps
o Any employee can instantly translate emails
or attachments with a simple double-click
15
Engine training: Making MT smarter
Customized engines
Domain verticals
Baselines
16
Baselines
○ Core generic MT engines for each language pair
○ Data mined from reliable sources available in the public domain, covering various subjects
○ Contain hundreds of millions of words of bilingual data (100Ms+)
○ Work well for general & varied content
○ Can be used as backup for verticals & customized engines
17
Domain verticals
○ Trained statistical engines exclusive to a domain
○ Data selected from sources within a domain or industry
○ MT output more likely to follow technical terminology
○ Solution used when client-specific data is not available or not enough for a customization
18
Customized engines
○ Optimize the MT output for specific client projects
○ Training based on client-specific bilingual data
○ More data usually has a positive effect on the MT output
○ Quality & consistency of data is as important as quantity
○ Adherence to client-specific terminology & style
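The three engine tiers suggest a simple selection order: a customized engine when one exists for the client, a domain vertical when client data is missing or insufficient, and the baseline as backup. The sketch below illustrates that order only. The `Engine` record, its fields and `select_engine` are hypothetical names invented for this example and do not describe an SDL API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Engine:
    """Hypothetical engine record; fields are assumptions for illustration."""
    tier: str            # "customized", "vertical" or "baseline"
    language_pair: str   # e.g. "en-fr"
    domain: str = ""     # set for vertical engines
    client: str = ""     # set for customized engines


def select_engine(engines, language_pair, client="", domain=""):
    """Pick the most specific engine available for a language pair."""
    candidates = [e for e in engines if e.language_pair == language_pair]
    # 1. Customized engine trained on this client's bilingual data.
    for engine in candidates:
        if engine.tier == "customized" and client and engine.client == client:
            return engine
    # 2. Domain vertical, used when client-specific data is unavailable or too small.
    for engine in candidates:
        if engine.tier == "vertical" and domain and engine.domain == domain:
            return engine
    # 3. Baseline engine as backup.
    for engine in candidates:
        if engine.tier == "baseline":
            return engine
    return None


engines = [
    Engine("baseline", "en-fr"),
    Engine("vertical", "en-fr", domain="automotive"),
    Engine("customized", "en-fr", client="acme"),
    Engine("baseline", "en-de"),
]
for_client = select_engine(engines, "en-fr", client="acme", domain="automotive")
for_domain = select_engine(engines, "en-fr", client="other", domain="automotive")
for_generic = select_engine(engines, "en-de", client="acme", domain="automotive")
print(for_client.tier, for_domain.tier, for_generic.tier)
```

Returning `None` for an unsupported language pair leaves the caller to decide whether to route the content to conventional translation instead.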
19
How SDL trains an MT engine
[Workflow diagram; phases: Content Evaluation, MT Customization, Production, QA]
Steps shown:
o Training Data Prep & Engine Customization
o Prep of Testing Material
o Evaluate MT Output
o Refine Training or Deploy for Production
o Integrate MT on Translation Process
o Source Content
o Apply Translation Memory
o Machine Translation
o Post-Edit
o Quality Assessment & Translation Delivery
o Update Translation Memory
Components shown: SDL MT Server, Translation Memory
20
Continuous improvement
○ SDL MT Group developers are constantly researching ways to improve Generic, Vertical, and Customized MT Engines
○ SDL Research Scientists are continuously improving the Statistical Machine Translation algorithms (e.g. Language Models, Translation Models, Reordering Models, Syntax, Transliteration, Rule-Based Components, etc.)
○ SDL Data Engineers are continuously mining large amounts of good data used by the statistical algorithms
21
Introducing SDL XMT…
A new, modular & flexible technology that will power the “next generation” of SDL MT
o 2002: Word-based Machine Translation
o 2003: Phrase-based Machine Translation
o 2008: Syntax-based Machine Translation
o 2015: XMT
22
Legacy MT
Foreign Language → Legacy MT (Monolithic Phrase-based) → Your Language
23
SDL XMT: Next generation technology, higher quality
Foreign Language → XMT → Your Language
○ Modular & Flexible
○ “State-of-the-Art” Machine Learning
○ Better Translation Quality
○ Rapid Research Transition
Modular components:
o Neural Networks
o Compound Splitting
o Phrase-Based
o Finite State Automata
o String to Tree
o Rule-Based
o Tree to String
o Pre-Ordering
o Transliteration
o Hidden Markov Model
o Hypergraphs
o …
24
Language Learning in XMT
Continuous improvement by learning from Post-Editing.
Machine Translation
○ The machine learns how to translate from source to target during the training process
○ The machine does not learn during the translation process
Machine Translation + Language Learning
○ The machine learns how to translate from source to target during the training process
○ The machine learns & improves seamlessly, continuously, and in real-time from user feedback during the translation process
○ See it in action: SDL XMT
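The contrast between the two modes can be illustrated with a toy example. The sketch below is not how XMT works internally: it uses a plain lookup table of counted translations as a stand-in for a trained model, and the class names, the `feedback` method and the `weight` parameter are assumptions made for illustration.

```python
from collections import Counter, defaultdict


class StaticEngine:
    """Learns only at training time; translation never changes the model."""

    def __init__(self, training_pairs):
        self.counts = defaultdict(Counter)
        for source, target in training_pairs:
            self.counts[source][target] += 1

    def translate(self, source):
        options = self.counts.get(source)
        if not options:
            return source  # unknown input is passed through unchanged
        return options.most_common(1)[0][0]


class AdaptiveEngine(StaticEngine):
    """Also updates its counts from post-edits received during translation."""

    def feedback(self, source, post_edited, weight=2):
        # Hypothetical update rule: a post-edit counts more than one training example.
        self.counts[source][post_edited] += weight


training = [("reset password", "réinitialiser mot de passe")]
static, adaptive = StaticEngine(training), AdaptiveEngine(training)

# A post-editor corrects the output once; only the adaptive engine can use it.
adaptive.feedback("reset password", "réinitialiser le mot de passe")

static_out = static.translate("reset password")
adaptive_out = adaptive.translate("reset password")
print(static_out)
print(adaptive_out)
```

The static engine would need a retraining cycle to pick up the same correction, which is the difference the two columns above describe.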
How to Deploy
MT Post-Edit
26
Post-Editing experience in Montreal
Quality is delivered & owned by SDL, so commitment to quality remains our number 1 priority!
o Cost reductions of up to 40% vs. conventional translation
o File formats received: TXT, XL, and XML
o Unique client-specific process developed in collaboration with engineering & IT teams from SDL & customers
[Chart: SDL Canada, words post-edited for large retail customers' e-commerce sites in 2013, 2014 and 2015 (forecasted); values shown: 10M, 15M and 25M words]
27
Quality in MT
○ Building blocks are there, as a lot of content is pulled from the engines
○ Allows the linguist to focus on refining the output
○ Custom engines pull in client terminology & style
○ Fewer resources equal greater consistency
○ Trained, certified linguists well-versed in handling MT output
28
Post-Editing quality requirements
When post-editing to publishable quality, the following basic principles still apply:
o The same references must be used as for conventional translation (project-specific guidelines, TMs, glossaries, termbases, etc.)
o Grammar, spelling and punctuation must be correct
o Appropriate style & correct terminology must be used consistently
o The translation must read well and be suitable for its intended purpose
Customer
User Guide
29
Features to watch out for in SMT output…
o Incorrect Formatting
o Additional or Missing Words
o Words Not Localized or Wrong Flavor
o Gender, Number, Agreement or Verb Inflection Issues
o Articles & Prepositions
o Syntax & Word Order Issues
o Wrong Punctuation
o Inconsistent or Non-compliant Terminology
o Mistranslations
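A few of these issue types lend themselves to simple automated checks that flag segments for the post-editor's attention. The sketch below covers only three of them (additional or missing words, wrong punctuation, non-compliant terminology) with deliberately crude heuristics. The function name, the ratio limits and the glossary format are assumptions for illustration, and none of this replaces linguistic review.

```python
def check_segment(source, mt_output, glossary, min_ratio=0.5, max_ratio=2.0):
    """Return a list of warnings for one source/MT segment pair.

    The ratio limits are hypothetical defaults; real limits depend on the language pair.
    """
    issues = []

    # Additional or missing words: compare word counts between source and output.
    src_words, out_words = source.split(), mt_output.split()
    ratio = len(out_words) / max(len(src_words), 1)
    if ratio < min_ratio or ratio > max_ratio:
        issues.append("length: possible additional or missing words")

    # Wrong punctuation: the final mark should normally survive translation.
    src_end = source.rstrip()[-1:]
    if src_end and src_end in ".!?:" and mt_output.rstrip()[-1:] != src_end:
        issues.append("punctuation: final mark differs from source")

    # Non-compliant terminology: glossary maps a source term to its required target term.
    for src_term, required in glossary.items():
        if src_term.lower() in source.lower() and required.lower() not in mt_output.lower():
            issues.append(f"terminology: expected '{required}' for '{src_term}'")

    return issues


glossary = {"Save": "Enregistrer"}
source = "Click Save to store the file."
flagged = check_segment(source, "Cliquez sur Sauvegarder pour stocker le fichier", glossary)
clean = check_segment(source, "Cliquez sur Enregistrer pour stocker le fichier.", glossary)
print(flagged)
print(clean)
```

Issues such as agreement, word order and mistranslation are not covered here, since they generally need a trained post-editor rather than string comparison.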
30
Post-Editing Machine Translation certification
○ The demand for MT solutions is growing quickly & Post-Editing is becoming a mainstream skill for translators
○ In response, SDL have created Post-Editing Certification, released in June 2014
○ 85% of in-house staff completed the Certification in 2014
○ 2,500+ freelancers signed up for the course
○ The Certification covers the theory behind Machine Translation as well as practical approaches to Post-Editing
○ Our Certification is for anyone impacted by Post-Editing: certified translators can offer an extended skill set
31
SDL iMT: Key steps in the process
○ Evaluate content and translation assets
○ Train MT engines for your content or use existing solution
○ Configure the trained MT engines with SDL’s translation environment
(TMS, WS, Studio)
○ Post-edit the MT output to full publishable quality
○ SDL infrastructure to support these steps
Evaluate Train MT Configure Post-Edit
SDL Infrastructure
Copyright © 2008-2015 SDL plc. All rights reserved. All company names, brand names, trademarks,
service marks, images and logos are the property of their respective owners.
This presentation and its content are SDL confidential unless otherwise specified, and may not be
copied, used or distributed except as authorised by SDL.
Global Customer Experience Management

DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 

Machine Translation: Latest Innovations and their Impact on Commercial Translation

  • 1. Machine Translation: Latest Innovations and their Impact on Commercial Translation. SDL Customer Success Summit Montreal. Rodrigo Fuentes Corradi, MT Business Consultant. June 2015. SDL Proprietary and Confidential.
  • 2. Agenda ○ Evolution of MT ○ Common MT Use-Cases ○ Engine Training ○ Introducing SDL XMT ○ How to Deploy MT ○ MT and the Post-Editor
  • 4. Brief history of Machine Translation. Timeline markers: 1950s, 2002, 2010, 2011, 2014, 2015. Events shown: War-time cryptography requirements, with subsequent experiments & investment in automated translation; SDL acquires RBMT engine…establishes MT group dedicated to improving quality for enterprise applications; SDL acquires Language Weaver / BeGlobal Statistical Machine Translation (SMT); First SDL Post-Editing projects using SMT go into production; Post-Editing booms: 4-fold increase; SDL launches PE Certification Program; SDL launches XMT next-generation MT platform
  • 5. Overview: The SDL MT Team. Who we are: First to commercialize Statistical Machine Translation o 50+ Professionals o Over 10 Nationalities o Across 5 Time Zones o 8 Locations. Widespread team of language lovers: o Ex-translators o Computational Linguists o Project Managers o Data Specialists o Post-Editors o Architects …all gathered from the four corners of SDL! What we do: Drive MT Adoption: Educate, promote and support MT usage in existing SDL accounts & new opportunities. Custom Engine Builds: o Design o Create o Test o Implement o Monitor …custom Statistical Machine Translation engines. Linguistic Projects: Semantic annotation projects for US Government bodies & academic institutes. How we do it: Two Research Labs: o Los Angeles, CA o Cambridge, UK o 100s of Scientific Publications o Over 50 Patents Approved or Filed. We’re Evangelists…about Machine Translation, using automation to accelerate productivity
  • 8. Right translation method, right price, right time. Chart axes: Quality, Volume. Methods: Human Translation, Post-Edit, Machine Translation. Content types plotted: Blogs User Forums Reviews Chat Email Support FAQ Websites Wikis Knowledge Base Alerts/Notifications Help User Guides Documentation Newsletters Advertising Content Legal
  • 9. Translator productivity. Description: ○ Direct access to machine translation from SDL Trados Studio Benefits: ○ Improves translator efficiency by supplying machine translation results for segments that do not match entries in the translation memory
  • 10. Live chat translation. Description: ○ Real-time translation of web-based chat conversations Benefits: o Reduces the cost of staffing support and sales operations, as they do not need multilingual agents o Customer acquisition rates and satisfaction are much higher if you engage the customer in chat
  • 11. Community forum translation. Description: ○ Translation of user-generated content in web-based community forums Benefits: o Enable interactions between customers who speak different languages o Leverage community expertise across languages instead of only within the language of community experts
  • 12. Knowledge base content translation. Description: ○ Translation of knowledge base content for local-language customers of technical solutions Benefits: o Reduces customer support costs and activity levels by allowing customers in other languages to access solutions directly o Increases customer satisfaction by providing solutions in their native language
  • 13. Case study: MT for online customer reviews. Requirements: o Share customer reviews with international audiences o Automate the translation of customer reviews into 13 languages Results: o Reduced bounce rate from 70% to 25% o Increased user dwell times and page views o Economical translation of 2 billion words/month
  • 14. Case study: MT for instant MS Office translation [a large global retail client]. Requirements: o Improve communication among geographically scattered company employees o Fast, low-cost translation of MS Outlook emails & MS Office business documents Results: o BeGlobal Machine Translation integrated via API with MS Office apps o Any employee can instantly translate emails or attachments with a simple double-click
  • 15. Engine training: Making MT smarter. Engine types shown: Customized engines; Domain verticals; Baselines
  • 16. Baselines. Data mined from reliable sources available in the public domain, covering various subjects. Core generic MT engines for each language pair. Work well for general & varied content. Can be used as backup for verticals & customized engines. Contain hundreds of millions of words of bilingual data (100Ms+).
  • 17. Domain verticals. Statistical engines trained exclusively for one domain. Data selected from sources within a domain or industry. MT output more likely to follow technical terminology. Used when client-specific data is unavailable or insufficient for a customization.
  • 18. Customized engines. Optimize the MT output for specific client projects. Training based on client-specific bilingual data. More data usually has a positive effect on the MT output. Quality & consistency of data are as important as quantity. Adherence to client-specific terminology & style.
  • 19. How SDL trains an MT engine. Workflow diagram labels. Phases: Content Evaluation; MT Customization; Production; QA. Steps: Training Data Prep & Engine Customization; Prep of Testing Material; Evaluate MT Output; Refine Training or Deploy for Production; Integrate MT on Translation Process; Source Content; Apply Translation Memory; Machine Translation; Post-Edit; Quality Assessment & Translation Delivery; Update Translation Memory. Components: SDL MT Server; Translation Memory
  • 20. Continuous improvement. SDL MT Group developers are constantly researching ways to improve Generic, Vertical, and Customized MT Engines. SDL Research Scientists are continuously improving the Statistical Machine Translation algorithms (e.g. Language Models, Translation Models, Reordering Models, Syntax, Transliteration, Rule-Based Components). SDL Data Engineers are continuously mining large amounts of good data used by the statistical algorithms.
  • 21. Introducing SDL XMT… A NEW, modular & flexible technology that will power the “next generation” of SDL MT. Timeline labels: Word-based Machine Translation; Phrase-based Machine Translation; Syntax-based Machine Translation; XMT. Years shown: 2002, 2003, 2008, 2015
  • 23. SDL XMT: Next generation technology, higher quality. Modular components: Neural Networks; Compound Splitting; Phrase-Based; Finite State Automata; String to Tree; Rule-Based; Tree to String; Pre-Ordering; Transliteration; Hidden Markov Model; Hyper Graphs; … Callouts: Modular & Flexible; “State-of-the-Art” Machine Learning; Better Translation Quality; Rapid Research Transition. Diagram labels: Foreign Language; XMT; Your Language
  • 24. Language Learning in XMT. Continuous improvement by learning from Post-Editing. Machine Translation: ○ The machine learns how to translate from source to target during the training process ○ The machine does not learn during the translation process. Machine Translation + Language Learning: ○ The machine learns how to translate from source to target during the training process ○ The machine learns & improves seamlessly, continuously, and in real-time from user feedback during the translation process ○ See it in action: SDL XMT
  • 25. How to Deploy MT Post-Edit
  • 26. Post-Editing experience in Montreal. Quality delivered & owned by SDL, therefore commitment to quality remains our number 1 priority! o Cost reductions of up to 40% vs. conventional translation o File formats received: TXT, XL, and XML o Unique client-specific process developed in collaboration with engineering & IT teams from SDL & customers. Chart labels: SDL Canada Post-Editing; Large Retail Customers e-Commerce Sites; Post-Edited; Post-Edited (Forecasted); 2013, 2014, 2015; 25M, 10M, 15M Words; 40%
  • 27. Quality in MT. Building blocks are there as a lot of content is pulled from the engines. Allows the linguist to focus on refining the output. Custom engines pull in client terminology & style. Fewer resources equals greater consistency. Trained linguists well-versed in handling MT output & certified.
  • 28. Post-Editing quality requirements. When post-editing to publishable quality, the following basic principles still apply: o The same references must be used as for conventional translation (project-specific guidelines, TMs, glossaries, termbases, etc.) o Grammar, spelling and punctuation must be correct o Appropriate style & correct terminology must be used consistently o The translation must read well and be suitable for its intended purpose. Image label: Customer User Guide
  • 29. Features to watch out for in SMT output… Incorrect Formatting; Additional or Missing Words; Words Not Localized or Wrong Flavor; Gender, Number, Agreement or Verb Inflection Issues; Articles & Prepositions; Syntax & Word Order Issues; Wrong Punctuation; Inconsistent or Non-compliant Terminology; Mistranslations
  • 30. Post-Editing Machine Translation certification ○ The demand for MT solutions is growing quickly & Post-Editing is becoming a mainstream skill for translators ○ In response, SDL have created Post-Editing Certification, released in June 2014 ○ 85% of in-house staff completed the Certification in 2014 ○ 2,500+ freelancers signed up for the course ○ The Certification covers the theory behind Machine Translation as well as practical approaches to Post-Editing ○ Our Certification is for anyone impacted by Post-Editing; certified translators can offer an extended skill set
  • 31. SDL iMT: Key steps in the process ○ Evaluate content and translation assets ○ Train MT engines for your content or use an existing solution ○ Configure the trained MT engines with SDL’s translation environment (TMS, WS, Studio) ○ Post-edit the MT output to full publishable quality ○ SDL infrastructure to support these steps. Diagram labels: Evaluate; Train MT; Configure; Post-Edit; SDL Infrastructure
  • 33. Copyright © 2008-2015 SDL plc. All rights reserved. All company names, brand names, trademarks, service marks, images and logos are the property of their respective owners. This presentation and its content are SDL confidential unless otherwise specified, and may not be copied, used or distributed except as authorised by SDL. Global Customer Experience Management

Editor's notes

  1. Machine Translation is not the right translation method for all types of content.
  2. Building on over 16 years of Machine Translation leadership, SDL has developed the next-generation machine translation platform: SDL XMT. While the legacy SDL MT platform was designed as a monolithic phrase-based system, SDL XMT has a unique modular design, which allows the rapid development and integration of special-purpose modules that can address specific challenges.
  3. SDL XMT applies different translation algorithms depending on which produces the best translation quality for a given language pair. You will begin to see much higher language quality, particularly for language pairs such as English to Japanese and English to Chinese.
  4. See Steve DeNeef’s demo video for Language Learning on this page: http://www.sdl.com/cxc/language/machine-translation/xmt.html