TAUS Global Content Summit Amsterdam 2019 / Beyond MT. A few premature reflections on the use of AI in translation. By Dieter Rummel (Head of Informatics, DGT European Commission)
1. Beyond MT?
A few premature reflections on the use of AI in translation
TAUS Global Content Summit Amsterdam, 6 March 2019
Dieter Rummel, EC, Directorate General for Translation
3. Main document types
[Pie chart, 2015. Slice labels as extracted, not matched to the legend below: 38, 16%, 14%, 6%, 1%, 11%, 2%, 2%, 5%, 2%, 3%]
1 EU law, including the legislative process
2 Guardian of the Treaties/Implementation of EU law
3 Correspondence
4 Political documents
5 Relations with other EU institutions
6 Communication, web, media, publications
7 Budget, budgetary procedure
8 Documents linked to international organisations and non-EU countries
9 Notices for publication in OJ
10 Commission working or internal documents
11 Other
4. Evolution 2012-2018: Number of translated pages and number of DGT staff
[Chart: pages per year (axis 0 to 2,500,000) and staff (axis 2200 to 2600), 2012 to 2018]
5. Context
Long-standing use of language technology + CAT tools
"More (better) with less"
More complexity, new formats, new ways of working
Stronger recourse to outsourcing
Shift from documents to content
Machine Translation as integral part of the resource mix
6. Machine translation at DGT
EC Systran/ECMT: Rule-based MT, ca. 1976 to 2010
MT@EC: Statistical MT, Moses Decoder, 2013 - 2018
eTranslation: Neural MT, Connecting Europe Facility (CEF), from 2018
9. Buzz kill – or why I hate “AI”
• Beware of the images
• Neural MT vs. Recursive hetero-associative memories for translation
• Artificial intelligence is not about intelligence
• Neural networks have little to do with actual neurons
• Big data + neurons + deep learning + magic = Amazing stuff happens!
• Do we really have big(-ish) data?
• Believe the hype - but in moderation
• Technology is not a solution
• Poor processes don’t get better through AI
• Doing the same and expecting different results = insanity
10. So, this had to be said. But it’s pretty cool anyway.
• The technology has become accessible.
• “Big data” discussions have shown the possibilities of correlating data from different sources.
• New ways of transforming data into usable information?
Describe: What is happening?
Diagnose: Why did it happen?
Predict: What will happen?
Decide: What should I do?
11. Big data? - Big Questions!
What we translate
• What is the document/content about?
• Is the document difficult, i.e. demanding or complex?
• Are we working on something similar?
• Do we have reliable resources for this document?
• How well will MT work for this document?
Organising work
• How should this content be best translated?
• Who is most suitable to translate/revise the document?
• How should the content be split between several translators (= meaningful clustering)?
• What is our capacity to translate?
• Are there meaningful alternatives to the existing forecasting model?
External service providers
• How good is the contractor’s work?
• How confident are we that they will deliver good quality?
• How reliable are they?
• Can we correlate freelancer/agency, history of evaluations, domain, document type, document complexity to calculate a “reliability indicator” that could support outsourcing decisions?
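One way to picture the “reliability indicator” question is as a weighted average of past evaluation marks. The sketch below is purely illustrative and not DGT's method: the `Evaluation` record, the 0 to 1 score scale, the extra weight for matching domain or document type, and the neutral prior of 0.5 are all assumptions, and document complexity is left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class Evaluation:
    """One past evaluation of a contractor (hypothetical record layout)."""
    contractor: str
    domain: str
    doc_type: str
    score: float  # past quality mark, assumed to be scaled to 0..1


def reliability_indicator(history, contractor, domain, doc_type,
                          prior=0.5, prior_weight=2.0):
    """Weighted mean of past scores, pulled toward `prior` when history is thin."""
    weighted_sum = prior * prior_weight
    total_weight = prior_weight
    for ev in history:
        if ev.contractor != contractor:
            continue
        # Assumed weighting: evaluations on the same domain or the same
        # document type as the new job count more than unrelated ones.
        weight = 1.0
        if ev.domain == domain:
            weight += 1.0
        if ev.doc_type == doc_type:
            weight += 1.0
        weighted_sum += weight * ev.score
        total_weight += weight
    return weighted_sum / total_weight


# Made-up example data, for illustration only.
history = [
    Evaluation("agency_a", "legal", "legislation", 0.9),
    Evaluation("agency_a", "legal", "correspondence", 0.8),
    Evaluation("agency_a", "budget", "report", 0.4),
    Evaluation("agency_b", "legal", "legislation", 0.6),
]

print(reliability_indicator(history, "agency_a", "legal", "legislation"))
print(reliability_indicator(history, "agency_c", "legal", "legislation"))
```

A contractor with no history simply gets the prior, which keeps the indicator from overstating confidence on little data; whether such a score is meaningful at all is exactly the kind of assumption the slides suggest validating or rejecting cheaply.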
12. More Big Questions!
Quality
• How good is a given translation?
• How good are our language resources?
• Can we automatically detect technically and linguistically poor or suspect content?
• How can we learn from mistakes?
Customers
• What are the common issues in source documents?
• What do they have in common?
• Do we have the linguistic resources to handle their documents?
• What are their request patterns?
13. What next?
Incubate! Experiment
• Multi-disciplinary
• Explore use cases and questions
• Break silos
• Validate or reject ideas and assumptions in a cost-effective way
Create understanding and capacity
• Training (also for managers!)
• Learn what we do not know
• Develop skills
Think about Data
• Translation memories
• Terminology
• XLIFF
• “Bad data”
• Missing data