Machine translation evaluation
Hermes Traducciones y Servicios Lingüísticos
MT at Hermes
• Pure RBMT engines with pre- and post-processing macros.
• Texts from technical domains.
• The applied-technology department has been working on MT engines for over a year.
• Over 250,000 words post-edited with internal engines in the last year.
• Average new-word count for projects post-edited with internal engines: 9,000 words.
Our purpose with MT evals
Automated metrics might help us:
• predict post-editing (PE) time and productivity gains;
• negotiate reasonable discounts;
• evaluate the quality of engines;
• measure the performance of the applied-technology department;
• avoid depending on human-reported data.
What we hoped to find
• We hoped some metric would correlate with the productivity-gain data provided by post-editors.
• We gathered BLEU, F-Measure, METEOR and TER values.
• Ideally, we would end up relying on automated metrics rather than on time and productivity measurements reported by post-editors.
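The check we hoped to run can be sketched as a Pearson correlation between a metric and the productivity gain reported per project. This is only an illustration: the function name `pearson` and every number below are hypothetical placeholders, not Hermes data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-project values, for illustration only.
bleu_scores = [45.0, 52.0, 58.0, 63.0, 70.0]
productivity_gain_pct = [20.0, 35.0, 50.0, 65.0, 80.0]

r = pearson(bleu_scores, productivity_gain_pct)
print(f"Pearson r = {r:.2f}")
```

A coefficient close to 1 (or close to -1 for TER, where lower scores are better) would have supported using the metric as a stand-in for reported productivity.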
What we hoped to find
[Figure: chart of the hoped-for result, with Productivity gain % on the horizontal axis (0 to 120) and a vertical axis running from 0 to 120.]
What we actually found: No correlation
[Figure: BLEU, F-Measure, TER and METEOR scores (vertical axis, 0 to 100) plotted against Productivity gain % (horizontal axis, 0 to 160).]
Reasons for the variability
• Different CAT environments (Trados Studio, memoQ, Idiom, TagEditor, etc.).
• Different engines (per domain, per client, etc.).
• Different clients, different needs.
• Different post-editors.
  • Or, for the same post-editor, different post-editing skills over time.
• Different word volumes.
• Specific productivity- or consistency-enhancement processing can affect metrics negatively.
Productivity-enhancement example
• Source: Add events as described in Adding Events to a Model.
• PE: Agregue los eventos como se describe en Adición de eventos a un modelo.
• Raw 1: Agregue los eventos como se describe en la adición de los eventos a un modelo.
• Raw 2: Agregue los eventos como se describe en Adding Events to a Model.
• Scores:

        Raw 1   Raw 2
BLEU    68.59   53.33
TER     17.65   29.41

Metrics for Raw 1 are significantly better, but Raw 2 is faster to post-edit thanks to automatic terminology insertion tools (such as Xbench).
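To make the gap between edit counts and actual effort concrete, here is a minimal sketch that counts word-level edits between each raw output and the post-edited sentence. It is a plain Levenshtein distance over whitespace-separated, case-sensitive tokens, not a full TER implementation (real TER also allows block shifts and normalizes by reference length), and the function name `word_edit_distance` is our own.

```python
def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    h, r = hyp.split(), ref.split()
    prev = list(range(len(r) + 1))  # distance from empty hypothesis to ref prefixes
    for i, hw in enumerate(h, start=1):
        curr = [i]
        for j, rw in enumerate(r, start=1):
            cost = 0 if hw == rw else 1
            curr.append(min(prev[j] + 1,          # delete hw
                            curr[j - 1] + 1,      # insert rw
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


pe = "Agregue los eventos como se describe en Adición de eventos a un modelo."
raw1 = "Agregue los eventos como se describe en la adición de los eventos a un modelo."
raw2 = "Agregue los eventos como se describe en Adding Events to a Model."

edits_raw1 = word_edit_distance(raw1, pe)
edits_raw2 = word_edit_distance(raw2, pe)
print(edits_raw1, edits_raw2)  # prints "3 5"
```

Raw 1 needs fewer word edits, yet they are scattered manual fixes, whereas the five edits in Raw 2 all fall inside one untranslated title that a terminology tool can replace in a single step. An edit count alone cannot see that difference.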
Human evaluation
• Adequacy: How much of the meaning expressed in the gold-standard translation or the source is also expressed in the target translation?
    4. Everything
    3. Most
    2. Little
    1. None
• Fluency: To what extent is the target translation grammatically well formed, free of spelling errors, and experienced by a native speaker as natural, intuitive language?
    4. Flawless
    3. Good
    2. Dis-fluent
    1. Incomprehensible
Source: TAUS MT evaluation guidelines
https://evaluation.taus.net/resources/adequacy-fluency-guidelines
Conclusions
• We combine automated metrics with time/productivity data reported by post-editors for the final evaluation of internal MT performance.
• Poor post-editing skills or any project-specific contingency can be counterbalanced with good automated metrics.
• We look for qualitative information in automated metrics, not quantitative.
• BLEU values of 65 and 70 for two different engines tell us that both are good engines, not that one will render 5% better results than the other.

Más contenido relacionado

La actualidad más candente

Overview of Multidimensional Quality Metrics (QTLaunchPad)
Overview of Multidimensional Quality Metrics (QTLaunchPad)Overview of Multidimensional Quality Metrics (QTLaunchPad)
Overview of Multidimensional Quality Metrics (QTLaunchPad)Arle Lommel
 
Defining Translation Quality in ASTM
Defining Translation Quality in ASTMDefining Translation Quality in ASTM
Defining Translation Quality in ASTMSerge Gladkoff
 
TAUS Quality Dashboard and the integration of DQF in translation technologies...
TAUS Quality Dashboard and the integration of DQF in translation technologies...TAUS Quality Dashboard and the integration of DQF in translation technologies...
TAUS Quality Dashboard and the integration of DQF in translation technologies...TAUS - The Language Data Network
 
High Volume, Rapid Turn Around Localization: Lessons Learned
High Volume, Rapid Turn Around Localization: Lessons LearnedHigh Volume, Rapid Turn Around Localization: Lessons Learned
High Volume, Rapid Turn Around Localization: Lessons LearnedSDL
 
Top Trans Survey Translation Issues
Top Trans Survey Translation IssuesTop Trans Survey Translation Issues
Top Trans Survey Translation IssuesRaya Wasser
 
The Latest Advances in Patent Machine Translation
The Latest Advances in Patent Machine TranslationThe Latest Advances in Patent Machine Translation
The Latest Advances in Patent Machine TranslationIconic Translation Machines
 
How Technology Has Changed the World of Technical Translation
How Technology Has Changed the World of Technical TranslationHow Technology Has Changed the World of Technical Translation
How Technology Has Changed the World of Technical TranslationTennycut
 
Technical_translation_is_it_really_about_terminology_en
Technical_translation_is_it_really_about_terminology_enTechnical_translation_is_it_really_about_terminology_en
Technical_translation_is_it_really_about_terminology_enVyacheslav Guzovsky
 
Keys to successful technical translation
Keys to successful technical translationKeys to successful technical translation
Keys to successful technical translationTrue Language
 
Good Applications of Bad Machine Translation
Good Applications of Bad Machine TranslationGood Applications of Bad Machine Translation
Good Applications of Bad Machine Translationbdonaldson
 
Language translator internship report
Language translator internship reportLanguage translator internship report
Language translator internship reportSumitSumit26
 
Technical translation (1)
Technical translation (1)Technical translation (1)
Technical translation (1)Brian Cannon
 
Panel: Translation Quality Challenges
Panel: Translation Quality ChallengesPanel: Translation Quality Challenges
Panel: Translation Quality ChallengesSDL
 
2. Project Management - Alexandre Helle & Manuel Herranz (Pangeanic)
2. Project Management - Alexandre Helle & Manuel Herranz (Pangeanic)2. Project Management - Alexandre Helle & Manuel Herranz (Pangeanic)
2. Project Management - Alexandre Helle & Manuel Herranz (Pangeanic)RIILP
 

La actualidad más candente (20)

Overview of Multidimensional Quality Metrics (QTLaunchPad)
Overview of Multidimensional Quality Metrics (QTLaunchPad)Overview of Multidimensional Quality Metrics (QTLaunchPad)
Overview of Multidimensional Quality Metrics (QTLaunchPad)
 
Defining Translation Quality in ASTM
Defining Translation Quality in ASTMDefining Translation Quality in ASTM
Defining Translation Quality in ASTM
 
TAUS Quality Dashboard and the integration of DQF in translation technologies...
TAUS Quality Dashboard and the integration of DQF in translation technologies...TAUS Quality Dashboard and the integration of DQF in translation technologies...
TAUS Quality Dashboard and the integration of DQF in translation technologies...
 
High Volume, Rapid Turn Around Localization: Lessons Learned
High Volume, Rapid Turn Around Localization: Lessons LearnedHigh Volume, Rapid Turn Around Localization: Lessons Learned
High Volume, Rapid Turn Around Localization: Lessons Learned
 
Top Trans Survey Translation Issues
Top Trans Survey Translation IssuesTop Trans Survey Translation Issues
Top Trans Survey Translation Issues
 
The Latest Advances in Patent Machine Translation
The Latest Advances in Patent Machine TranslationThe Latest Advances in Patent Machine Translation
The Latest Advances in Patent Machine Translation
 
How Technology Has Changed the World of Technical Translation
How Technology Has Changed the World of Technical TranslationHow Technology Has Changed the World of Technical Translation
How Technology Has Changed the World of Technical Translation
 
Technical_translation_is_it_really_about_terminology_en
Technical_translation_is_it_really_about_terminology_enTechnical_translation_is_it_really_about_terminology_en
Technical_translation_is_it_really_about_terminology_en
 
Keys to successful technical translation
Keys to successful technical translationKeys to successful technical translation
Keys to successful technical translation
 
Technical Translation
Technical TranslationTechnical Translation
Technical Translation
 
Back translation explained: what we do and what you get
Back translation explained: what we do and what you getBack translation explained: what we do and what you get
Back translation explained: what we do and what you get
 
Good Applications of Bad Machine Translation
Good Applications of Bad Machine TranslationGood Applications of Bad Machine Translation
Good Applications of Bad Machine Translation
 
Steps in translation process
Steps in translation processSteps in translation process
Steps in translation process
 
Language translator internship report
Language translator internship reportLanguage translator internship report
Language translator internship report
 
The 3 types of translation review – and when to use them
The 3 types of translation review – and when to use themThe 3 types of translation review – and when to use them
The 3 types of translation review – and when to use them
 
MT Use in Lingosail, by Yongpeng Wei, Lingosail
MT Use in Lingosail, by Yongpeng Wei, LingosailMT Use in Lingosail, by Yongpeng Wei, Lingosail
MT Use in Lingosail, by Yongpeng Wei, Lingosail
 
Technical translation (1)
Technical translation (1)Technical translation (1)
Technical translation (1)
 
Panel: Translation Quality Challenges
Panel: Translation Quality ChallengesPanel: Translation Quality Challenges
Panel: Translation Quality Challenges
 
2. Project Management - Alexandre Helle & Manuel Herranz (Pangeanic)
2. Project Management - Alexandre Helle & Manuel Herranz (Pangeanic)2. Project Management - Alexandre Helle & Manuel Herranz (Pangeanic)
2. Project Management - Alexandre Helle & Manuel Herranz (Pangeanic)
 
Insights in the MT Market, by Jaap van der Meer, TAUS
Insights in the MT Market, by Jaap van der Meer, TAUSInsights in the MT Market, by Jaap van der Meer, TAUS
Insights in the MT Market, by Jaap van der Meer, TAUS
 

Destacado

3. Natalia Konstantinova (UoW) EXPERT Introduction
3. Natalia Konstantinova (UoW) EXPERT Introduction3. Natalia Konstantinova (UoW) EXPERT Introduction
3. Natalia Konstantinova (UoW) EXPERT IntroductionRIILP
 
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
9. Manuel Harranz (pangeanic) Hybrid Solutions for TranslationRIILP
 
1. EXPERT Winter School Partner Introductions
1. EXPERT Winter School Partner Introductions1. EXPERT Winter School Partner Introductions
1. EXPERT Winter School Partner IntroductionsRIILP
 
5. manuel arcedillo & juanjo arevalillo (hermes) translation memories
5. manuel arcedillo & juanjo arevalillo (hermes) translation memories5. manuel arcedillo & juanjo arevalillo (hermes) translation memories
5. manuel arcedillo & juanjo arevalillo (hermes) translation memoriesRIILP
 
8. Qun Liu (DCU) Hybrid Solutions for Translation
8. Qun Liu (DCU) Hybrid Solutions for Translation8. Qun Liu (DCU) Hybrid Solutions for Translation
8. Qun Liu (DCU) Hybrid Solutions for TranslationRIILP
 
17. Anne Schuman (USAAR) Terminology and Ontologies 2
17. Anne Schuman (USAAR) Terminology and Ontologies 217. Anne Schuman (USAAR) Terminology and Ontologies 2
17. Anne Schuman (USAAR) Terminology and Ontologies 2RIILP
 
16. Anne Schumann (USAAR) Terminology and Ontologies 1
16. Anne Schumann (USAAR) Terminology and Ontologies 116. Anne Schumann (USAAR) Terminology and Ontologies 1
16. Anne Schumann (USAAR) Terminology and Ontologies 1RIILP
 
14. Michael Oakes (UoW) Natural Language Processing for Translation
14. Michael Oakes (UoW) Natural Language Processing for Translation14. Michael Oakes (UoW) Natural Language Processing for Translation
14. Michael Oakes (UoW) Natural Language Processing for TranslationRIILP
 
7. Trevor Cohn (usfd) Statistical Machine Translation
7. Trevor Cohn (usfd) Statistical Machine Translation7. Trevor Cohn (usfd) Statistical Machine Translation
7. Trevor Cohn (usfd) Statistical Machine TranslationRIILP
 
2. Constantin Orasan (UoW) EXPERT Introduction
2. Constantin Orasan (UoW) EXPERT Introduction2. Constantin Orasan (UoW) EXPERT Introduction
2. Constantin Orasan (UoW) EXPERT IntroductionRIILP
 
4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...
4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...
4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...RIILP
 
6. Khalil Sima'an (UVA) Statistical Machine Translation
6. Khalil Sima'an (UVA) Statistical Machine Translation6. Khalil Sima'an (UVA) Statistical Machine Translation
6. Khalil Sima'an (UVA) Statistical Machine TranslationRIILP
 
12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...
12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...
12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...RIILP
 
13. Constantin Orasan (UoW) Natural Language Processing for Translation
13. Constantin Orasan (UoW) Natural Language Processing for Translation13. Constantin Orasan (UoW) Natural Language Processing for Translation
13. Constantin Orasan (UoW) Natural Language Processing for TranslationRIILP
 

Destacado (14)

3. Natalia Konstantinova (UoW) EXPERT Introduction
3. Natalia Konstantinova (UoW) EXPERT Introduction3. Natalia Konstantinova (UoW) EXPERT Introduction
3. Natalia Konstantinova (UoW) EXPERT Introduction
 
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
 
1. EXPERT Winter School Partner Introductions
1. EXPERT Winter School Partner Introductions1. EXPERT Winter School Partner Introductions
1. EXPERT Winter School Partner Introductions
 
5. manuel arcedillo & juanjo arevalillo (hermes) translation memories
5. manuel arcedillo & juanjo arevalillo (hermes) translation memories5. manuel arcedillo & juanjo arevalillo (hermes) translation memories
5. manuel arcedillo & juanjo arevalillo (hermes) translation memories
 
8. Qun Liu (DCU) Hybrid Solutions for Translation
8. Qun Liu (DCU) Hybrid Solutions for Translation8. Qun Liu (DCU) Hybrid Solutions for Translation
8. Qun Liu (DCU) Hybrid Solutions for Translation
 
17. Anne Schuman (USAAR) Terminology and Ontologies 2
17. Anne Schuman (USAAR) Terminology and Ontologies 217. Anne Schuman (USAAR) Terminology and Ontologies 2
17. Anne Schuman (USAAR) Terminology and Ontologies 2
 
16. Anne Schumann (USAAR) Terminology and Ontologies 1
16. Anne Schumann (USAAR) Terminology and Ontologies 116. Anne Schumann (USAAR) Terminology and Ontologies 1
16. Anne Schumann (USAAR) Terminology and Ontologies 1
 
14. Michael Oakes (UoW) Natural Language Processing for Translation
14. Michael Oakes (UoW) Natural Language Processing for Translation14. Michael Oakes (UoW) Natural Language Processing for Translation
14. Michael Oakes (UoW) Natural Language Processing for Translation
 
7. Trevor Cohn (usfd) Statistical Machine Translation
7. Trevor Cohn (usfd) Statistical Machine Translation7. Trevor Cohn (usfd) Statistical Machine Translation
7. Trevor Cohn (usfd) Statistical Machine Translation
 
2. Constantin Orasan (UoW) EXPERT Introduction
2. Constantin Orasan (UoW) EXPERT Introduction2. Constantin Orasan (UoW) EXPERT Introduction
2. Constantin Orasan (UoW) EXPERT Introduction
 
4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...
4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...
4. Josef Van Genabith (DCU) & Khalil Sima'an (UVA) Example Based Machine Tran...
 
6. Khalil Sima'an (UVA) Statistical Machine Translation
6. Khalil Sima'an (UVA) Statistical Machine Translation6. Khalil Sima'an (UVA) Statistical Machine Translation
6. Khalil Sima'an (UVA) Statistical Machine Translation
 
12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...
12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...
12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Tran...
 
13. Constantin Orasan (UoW) Natural Language Processing for Translation
13. Constantin Orasan (UoW) Natural Language Processing for Translation13. Constantin Orasan (UoW) Natural Language Processing for Translation
13. Constantin Orasan (UoW) Natural Language Processing for Translation
 

Similar a Machine translation evaluation metrics provide no correlation with post-editor productivity gains

Tech capabilities with_sa
Tech capabilities with_saTech capabilities with_sa
Tech capabilities with_saRobert Martin
 
Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...SDL
 
Seeing the Wood for the Trees in MT Evaluation: an LSP success story from RWS
Seeing the Wood for the Trees in MT Evaluation: an LSP success story from RWSSeeing the Wood for the Trees in MT Evaluation: an LSP success story from RWS
Seeing the Wood for the Trees in MT Evaluation: an LSP success story from RWSIconic Translation Machines
 
Welocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
Welocalize Throughputs and Post-Editing Productivity Webinar Laura CasanellasWelocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
Welocalize Throughputs and Post-Editing Productivity Webinar Laura CasanellasWelocalize
 
Evaluation of MT Quality/Productivity at eBay - AMTA 2018
Evaluation of MT Quality/Productivity at eBay - AMTA 2018Evaluation of MT Quality/Productivity at eBay - AMTA 2018
Evaluation of MT Quality/Productivity at eBay - AMTA 2018Jose Luis Bonilla Sánchez
 
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L MargMT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L MargWelocalize
 
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...TAUS - The Language Data Network
 
State of the Machine Translation by Intento (November 2017)
State of the Machine Translation by Intento (November 2017)State of the Machine Translation by Intento (November 2017)
State of the Machine Translation by Intento (November 2017)Konstantin Savenkov
 
Carla Parra Escartin - ER2 Hermes Traducciones
Carla Parra Escartin - ER2 Hermes Traducciones Carla Parra Escartin - ER2 Hermes Traducciones
Carla Parra Escartin - ER2 Hermes Traducciones RIILP
 
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...Welocalize
 
Amta 2012-federico (1)
Amta 2012-federico (1)Amta 2012-federico (1)
Amta 2012-federico (1)FabiolaPanetti
 
Quality is in the Eye of the Beholder - Part 2
Quality is in the Eye of the Beholder - Part 2Quality is in the Eye of the Beholder - Part 2
Quality is in the Eye of the Beholder - Part 2Think Latin America
 
Building a pan-European automated translation platform, Andrejs Vasiljevs, CE...
Building a pan-European automated translation platform, Andrejs Vasiljevs, CE...Building a pan-European automated translation platform, Andrejs Vasiljevs, CE...
Building a pan-European automated translation platform, Andrejs Vasiljevs, CE...TAUS - The Language Data Network
 
What machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happyWhat machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happyIconic Translation Machines
 
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
 HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio... HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...Lifeng (Aaron) Han
 

Similar a Machine translation evaluation metrics provide no correlation with post-editor productivity gains (20)

Tech capabilities with_sa
Tech capabilities with_saTech capabilities with_sa
Tech capabilities with_sa
 
Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...Learn the different approaches to machine translation and how to improve the ...
Learn the different approaches to machine translation and how to improve the ...
 
Seeing the Wood for the Trees in MT Evaluation: an LSP success story from RWS
Seeing the Wood for the Trees in MT Evaluation: an LSP success story from RWSSeeing the Wood for the Trees in MT Evaluation: an LSP success story from RWS
Seeing the Wood for the Trees in MT Evaluation: an LSP success story from RWS
 
Welocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
Welocalize Throughputs and Post-Editing Productivity Webinar Laura CasanellasWelocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
Welocalize Throughputs and Post-Editing Productivity Webinar Laura Casanellas
 
Evaluation of MT Quality/Productivity at eBay - AMTA 2018
Evaluation of MT Quality/Productivity at eBay - AMTA 2018Evaluation of MT Quality/Productivity at eBay - AMTA 2018
Evaluation of MT Quality/Productivity at eBay - AMTA 2018
 
TAUS QE Summit 2017 eBay EN-DE MT Pilot
TAUS QE Summit 2017   eBay EN-DE MT PilotTAUS QE Summit 2017   eBay EN-DE MT Pilot
TAUS QE Summit 2017 eBay EN-DE MT Pilot
 
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L MargMT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
MT Summit 2013 Welocalize Getting the MT Recipe Right by L Casanellas and L Marg
 
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...
TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2...
 
State of the Machine Translation by Intento (November 2017)
State of the Machine Translation by Intento (November 2017)State of the Machine Translation by Intento (November 2017)
State of the Machine Translation by Intento (November 2017)
 
Carla Parra Escartin - ER2 Hermes Traducciones
Carla Parra Escartin - ER2 Hermes Traducciones Carla Parra Escartin - ER2 Hermes Traducciones
Carla Parra Escartin - ER2 Hermes Traducciones
 
Ch26
Ch26Ch26
Ch26
 
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
WeMT Tools and Processes Welocalize TAUS Showcase October 2013 Localization W...
 
Amta 2012-federico (1)
Amta 2012-federico (1)Amta 2012-federico (1)
Amta 2012-federico (1)
 
SE_Unit 2.pptx
SE_Unit 2.pptxSE_Unit 2.pptx
SE_Unit 2.pptx
 
Quality is in the Eye of the Beholder, by Eva Klaudinyova
Quality is in the Eye of the Beholder, by Eva KlaudinyovaQuality is in the Eye of the Beholder, by Eva Klaudinyova
Quality is in the Eye of the Beholder, by Eva Klaudinyova
 
Quality is in the Eye of the Beholder - Part 2
Quality is in the Eye of the Beholder - Part 2Quality is in the Eye of the Beholder - Part 2
Quality is in the Eye of the Beholder - Part 2
 
Building a pan-European automated translation platform, Andrejs Vasiljevs, CE...
Building a pan-European automated translation platform, Andrejs Vasiljevs, CE...Building a pan-European automated translation platform, Andrejs Vasiljevs, CE...
Building a pan-European automated translation platform, Andrejs Vasiljevs, CE...
 
What machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happyWhat machine translation developers are doing to make post-editors happy
What machine translation developers are doing to make post-editors happy
 
Software tools
Software toolsSoftware tools
Software tools
 
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
 HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio... HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
HOPE: A Task-Oriented and Human-Centric Evaluation Framework Using Professio...
 

Más de RIILP

Gabriella Gonzalez - eTRAD
Gabriella Gonzalez - eTRAD Gabriella Gonzalez - eTRAD
Gabriella Gonzalez - eTRAD RIILP
 
Manuel Herranz - Pangeanic
Manuel Herranz - Pangeanic Manuel Herranz - Pangeanic
Manuel Herranz - Pangeanic RIILP
 
Juanjo Arevelillo - Hermes Traducciones
Juanjo Arevelillo - Hermes Traducciones Juanjo Arevelillo - Hermes Traducciones
Juanjo Arevelillo - Hermes Traducciones RIILP
 
Gianluca Giulinin - FAO
Gianluca Giulinin - FAO Gianluca Giulinin - FAO
Gianluca Giulinin - FAO RIILP
 
Lianet Sepulveda & Alexander Raginsky - ER 3a & ER 3b Pangeanic
Lianet Sepulveda & Alexander Raginsky - ER 3a & ER 3b Pangeanic Lianet Sepulveda & Alexander Raginsky - ER 3a & ER 3b Pangeanic
Lianet Sepulveda & Alexander Raginsky - ER 3a & ER 3b Pangeanic RIILP
 
Tony O'Dowd - KantanMT
Tony O'Dowd -  KantanMT Tony O'Dowd -  KantanMT
Tony O'Dowd - KantanMT RIILP
 
Santanu Pal - ESR 2 USAAR
Santanu Pal - ESR 2 USAARSantanu Pal - ESR 2 USAAR
Santanu Pal - ESR 2 USAARRIILP
 
Chris Hokamp - ESR 9 DCU
Chris Hokamp - ESR 9 DCU Chris Hokamp - ESR 9 DCU
Chris Hokamp - ESR 9 DCU RIILP
 
Anna Zaretskaya - ESR 1 UMA
Anna Zaretskaya - ESR 1 UMAAnna Zaretskaya - ESR 1 UMA
Anna Zaretskaya - ESR 1 UMARIILP
 
Carolina Scarton - ESR 7 - USFD
Carolina Scarton - ESR 7 - USFD  Carolina Scarton - ESR 7 - USFD
Carolina Scarton - ESR 7 - USFD RIILP
 
Rohit Gupta - ESR 4 - UoW
Rohit Gupta - ESR 4 - UoW Rohit Gupta - ESR 4 - UoW
Rohit Gupta - ESR 4 - UoW RIILP
 
Hernani Costa - ESR 3 - UMA
Hernani Costa - ESR 3 - UMA Hernani Costa - ESR 3 - UMA
Hernani Costa - ESR 3 - UMA RIILP
 
Liangyou Li - ESR 8 - DCU
Liangyou Li - ESR 8 - DCU Liangyou Li - ESR 8 - DCU
Liangyou Li - ESR 8 - DCU RIILP
 
Liling Tan - ESR 5 USAAR
Liling Tan - ESR 5 USAARLiling Tan - ESR 5 USAAR
Liling Tan - ESR 5 USAARRIILP
 
Sandra de luca - Acclaro
Sandra de luca - AcclaroSandra de luca - Acclaro
Sandra de luca - AcclaroRIILP
 
ER1 Eduard Barbu - EXPERT Summer School - Malaga 2015
ER1 Eduard Barbu - EXPERT Summer School - Malaga 2015ER1 Eduard Barbu - EXPERT Summer School - Malaga 2015
ER1 Eduard Barbu - EXPERT Summer School - Malaga 2015RIILP
 
ESR1 Anna Zaretskaya - EXPERT Summer School - Malaga 2015
ESR1 Anna Zaretskaya - EXPERT Summer School - Malaga 2015ESR1 Anna Zaretskaya - EXPERT Summer School - Malaga 2015
ESR1 Anna Zaretskaya - EXPERT Summer School - Malaga 2015RIILP
 
ESR2 Santanu Pal - EXPERT Summer School - Malaga 2015
ESR2 Santanu Pal - EXPERT Summer School - Malaga 2015ESR2 Santanu Pal - EXPERT Summer School - Malaga 2015
ESR2 Santanu Pal - EXPERT Summer School - Malaga 2015RIILP
 
ESR3 Hernani Costa - EXPERT Summer School - Malaga 2015
ESR3 Hernani Costa - EXPERT Summer School - Malaga 2015ESR3 Hernani Costa - EXPERT Summer School - Malaga 2015
ESR3 Hernani Costa - EXPERT Summer School - Malaga 2015RIILP
 
ESR4 Rohit Gupta - EXPERT Summer School - Malaga 2015
ESR4 Rohit Gupta - EXPERT Summer School - Malaga 2015ESR4 Rohit Gupta - EXPERT Summer School - Malaga 2015
ESR4 Rohit Gupta - EXPERT Summer School - Malaga 2015RIILP
 

Más de RIILP (20)

Gabriella Gonzalez - eTRAD
Gabriella Gonzalez - eTRAD Gabriella Gonzalez - eTRAD
Gabriella Gonzalez - eTRAD
 
Manuel Herranz - Pangeanic
Manuel Herranz - Pangeanic Manuel Herranz - Pangeanic
Manuel Herranz - Pangeanic
 
Juanjo Arevelillo - Hermes Traducciones
Juanjo Arevelillo - Hermes Traducciones Juanjo Arevelillo - Hermes Traducciones
Juanjo Arevelillo - Hermes Traducciones
 
Gianluca Giulinin - FAO



Machine translation evaluation metrics provide no correlation with post-editor productivity gains

  • 1. Machine translation evaluation
      Hermes Traducciones y Servicios Lingüísticos
  • 2. MT at Hermes
      ◦ Pure RBMT engines with pre- and post-processing macros.
      ◦ Texts from technical domains.
      ◦ The applied-technology department has been working on MT engines for over a year.
      ◦ Over 250,000 words post-edited with internal engines in the last year.
      ◦ Average new word count for projects post-edited with internal engines: 9,000 words.
  • 3. Our purpose with MT evals
      Automated metrics might help us:
      ◦ predict PE time and productivity gains;
      ◦ negotiate reasonable discounts;
      ◦ evaluate quality of engines;
      ◦ measure performance of the applied-technology department;
      ◦ not depend on human-reported data.
  • 4. What we hoped to find
      ◦ We hoped some metric would correlate with productivity gain data provided by post-editors.
      ◦ We gathered BLEU, F-Measure, METEOR and TER values.
      ◦ Ideally, we would end up relying on automated metrics rather than time and productivity measurements reported by post-editors.
  • 5. What we hoped to find
      [Chart. x-axis: Productivity gain % (0 to 120); y-axis: 0 to 120.]
  • 6. What we hoped to find
      [Chart. x-axis: Productivity gain % (0 to 120); y-axis: 0 to 120.]
  • 7. What we actually found: No correlation
      [Chart. Series: BLEU, F-Measure, TER, METEOR. x-axis: Productivity gain % (0 to 160); y-axis: 0 to 100.]
  • 8. What we actually found: No correlation
      [Chart. Series: BLEU, F-Measure, TER, METEOR. x-axis: Productivity gain % (0 to 160); y-axis: 0 to 100.]
  • 9. Reasons for the variability
      ◦ Different CAT environments (Trados Studio, memoQ, Idiom, TagEditor, etc.).
      ◦ Different engines (per domain, per client, etc.).
      ◦ Different clients, different needs.
      ◦ Different post-editors.
      ◦ Or, if same post-editor, different post-editing skills over time.
      ◦ Different word volumes.
      ◦ Specific productivity or consistency-enhancement processing can affect metrics negatively.
  • 10. Productivity-enhancement example
      ◦ Source: Add events as described in Adding Events to a Model.
      ◦ PE: Agregue los eventos como se describe en Adición de eventos a un modelo.
      ◦ Raw 1: Agregue los eventos como se describe en la adición de los eventos a un modelo.
      ◦ Raw 2: Agregue los eventos como se describe en Adding Events to a Model.
      ◦ Scores:
          Metric    Raw 1    Raw 2
          BLEU      68.59    53.33
          TER       17.65    29.41
      Metrics for Raw 1 are significantly better, but Raw 2 is faster to post-edit thanks to automatic terminology insertion tools (such as Xbench).
  • 11. Human evaluation
      ◦ Adequacy: How much of the meaning expressed in the gold-standard translation or the source is also expressed in the target translation?
          4. Everything
          3. Most
          2. Little
          1. None
      ◦ Fluency: To what extent is a target-side translation grammatically well formed, without spelling errors, and experienced as using natural/intuitive language by a native speaker?
          4. Flawless
          3. Good
          2. Disfluent
          1. Incomprehensible
      Source: TAUS MT evaluation guidelines, https://evaluation.taus.net/resources/adequacy-fluency-guidelines
  • 12. Conclusions
      ◦ We combine automated metrics with time/productivity data reported by post-editors for the final evaluation of internal MT performance.
      ◦ Poor post-editing skills or any project-specific contingency can be counterbalanced with good automated metrics.
      ◦ We look for qualitative information in automated metrics, not quantitative.
      ◦ BLEU values of 65 and 70 for two different engines tell us both are good engines, not that one will render 5% better results than the other.
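
The productivity-enhancement example on slide 10 can be roughly reproduced with a word-level edit rate. The sketch below is only an approximation of TER: it assumes whitespace tokenization and case-sensitive matching, and it ignores the block shifts that real TER allows. The function names `word_edit_distance` and `edit_rate` are illustrative, not taken from any tool used at Hermes, so the absolute values will not match the scores on the slide; only the ranking of the two raw outputs is expected to agree.

```python
def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance (insertions, deletions, substitutions)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def edit_rate(hypothesis, reference):
    """Edits per reference word, as a percentage (simplified, no shifts)."""
    hyp, ref = hypothesis.split(), reference.split()
    return 100.0 * word_edit_distance(hyp, ref) / len(ref)


# Sentences copied from slide 10; the post-edited text is the reference.
PE = "Agregue los eventos como se describe en Adición de eventos a un modelo."
RAW_1 = "Agregue los eventos como se describe en la adición de los eventos a un modelo."
RAW_2 = "Agregue los eventos como se describe en Adding Events to a Model."

if __name__ == "__main__":
    for name, raw in (("Raw 1", RAW_1), ("Raw 2", RAW_2)):
        edits = word_edit_distance(raw.split(), PE.split())
        print(f"{name}: {edits} edits, edit rate {edit_rate(raw, PE):.2f}")
```

Under these assumptions Raw 1 needs 3 word edits and Raw 2 needs 5, so the metric ranks Raw 1 higher. The metric cannot see that the five edits in Raw 2 all fall inside one untranslated title that a terminology insertion tool can replace in a single step, which is the point the slide makes.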