SlideShare a Scribd company logo
1 of 17
MACHINE TRANSLATION
QUALITY ESTIMATION
A Linguist’s Approach
WHAT IS MT QUALITY ESTIMATION?
Automatically providing a quality indicator for machine
translation output without depending on human reference
translations.
Our objective:
Estimate quality and post-editing effort for eBay listing titles
and descriptions
MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 2
ONE big CHALLENGE
min W Ʃ T
t=1 ||(W(t)X(t) − Y (t) )||2 2 + λs||S||1 + λb||B||1,∞ subject to: W = S + B
or
“State-of-the-art QE explores different supervised linear or non-linear learning methods
for regression or classification such as Support Vector Machines (SVM), different types
of Decision Trees, Neural Networks, Elastic-Net, Gaussian Processes, Naive Bayes,
among others”
(Machine Translation Quality Estimation Across Domains, de Souza et al, 2014)
MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 3
A LINGUIST’S APPROACH
Using linguistic features from 3 dimensions:
MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 4
COMPLEXITY ADEQUACY FLUENCY
FEATURES
Complexity:
• Length
• Polysemy
MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 5
Adequacy:
• QA
 Terminology
 Patterns
 Blacklist
 Numbers
• Automated
Post-Editing
• (POS)
• (NER)
Fluency:
• Misspellings
• Grammar errors
IMPLEMENTATION
Checkmate+LanguageTool
MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 6
Reusable Profile
Detailed Report
Score
TESTING
• One Language (es-LA)
• Short samples (~300 words)
• Bigger samples (~1000 words)
• Post-Edited files (~50,000 words)
• pt-BR, ru-RU, zh-CN
MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 7
RESULTS
MEASURING RESULTS
MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 9
SAMPLES - SCORE AND TIME ALIGN
MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 10
FILES - SCORE AND ED ALIGN
MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 11
Average ED (es-LA, descriptions) = 72
MT QE OVER TIME
MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 12
SAMPLES - OTHER LANGUAGES
MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 13
CHALLENGES
• False positives
• Matching score and post-editing effort
• Same weight for all features
MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 14
WHAT’S NEXT
• Tracking scores over time
• Adding scores to our post-editing tool
• Adding new languages
• Researching new features
MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 15
HOW CAN YOU USE THIS?
• Tailor the model to your needs
• Estimate quality at the file/segment level
• Target post-editing, discard bad content
• Estimate post-editing effort/time
• Compare MT systems
• Monitor MT system progress
MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 16
Q&A
THANK YOU! jrowda@ebay.com
MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 17

More Related Content

Viewers also liked

A Civilização Industrial no século XIX
A Civilização Industrial no século XIXA Civilização Industrial no século XIX
A Civilização Industrial no século XIXNuno Eusébio
 
Homelluna.pdf
Homelluna.pdfHomelluna.pdf
Homelluna.pdfAgupan0
 
Cortese e Gentile
Cortese e Gentile Cortese e Gentile
Cortese e Gentile LilliStab
 
Daniela barragan
Daniela barraganDaniela barragan
Daniela barraganacrosinus
 
London TEC Construction Sector Study
London TEC Construction Sector StudyLondon TEC Construction Sector Study
London TEC Construction Sector StudyMike Townsend
 
Private Is The New Public: How Society Is Ending Privacy
Private Is The New Public: How Society Is Ending PrivacyPrivate Is The New Public: How Society Is Ending Privacy
Private Is The New Public: How Society Is Ending PrivacyLin Ke
 
De toan a_b_d_2002_2012
De toan a_b_d_2002_2012De toan a_b_d_2002_2012
De toan a_b_d_2002_2012Huynh ICT
 
BF CV 12-17-16
BF CV 12-17-16BF CV 12-17-16
BF CV 12-17-16Bruce Fox
 
كتابة الأخبار القصيرة للمحمول SMS
كتابة الأخبار القصيرة للمحمول SMSكتابة الأخبار القصيرة للمحمول SMS
كتابة الأخبار القصيرة للمحمول SMSOmar Mostafa
 
Q1 2012 investor_presentation_may_2012
Q1 2012  investor_presentation_may_2012Q1 2012  investor_presentation_may_2012
Q1 2012 investor_presentation_may_2012ramram0
 
про нформац ю
про  нформац юпро  нформац ю
про нформац юPsariova
 
Origen i evolució de l'univers
Origen i evolució de l'universOrigen i evolució de l'univers
Origen i evolució de l'universslapafrasla
 

Viewers also liked (15)

Future of india
Future of indiaFuture of india
Future of india
 
A Civilização Industrial no século XIX
A Civilização Industrial no século XIXA Civilização Industrial no século XIX
A Civilização Industrial no século XIX
 
Homelluna.pdf
Homelluna.pdfHomelluna.pdf
Homelluna.pdf
 
Cortese e Gentile
Cortese e Gentile Cortese e Gentile
Cortese e Gentile
 
Daniela barragan
Daniela barraganDaniela barragan
Daniela barragan
 
Alternativas para el consumidor
Alternativas para el consumidorAlternativas para el consumidor
Alternativas para el consumidor
 
London TEC Construction Sector Study
London TEC Construction Sector StudyLondon TEC Construction Sector Study
London TEC Construction Sector Study
 
Private Is The New Public: How Society Is Ending Privacy
Private Is The New Public: How Society Is Ending PrivacyPrivate Is The New Public: How Society Is Ending Privacy
Private Is The New Public: How Society Is Ending Privacy
 
De toan a_b_d_2002_2012
De toan a_b_d_2002_2012De toan a_b_d_2002_2012
De toan a_b_d_2002_2012
 
BF CV 12-17-16
BF CV 12-17-16BF CV 12-17-16
BF CV 12-17-16
 
كتابة الأخبار القصيرة للمحمول SMS
كتابة الأخبار القصيرة للمحمول SMSكتابة الأخبار القصيرة للمحمول SMS
كتابة الأخبار القصيرة للمحمول SMS
 
Q1 2012 investor_presentation_may_2012
Q1 2012  investor_presentation_may_2012Q1 2012  investor_presentation_may_2012
Q1 2012 investor_presentation_may_2012
 
JHResume 05_06_2015
JHResume 05_06_2015JHResume 05_06_2015
JHResume 05_06_2015
 
про нформац ю
про  нформац юпро  нформац ю
про нформац ю
 
Origen i evolució de l'univers
Origen i evolució de l'universOrigen i evolució de l'univers
Origen i evolució de l'univers
 

Similar to Machine Translation Quality Estimation - A Linguist's Approach

Analytics Boot Camp - Slides
Analytics Boot Camp - SlidesAnalytics Boot Camp - Slides
Analytics Boot Camp - SlidesAditya Joshi
 
machine translation evaluation resources and methods: a survey
machine translation evaluation resources and methods: a surveymachine translation evaluation resources and methods: a survey
machine translation evaluation resources and methods: a surveyLifeng (Aaron) Han
 
Tech capabilities with_sa
Tech capabilities with_saTech capabilities with_sa
Tech capabilities with_saRobert Martin
 
Meta-evaluation of machine translation evaluation methods
Meta-evaluation of machine translation evaluation methodsMeta-evaluation of machine translation evaluation methods
Meta-evaluation of machine translation evaluation methodsLifeng (Aaron) Han
 
Meta-Evaluation of Translation Evaluation Methods: a systematic up-to-date ov...
Meta-Evaluation of Translation Evaluation Methods: a systematic up-to-date ov...Meta-Evaluation of Translation Evaluation Methods: a systematic up-to-date ov...
Meta-Evaluation of Translation Evaluation Methods: a systematic up-to-date ov...Lifeng (Aaron) Han
 
Exploring Text Revision with Backspace and Caret in Virtual Reality
Exploring Text Revision with Backspace and Caret in Virtual RealityExploring Text Revision with Backspace and Caret in Virtual Reality
Exploring Text Revision with Backspace and Caret in Virtual Realityivaderivader
 
Triantafyllia Voulibasi
Triantafyllia VoulibasiTriantafyllia Voulibasi
Triantafyllia VoulibasiISSEL
 
Exploiting Distributional Semantic Models in Question Answering
Exploiting Distributional Semantic Models in Question AnsweringExploiting Distributional Semantic Models in Question Answering
Exploiting Distributional Semantic Models in Question AnsweringPierpaolo Basile
 
Keynote at IWLS 2017
Keynote at IWLS 2017Keynote at IWLS 2017
Keynote at IWLS 2017Manish Pandey
 
Quality Estimation for Machine Translation Using the Joint Method of Evaluati...
Quality Estimation for Machine Translation Using the Joint Method of Evaluati...Quality Estimation for Machine Translation Using the Joint Method of Evaluati...
Quality Estimation for Machine Translation Using the Joint Method of Evaluati...Lifeng (Aaron) Han
 
Recsys 2018 overview and highlights
Recsys 2018 overview and highlightsRecsys 2018 overview and highlights
Recsys 2018 overview and highlightsSandra Garcia
 
Language Models for Information Retrieval
Language Models for Information RetrievalLanguage Models for Information Retrieval
Language Models for Information RetrievalNik Spirin
 
Consistent Transformation of Ratio Metrics for Efficient Online Controlled Ex...
Consistent Transformation of Ratio Metrics for Efficient Online Controlled Ex...Consistent Transformation of Ratio Metrics for Efficient Online Controlled Ex...
Consistent Transformation of Ratio Metrics for Efficient Online Controlled Ex...Weitao Duan
 
Software Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled DatasetsSoftware Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled DatasetsSung Kim
 
An introduction to Elasticsearch's advanced relevance ranking toolbox
An introduction to Elasticsearch's advanced relevance ranking toolboxAn introduction to Elasticsearch's advanced relevance ranking toolbox
An introduction to Elasticsearch's advanced relevance ranking toolboxElasticsearch
 

Similar to Machine Translation Quality Estimation - A Linguist's Approach (20)

Analytics Boot Camp - Slides
Analytics Boot Camp - SlidesAnalytics Boot Camp - Slides
Analytics Boot Camp - Slides
 
machine translation evaluation resources and methods: a survey
machine translation evaluation resources and methods: a surveymachine translation evaluation resources and methods: a survey
machine translation evaluation resources and methods: a survey
 
Tech capabilities with_sa
Tech capabilities with_saTech capabilities with_sa
Tech capabilities with_sa
 
Meta-evaluation of machine translation evaluation methods
Meta-evaluation of machine translation evaluation methodsMeta-evaluation of machine translation evaluation methods
Meta-evaluation of machine translation evaluation methods
 
Meta-Evaluation of Translation Evaluation Methods: a systematic up-to-date ov...
Meta-Evaluation of Translation Evaluation Methods: a systematic up-to-date ov...Meta-Evaluation of Translation Evaluation Methods: a systematic up-to-date ov...
Meta-Evaluation of Translation Evaluation Methods: a systematic up-to-date ov...
 
Exploring Text Revision with Backspace and Caret in Virtual Reality
Exploring Text Revision with Backspace and Caret in Virtual RealityExploring Text Revision with Backspace and Caret in Virtual Reality
Exploring Text Revision with Backspace and Caret in Virtual Reality
 
Triantafyllia Voulibasi
Triantafyllia VoulibasiTriantafyllia Voulibasi
Triantafyllia Voulibasi
 
Exploiting Distributional Semantic Models in Question Answering
Exploiting Distributional Semantic Models in Question AnsweringExploiting Distributional Semantic Models in Question Answering
Exploiting Distributional Semantic Models in Question Answering
 
Keynote at IWLS 2017
Keynote at IWLS 2017Keynote at IWLS 2017
Keynote at IWLS 2017
 
Quality Estimation for Machine Translation Using the Joint Method of Evaluati...
Quality Estimation for Machine Translation Using the Joint Method of Evaluati...Quality Estimation for Machine Translation Using the Joint Method of Evaluati...
Quality Estimation for Machine Translation Using the Joint Method of Evaluati...
 
Recsys 2018 overview and highlights
Recsys 2018 overview and highlightsRecsys 2018 overview and highlights
Recsys 2018 overview and highlights
 
Language Models for Information Retrieval
Language Models for Information RetrievalLanguage Models for Information Retrieval
Language Models for Information Retrieval
 
Consistent Transformation of Ratio Metrics for Efficient Online Controlled Ex...
Consistent Transformation of Ratio Metrics for Efficient Online Controlled Ex...Consistent Transformation of Ratio Metrics for Efficient Online Controlled Ex...
Consistent Transformation of Ratio Metrics for Efficient Online Controlled Ex...
 
Software Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled DatasetsSoftware Defect Prediction on Unlabeled Datasets
Software Defect Prediction on Unlabeled Datasets
 
TAUS QE Summit 2017 eBay EN-DE MT Pilot
TAUS QE Summit 2017   eBay EN-DE MT PilotTAUS QE Summit 2017   eBay EN-DE MT Pilot
TAUS QE Summit 2017 eBay EN-DE MT Pilot
 
An introduction to Elasticsearch's advanced relevance ranking toolbox
An introduction to Elasticsearch's advanced relevance ranking toolboxAn introduction to Elasticsearch's advanced relevance ranking toolbox
An introduction to Elasticsearch's advanced relevance ranking toolbox
 
TOIN - TAUS Tokyo Forum 2015
TOIN - TAUS Tokyo Forum 2015TOIN - TAUS Tokyo Forum 2015
TOIN - TAUS Tokyo Forum 2015
 
5. bleu
5. bleu5. bleu
5. bleu
 
0 introduction
0  introduction0  introduction
0 introduction
 
Nl201609
Nl201609Nl201609
Nl201609
 

Recently uploaded

Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxolyaivanovalion
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...SUHANI PANDEY
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...shambhavirathore45
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxolyaivanovalion
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023ymrp368
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...amitlee9823
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...shivangimorya083
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 

Recently uploaded (20)

Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Carero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptxCarero dropshipping via API with DroFx.pptx
Carero dropshipping via API with DroFx.pptx
 
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
VIP Model Call Girls Hinjewadi ( Pune ) Call ON 8005736733 Starting From 5K t...
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
Smarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptxSmarteg dropshipping via API with DroFx.pptx
Smarteg dropshipping via API with DroFx.pptx
 
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
(NEHA) Call Girls Katra Call Now 8617697112 Katra Escorts 24x7
 
Data-Analysis for Chicago Crime Data 2023
Data-Analysis for Chicago Crime Data  2023Data-Analysis for Chicago Crime Data  2023
Data-Analysis for Chicago Crime Data 2023
 
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 7737669865 👗 Top Class Call Girl Service B...
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...Vip Model  Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
Vip Model Call Girls (Delhi) Karol Bagh 9711199171✔️Body to body massage wit...
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 

Machine Translation Quality Estimation - A Linguist's Approach

  • 2. WHAT IS MT QUALITY ESTIMATION? Automatically providing a quality indicator for machine translation output without depending on human reference translations. Our objective: Estimate quality and post-editing effort for eBay listing titles and descriptions MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 2
  • 3. ONE big CHALLENGE min W Ʃ T t=1 ||(W(t)X(t) − Y (t) )||2 2 + λs||S||1 + λb||B||1,∞ subject to: W = S + B or “State-of-the-art QE explores different supervised linear or non-linear learning methods for regression or classification such as Support Vector Machines (SVM), different types of Decision Trees, Neural Networks, Elastic-Net, Gaussian Processes, Naive Bayes, among others” (Machine Translation Quality Estimation Across Domains, de Souza et al, 2014) MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 3
  • 4. A LINGUIST’S APPROACH Using linguistic features from 3 dimensions: MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 4 COMPLEXITY ADEQUACY FLUENCY
  • 5. FEATURES Complexity: • Length • Polysemy MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 5 Adequacy: • QA  Terminology  Patterns  Blacklist  Numbers • Automated Post-Editing • (POS) • (NER) Fluency: • Misspellings • Grammar errors
  • 6. IMPLEMENTATION Checkmate+LanguageTool MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 6 Reusable Profile Detailed Report Score
  • 7. TESTING • One Language (es-LA) • Short samples (~300 words) • Bigger samples (~1000 words) • Post-Edited files (~50,000 words) • pt-BR, ru-RU, zh-CN MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 7
  • 9. MEASURING RESULTS MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 9
  • 10. SAMPLES - SCORE AND TIME ALIGN MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 10
  • 11. FILES - SCORE AND ED ALIGN MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 11 Average ED (es-LA, descriptions) = 72
  • 12. MT QE OVER TIME MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 12
  • 13. SAMPLES - OTHER LANGUAGES MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 13
  • 14. CHALLENGES • False positives • Matching score and post-editing effort • Same weight for all features MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 14
  • 15. WHAT’S NEXT • Tracking scores over time • Adding scores to our post-editing tool • Adding new languages • Researching new features MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 15
  • 16. HOW CAN YOU USE THIS? • Tailor the model to your needs • Estimate quality at the file/segment level • Target post-editing, discard bad content • Estimate post-editing effort/time • Compare MT systems • Monitor MT system progress MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 16
  • 17. Q&A THANK YOU! jrowda@ebay.com MT QUALITY ESTIMATION – A LINGUIST’S APPROACH 17

Editor's Notes

  1. Introduction