Reciprocal Enrichment between Wikipedia and Machine Translators
Mikel Iturbe
Slides of the talk given at Wikimania 2010 in Gdańsk, Poland.
1. Reciprocal Enrichment between Wikipedia and Machine Translators. OpenMT2 project. Mikel Iturbe. Wikimania 2010, Gdańsk, Poland
2. Languages in Wikipedia
3. Distribution of Wikipedia articles by language (chart legend: English, German, French, Polish, Italian, Japanese, Spanish, Dutch, Other)
4. Less than 1% of languages have more than 50% of articles
5. Can we ease good article creation?
6. How can we boost article creation in minority languages?
7. OpenMT2 project: http://ixa.si.ehu.es/openmt2/
8. What is it?
9. EHU, UPC and Basque Wikipedians
10. Funded by the Spanish government
11. Free
12. Hybrid machine translation and advanced evaluation system
13. Hybrid?
14. Rule-based MT + statistical post-editing
15. The aim: to teach the existing MT to correct its own mistakes when translating
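The idea of statistical post-editing on top of a rule-based translator can be illustrated with a toy sketch. This is not the OpenMT2 implementation: it assumes that pairs of raw MT output and human-corrected text are available, and it only learns one-to-one word replacements. The function names (`learn_corrections`, `build_rules`, `post_edit`), the `min_count` threshold and the sample sentences are all hypothetical.

```python
from collections import Counter, defaultdict
from difflib import SequenceMatcher


def learn_corrections(pairs):
    """Count word-level replacements between MT output and its human correction."""
    counts = defaultdict(Counter)
    for mt_output, corrected in pairs:
        mt_words, fixed_words = mt_output.split(), corrected.split()
        matcher = SequenceMatcher(a=mt_words, b=fixed_words, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            # Keep only one-to-one word substitutions (an assumption of this
            # sketch); a real system would also model phrases, insertions
            # and deletions.
            if tag == "replace" and i2 - i1 == 1 and j2 - j1 == 1:
                counts[mt_words[i1]][fixed_words[j1]] += 1
    return counts


def build_rules(counts, min_count=2):
    """Keep each word's most frequent correction if seen at least min_count times."""
    rules = {}
    for wrong, fixes in counts.items():
        best, seen = fixes.most_common(1)[0]
        if seen >= min_count:
            rules[wrong] = best
    return rules


def post_edit(mt_output, rules):
    """Apply the learned replacements to new MT output, word by word."""
    return " ".join(rules.get(word, word) for word in mt_output.split())


# Hypothetical training pairs: (raw MT output, manually corrected text).
training_pairs = [
    ("wikipedia is free", "Wikipedia is free"),
    ("read wikipedia daily", "read Wikipedia daily"),
]
learned_rules = build_rules(learn_corrections(training_pairs))
print(post_edit("edit wikipedia today", learned_rules))  # edit Wikipedia today
```

The frequency threshold keeps one-off corrections from becoming rules, which is the "statistical" part of the sketch: only corrections that recur across the corrected texts are applied automatically.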
16. Using Wikipedia
17. How?
18. (1)
19. Translate using the rule-based Matxin/Opentrad system: http://opentrad.com/
20. 100 long articles, es → eu
21. (2)
22. Correct Basque output manually
23. (3)
24. Analyze logs
25. (4)
26. Make improvements to the MT system
27.
28. Final test and results
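One simple way to quantify a before-and-after comparison is a word-level edit rate between the MT output and its manually corrected version. The slides do not say which metric the project uses, so the following is only an assumed, minimal sketch; `word_edit_distance` and `edit_rate` are hypothetical names.

```python
def word_edit_distance(hypothesis, reference):
    """Minimum number of word insertions, deletions and substitutions
    needed to turn the hypothesis into the reference (Levenshtein on words)."""
    hyp, ref = hypothesis.split(), reference.split()
    previous = list(range(len(ref) + 1))
    for i, hyp_word in enumerate(hyp, start=1):
        current = [i]
        for j, ref_word in enumerate(ref, start=1):
            cost = 0 if hyp_word == ref_word else 1
            current.append(min(
                previous[j] + 1,         # delete a hypothesis word
                current[j - 1] + 1,      # insert a reference word
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def edit_rate(hypothesis, reference):
    """Edit distance normalised by the length of the corrected reference."""
    reference_length = len(reference.split())
    if reference_length == 0:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(hypothesis, reference) / reference_length


# Placeholder tokens stand in for real MT output and its correction.
print(edit_rate("a b c", "a x c"))
```

A lower rate after the improvements in step (4) would suggest that the translator needs less manual correction, under the assumption that the same test articles are used for both measurements.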
29. Tools
30. Google Translator Toolkit
31. Specific help for Wikipedia; not free software
32. OmegaT: http://omegat.org
33. Suitable to do the job; free software
34. What's in?
35. 100 new and good articles for the Basque Wikipedia
36. Provide research material
37. Move towards an MT system that can be used in our Wikipedia
38. Thank you.
39. Image credits: Aurélio A. Heckert (source), David Vignoni (source), Wilfredor (source), Tango project & Arkanosis (source), OmegaT project (source)
40. Contact. Email: mikel@hamahiru.org. User page: http://eu.wikipedia.org/wiki/Lankide:Janfri. Address: http://hamahiru.org/media/wikimania2010.pdf
41. Text licensed under CC BY-SA 3.0; images maintain their original licenses