SESSION ON
MULTILINGUAL VOCABULARY
Development and extension
Daniel Vila-Suero (dvila@fi.upm.es)
Ontology Engineering Group, UPM (Madrid)
Slides available at
GOAL
Open discussion around vocabularies enabled for
multilingual environments (WWW)
Introduce some examples: current situation and efforts.
More open questions than answers.
Promote collaboration
SESSION OUTLINE
1. Introduction to the session and the topic
2. “Representing multilingual lexical and
terminological information in RDF vocabularies”
Elena Montiel-Ponsoda, OEG-UPM
3. “Metadata registry of the Publications office of the
EU”
Michael Düro. PO-EU
OPEN DISCUSSION
THIS TALK
1. Why should we care about multilingual vocabularies?
2. What is a multilingual vocabulary?
3. Current situation: when and who
WHY
The primary design principle underlying the Web’s
usefulness and growth is universality. When you make
a link, you can link to anything. That means people
must be able to put anything on the Web, no matter
what computer they have, software they use or human
language they speak…
Tim Berners-Lee
Vocabularies are becoming a central part of the WWW
LANGUAGES ARE USEFUL
• For humans
★ Finding vocabularies, terms, etc.
★ Understanding their semantics and how to use them
★ ...
• And machines
★ Search, ranking, resource discovery
★ Natural Language Processing applications: multilingual
question answering, localized presentation of data
★ ...
WHY
Search for プロジェクト
24 ranked results including the term "project" in Japanese
SOME FACTS ABOUT LOV
• Data retrieved 12.04.2013* out of 326 vocabs
[Bar chart: number of vocabularies (axis 0 to 300) per category: Monolingual (EN), Monolingual (non-EN), Multilingual, Not specified. Value labels as extracted: 425358, 223]
* “Guidelines for Multilingual Linked Data”, Gómez-Pérez et al., 2013
SOME FACTS ABOUT LOV
• LOV loves multilingual descriptions: indexing, ranked
search results.
• But usage of language tags for vocabulary elements
is still very low (< 60%)
• Other semantic search engines (Sindice, Falcons, SWSE...)
lack support for multiple languages
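The "< 60%" figure above is a coverage ratio. A minimal sketch of how such a ratio could be computed, assuming labels are available as (term, text, language tag) tuples with `None` for an untagged literal. The sample data and the `tag_coverage` helper are hypothetical and do not describe how LOV computes its statistics.

```python
from typing import Iterable, Optional, Tuple

# (term URI, label text, language tag or None when the literal is untagged)
Label = Tuple[str, str, Optional[str]]


def tag_coverage(labels: Iterable[Label]) -> float:
    """Return the share of terms that have at least one language-tagged label."""
    terms = set()
    tagged_terms = set()
    for term, _text, lang in labels:
        terms.add(term)
        if lang:  # None or "" both count as untagged
            tagged_terms.add(term)
    return len(tagged_terms) / len(terms) if terms else 0.0


# Hypothetical sample: two of the three terms carry a language tag.
sample = [
    ("ex:Project", "Project", "en"),
    ("ex:Project", "プロジェクト", "ja"),
    ("ex:Person", "Person", None),
    ("ex:Event", "Evento", "es"),
]
coverage = tag_coverage(sample)
print(f"{coverage:.0%} of terms have a language-tagged label")
```

Counting per term rather than per label is a design choice: a term with one tagged and one untagged label still counts as covered.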
WHAT IS AN MLVOCAB?
• Simple (general) answer:
“A vocabulary which includes labels and
documentation in multiple languages”
•Are there other flavors of multilingual vocabularies?
FLAVORS
1. LABELLING
@en
@es
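In the labelling flavor, each term keeps a single URI and carries one label per language tag (`@en`, `@es`). A minimal sketch, assuming labels are stored per term in a plain dictionary keyed by language tag. The `pick_label` helper, the fallback to English and the sample term are illustrative assumptions, not part of any vocabulary discussed here.

```python
from typing import Dict, Optional, Sequence

# One term, one URI, several language-tagged labels (hypothetical data).
LABELS: Dict[str, Dict[str, str]] = {
    "ex:P01": {"en": "academic staff", "es": "personal docente"},
}


def pick_label(term: str, preferred: Sequence[str], default: str = "en") -> Optional[str]:
    """Return the first label matching the preferred language tags, then the default."""
    by_lang = LABELS.get(term, {})
    for lang in list(preferred) + [default]:
        if lang in by_lang:
            return by_lang[lang]
    return None  # no label in any acceptable language


print(pick_label("ex:P01", ["es"]))  # label tagged @es
print(pick_label("ex:P01", ["de"]))  # no @de label, falls back to @en
```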
FLAVORS
2. EXTERNAL MODEL
[Diagram]
Concepts: P01, P011, P012, P013
Spanish: personal docente, profesor titular, catedrático
English: academic staff, associate professor, (full) professor
German: Lehrpersonal, Privatdozent, Professor
Legend: subClassOf, mappings
ISSUES:
Directionality of links, different namespaces, resolution of
URIs (at HTTP level with header, htaccess, external
service...)
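One of the issues listed above is resolving URIs at the HTTP level with a header. A minimal sketch, assuming the header in question is `Accept-Language` and that the server holds one description per language. The `negotiate` function and the list of available languages are hypothetical, and the parser handles only the common `tag;q=value` form rather than the full HTTP grammar.

```python
from typing import List, Optional, Tuple


def parse_accept_language(header: str) -> List[Tuple[str, float]]:
    """Parse 'es-ES,es;q=0.8,en;q=0.5' into (tag, weight) pairs, best first."""
    ranges = []
    for part in header.split(","):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        weight = 1.0  # a missing q parameter means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                weight = float(params[2:])
            except ValueError:
                weight = 0.0  # unreadable weight: treat as not acceptable
        ranges.append((tag.strip().lower(), weight))
    # sorted() is stable, so equal weights keep their header order
    return sorted(ranges, key=lambda pair: pair[1], reverse=True)


def negotiate(header: str, available: List[str]) -> Optional[str]:
    """Pick the available language that best matches the header, or None."""
    for tag, weight in parse_accept_language(header):
        if weight <= 0:
            continue
        primary = tag.split("-")[0]  # 'es-es' also matches 'es'
        for candidate in (tag, primary):
            if candidate in available:
                return candidate
    return None


chosen = negotiate("de-DE,de;q=0.9,en;q=0.5", ["en", "es", "de"])
print(chosen)
```

When `negotiate` returns `None`, the caller still has to decide on a default language, which is itself a policy decision.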
FLAVORS
3. MAPPING MODEL
[Diagram]
Concepts: P01, P011, P012, P013
Spanish: cuerpo docente, catedrático, profesor titular, profesor contratado, …
English: academic staff, assistant professor, associate professor, full professor
German: Lehrpersonal, Professor, Dozent, wissenschaftlicher Mitarbeiter, Privatdozent, …
Legend: subClassOf, mappings
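A minimal sketch of the mapping model, assuming it means that each language keeps its own subClassOf hierarchy and that the hierarchies are connected by mappings, as the diagram legend suggests. The terms are borrowed from the diagram, but the specific subclass and mapping pairs below are illustrative guesses, not a transcription of it, and the `translate` helper is hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Child -> parent (subClassOf) within each monolingual vocabulary (hypothetical).
ES: Dict[str, Optional[str]] = {
    "cuerpo docente": None,
    "catedrático": "cuerpo docente",
    "profesor titular": "cuerpo docente",
}
EN: Dict[str, Optional[str]] = {
    "academic staff": None,
    "full professor": "academic staff",
    "associate professor": "academic staff",
}

# Cross-vocabulary mappings as (Spanish term, English term) pairs (hypothetical).
MAPPINGS: List[Tuple[str, str]] = [
    ("cuerpo docente", "academic staff"),
    ("catedrático", "full professor"),
]

# Sanity check: every mapping must connect terms that exist on both sides.
assert all(es in ES and en in EN for es, en in MAPPINGS)


def translate(term: str) -> Optional[str]:
    """Follow a mapping; if the term has none, try its ancestors in turn."""
    lookup = dict(MAPPINGS)
    current: Optional[str] = term
    while current is not None:
        if current in lookup:
            return lookup[current]
        current = ES.get(current)  # climb one subClassOf step
    return None


print(translate("catedrático"))       # mapped directly
print(translate("profesor titular"))  # unmapped: resolves via its parent
```

Falling back to an ancestor gives a broader match instead of no match. Whether that is acceptable depends on the use case.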
WHAT FLAVOR IS MINE?
• Depends on a number of factors:
★ Your starting point (starting from scratch? can you
modify the terms within your original namespace? are
there similar vocabularies in other languages?)
★ Your needs (linguistically complex model, simplicity,
efficiency, etc.)
★ Your available resources (time, people, money...)
★ ...
• Selection should be USE CASE DRIVEN
LAYERED FRAMEWORK
PROCESS
REPRESENTATION
POLICY
INFRASTRUCTURE
ORGANIZATIONAL
TECHNICAL
POLICY
★ Vocabulary publishers should commit to
a translation policy:
e.g., what are the protocols for including,
developing and validating a new translation?
★ Establish the necessary mechanisms to
manage and assess the quality,
synchronization and appropriate
coverage between different languages.
★ Again, the policy should be based on
requirements, goals, etc. and be use case
driven
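Coverage between languages is one aspect of the policy above that can be checked automatically. A minimal sketch, assuming labels are kept per term and per language tag. The `missing_translations` helper, the list of required languages and the pairing of labels with concept identifiers are hypothetical.

```python
from typing import Dict, List

# term -> {language tag -> label}; the pairing of labels and IDs is hypothetical
VOCAB: Dict[str, Dict[str, str]] = {
    "ex:P01": {"en": "academic staff", "es": "personal docente", "de": "Lehrpersonal"},
    "ex:P011": {"en": "associate professor", "es": "profesor titular"},
    "ex:P012": {"en": "full professor"},
}


def missing_translations(
    vocab: Dict[str, Dict[str, str]], required: List[str]
) -> Dict[str, List[str]]:
    """Map each term to the required languages it has no label for."""
    gaps = {}
    for term, labels in vocab.items():
        missing = [lang for lang in required if lang not in labels]
        if missing:
            gaps[term] = missing
    return gaps


gaps = missing_translations(VOCAB, ["en", "es", "de"])
for term, langs in sorted(gaps.items()):
    print(term, "lacks:", ", ".join(langs))
```

A report like this only shows that a label exists. Assessing translation quality still needs human validation.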
PROCESS
★ Translation workflows: versioning,
notification, editing, validation
mechanisms, etc.
★ Develop methodologies,
guidelines and best practices for
translating and including new languages.
★ Establish communication
protocols between those responsible for
the different translations (languages)
★ Coordination among the people
involved
REPRESENTATION
★ Choose your modelling approach:
  ★ RDFS and SKOS labels and descriptions
  ★ Specialized models (lemon, ontolex, etc.)
  ★ Mappings
★ Guidelines for:
  ★ Naming: coining new URIs for terms
  ★ Labeling: defining the structure of the labels
  (should we use verbs, full sentences, etc.?)
INFRASTRUCTURE
★ Manage different aspects:
  ★ Management of translation/editing workflows:
  notifications, review process, versioning, etc.
  ★ Access to vocabulary elements: localized access?
  A different namespace for the linguistic descriptions?
  ★ Generation of human-readable documentation
★ Look at MLOD patterns and guidelines
WHEN AND WHO
• Learn from (successful) initiatives:
★ FAO’s AGROVOC
★ EUROVOC
★ WORDNET
★ IFLA Vocabularies and Guidelines for translations
★ ...
• Get involved in initiatives around the topic:
★ W3C Internationalization Activity
★ W3C Best Practices for Multilingual LOD CG
★ W3C Ontology-Lexica CG
★ EU LIDER project
W3C BPMLOD
REPRESENTATION
INFRASTRUCTURE
Use cases wanted!
W3C ONTO-LEX
REPRESENTATION
LIDER-PROJECT.EU
Linguistic Linked Data (including vocabularies)
can serve as an enabler technology for content analytics
on the Multilingual Web.
Universidad Politécnica de Madrid (Spain)
Trinity College (Ireland)
DFKI (Germany)
National University of Ireland, Galway (Ireland)
Institut für Angewandte Informatik (Germany)
Universität Bielefeld (Germany)
Universita Roma la Sapienza (Italy)
W3C/ERCIM (France)
LIDER-PROJECT.EU
• Development of best practices and guidelines for
publishing multilingual linked data resources (including
vocabularies).
• Events: W3C Multilingual Web workshop, hackathons,
industrial events, etc.
• Help organizations with publishing Multilingual
Linked Data resources
Get involved!
THANK YOU VERY MUCH
dvila@fi.upm, @dvilasuero

Multilingual vocabularies for the Web: Session on multilingual vocabularies, VocDay, DC conference Lisbon 2013
