Enabling Language Resources to Expose Translations as Linked Data on the Web

Language resources, such as multilingual lexica and multilingual electronic dictionaries, contain collections of lexical entries in several languages. Access to the explicit or implicit translation relations between such entries can be of great interest for many NLP-based applications. With Semantic Web techniques, translations can be made available on the Web and consumed directly by other (semantically enabled) resources, without relying on application-specific formats. To that end, in this paper we propose a model for representing translations as linked data, as an extension of the lemon model. Our translation module represents core information associated with term translations and does not commit to specific views or translation theories. As a proof of concept, we have extracted the translations of the terms contained in Terminesp, a multilingual terminological database, and represented them as linked data. We have made them accessible on the Web both for humans (via a Web interface) and for software agents (via a SPARQL endpoint).
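As a rough illustration of the model summarized above, the following Python sketch encodes the Terminesp example from slides 16 and 17 (the Spanish term "red" and its English translation "network") as plain subject-predicate-object tuples and follows the reified translation from source sense to target sense. It is a minimal sketch under stated assumptions, not the authors' implementation: prefixed names are kept as plain strings, the two form node names (`terminesp:38756es-form`, `terminesp:38756en-form`) are hypothetical because the slides do not name them, and the translation set is left out for brevity.

```python
# Prefixed names are plain strings here; no RDF library is involved.
RDF_TYPE = "rdf:type"

TRIPLES = [
    # Spanish entry, its form and its sense.
    ("terminesp:38756es", RDF_TYPE, "lemon:LexicalEntry"),
    ("terminesp:38756es", "lemon:form", "terminesp:38756es-form"),  # hypothetical node name
    ("terminesp:38756es-form", "lemon:writtenRep", '"red"@es'),
    ("terminesp:38756es", "lemon:sense", "terminesp:38756es-sense"),
    ("terminesp:38756es-sense", "lemon:reference", "terminesp:38756"),
    # English entry, its form and its sense.
    ("terminesp:38756en", RDF_TYPE, "lemon:LexicalEntry"),
    ("terminesp:38756en", "lemon:form", "terminesp:38756en-form"),  # hypothetical node name
    ("terminesp:38756en-form", "lemon:writtenRep", '"network"@en'),
    ("terminesp:38756en", "lemon:sense", "terminesp:38756en-sense"),
    ("terminesp:38756en-sense", "lemon:reference", "terminesp:38756"),
    # The translation is a resource of its own (reified), linking two senses.
    ("terminesp:38756es-en-TR", RDF_TYPE, "tr:Translation"),
    ("terminesp:38756es-en-TR", "tr:translationSource", "terminesp:38756es-sense"),
    ("terminesp:38756es-en-TR", "tr:translationTarget", "terminesp:38756en-sense"),
    ("terminesp:38756es-en-TR", "tr:translationCategory", "trcat:directEquivalent"),
]


def objects(subject, predicate):
    """Return every object of triples matching (subject, predicate, ?)."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def subjects(predicate, obj):
    """Return every subject of triples matching (?, predicate, obj)."""
    return [s for s, p, o in TRIPLES if p == predicate and o == obj]


def written_reps(sense):
    """Collect the written representations of the entries that have this sense."""
    reps = []
    for entry in subjects("lemon:sense", sense):
        for form in objects(entry, "lemon:form"):
            reps.extend(objects(form, "lemon:writtenRep"))
    return reps


def translations_of(written_rep):
    """Follow form -> entry -> sense -> Translation -> target sense -> form."""
    results = []
    for form in subjects("lemon:writtenRep", written_rep):
        for entry in subjects("lemon:form", form):
            for sense in objects(entry, "lemon:sense"):
                for translation in subjects("tr:translationSource", sense):
                    categories = objects(translation, "tr:translationCategory")
                    category = categories[0] if categories else None
                    for target in objects(translation, "tr:translationTarget"):
                        for rep in written_reps(target):
                            results.append((rep, category))
    return results


print(translations_of('"red"@es'))
```

Because the translation is its own node rather than a direct link between entries, the category travels with it, and further information could be attached to the same node without touching the lexical entries.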


  1. Enabling Language Resources to Expose Translations as Linked Data on the Web
       Jorge Gracia, Elena Montiel-Ponsoda, Daniel Vila-Suero, Guadalupe Aguado-de-Cea
       Ontology Engineering Group (OEG), Universidad Politécnica de Madrid (UPM)
       Acknowledgments: LIDER and BabeLData projects
       9th Language Resources and Evaluation Conference, LREC 2014, Reykjavik (Iceland), 28/05/2014
  2. Outline
       • Motivation
       • The translation model
       • Terminesp: a validating example
       • Conclusions
  3. Motivation and goals
  4. Motivation. Current multilingual lexica and electronic dictionaries:
       • Proprietary formats
       • Non-standard APIs
       • Disconnected from other resources
  5. Motivation. GOAL: to allow language resources to expose translations as Linked Data on the Web, so that semantically enabled applications can consume them directly, without relying on application-specific formats
  6. Motivation. Objectives:
       • To define a model for representing translations in RDF
       • As a proof of concept:
           1. Extract translations from the Terminesp terminological database
           2. Represent them in RDF with our model
           3. Make them accessible both for human and machine consumption
  7. The translation model
  8. The translation model
  9. The translation model
  10. The translation model. Translation (direct equivalent). [Diagram: a LexicalEntry with its LexicalSense in LEXICON ES (“medio de pago”), a LexicalEntry with its LexicalSense in LEXICON EN (“payment method”), and one ONTOLOGY.]
  11. The translation model. Translation (cultural equivalence). [Diagram: a LexicalEntry with its LexicalSense in LEXICON ES (“Presidente del Gobierno”), a LexicalEntry with its LexicalSense in LEXICON EN (“Prime Minister”), and two ONTOLOGY boxes.]
  12. The translation model. Characteristics of the model:
       • Translation as a relation between senses
       • Translation relation reified, so additional information can be attached to it
       • Support for a variety of translation categories
       • Translation categories clearly separated from the model, so there is no commitment to specific views or translation theories
       • Translation sets group translations that, for instance, come from the same language resource or belong to the same organization
       • Re-use of well-established vocabularies (DC, DCAT, etc.) for provenance and additional information
  13. The translation model. [Diagram: the Translation Module and the Translation Categories.]
       • Classes: LexicalSense, Translation (with translationConfidence: double), TranslationSet, Resource
       • Relation labels: translationSource, translationTarget, translationCategory, context, tran
       • Translation categories: directEquivalent, culturalEquivalent, lexicalEquivalent
  14. Terminesp, a validating example
  15. Terminesp, a validating example. TERMINESP:
       • Multilingual terminological database
       • Terms and definitions from Spanish technological standards
       • More than 30K terms in Spanish, with translations into English, German, French, Italian, …
  16. Terminesp, a validating example. [Diagram: lemon representation of the term “red”@es and its English translation “network”@en. Legend: Class, Instance.]
       • lemon:Lexicon instances: terminesp:lexiconES, terminesp:lexiconEN
       • lemon:LexicalEntry instances: terminesp:38756es, terminesp:38756en
       • lemon:LexicalSense instances: terminesp:38756es-sense, terminesp:38756en-sense
       • lemon:LexicalForm nodes carrying the written representations “red”@es and “network”@en
       • skos:Concept instance: terminesp:38756
       • tr:Translation instance: terminesp:38756es-en-TR
       • Edge labels: lemon:entry, lemon:form, lemon:writtenRep, lemon:sense, lemon:reference, tr:translationSource, tr:translationTarget
  17. Terminesp, a validating example. [Diagram: the translation and its translation set. Legend: Class, Instance.]
       • lemon:LexicalSense instances: terminesp:38756es-sense, terminesp:38756en-sense
       • tr:Translation instance: terminesp:38756es-en-TR
       • tr:TranslationSet instance: terminesp:es-en-transet
       • Translation category: trcat:directEquivalent
       • Edge labels: tr:translationSource, tr:translationTarget, tr:translationCategory, tr:tran
  18. Terminesp, a validating example. Before:
       • MS Access database and a Web search interface
       • Non-standard formats and vocabularies
       • Data “invisible” to software agents
       • Translations implicit, not explicit
  19. Terminesp, a validating example. Now:
       • Published on the Web as Linked Data
       • Modelled using lemon and well-established vocabularies
       • Dereferenceable URIs
       • Data “visible” to software agents
       • Translations were made explicit
       • Web search interface for human consumption
       • SPARQL endpoint for machine consumption
  20. Terminesp, a validating example. Terminesp for machine consumption: SPARQL endpoint
  21. Terminesp, a validating example. Terminesp for machine consumption: SPARQL endpoint. [Query result screenshot. Column headers: Written representation target, Lexicon target. Visible values: network; Netzwerk (in der Netzwerktopologie).]
  22. Terminesp, a validating example. Terminesp for human consumption: Web interface
  23. Conclusions
  24. Conclusions
       Our proposal
         • Model to represent translations as Linked Data on the Web
         • Terminesp as a validating example
       Next steps
         • Standardization through the W3C Ontolex Community Group
         • Study possible reuse of ITS 2.0 elements
         • Links of Terminesp to external resources (e.g., BabelNet)
  25. Thanks for your attention!
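The SPARQL endpoint mentioned on slides 19 to 21 could be queried along the following lines. This Python sketch only builds a query and a request URL; it sends nothing. It is a guess at the kind of query behind the result shown on slide 21, not the query the authors ran. The property names come from the diagrams on slides 16 and 17. The namespace URIs in `PREFIXES`, the variable names, and the `http://example.org/sparql` endpoint are assumptions made for illustration and should be checked against the published vocabularies and the actual service.

```python
from urllib.parse import urlencode

# Assumed namespace URIs (not given in the slides); verify before real use.
PREFIXES = {
    "lemon": "http://lemon-model.net/lemon#",
    "tr": "http://purl.org/net/translation#",
}


def build_translation_query(written_rep, lang):
    """Build a SPARQL query for the translations of one written representation."""
    # Escape backslashes and double quotes so the literal stays well-formed.
    literal = written_rep.replace("\\", "\\\\").replace('"', '\\"')
    prefix_lines = "\n".join(
        f"PREFIX {name}: <{uri}>" for name, uri in PREFIXES.items()
    )
    return f"""{prefix_lines}
SELECT ?targetRep ?targetLexicon WHERE {{
  ?sourceForm lemon:writtenRep "{literal}"@{lang} .
  ?sourceEntry lemon:form ?sourceForm ;
               lemon:sense ?sourceSense .
  ?translation tr:translationSource ?sourceSense ;
               tr:translationTarget ?targetSense .
  ?targetEntry lemon:sense ?targetSense ;
               lemon:form ?targetForm .
  ?targetForm lemon:writtenRep ?targetRep .
  ?targetLexicon lemon:entry ?targetEntry .
}}"""


def request_url(endpoint, query):
    """Encode the query as a SPARQL protocol GET request (parameter name: query)."""
    return endpoint + "?" + urlencode({"query": query})


query = build_translation_query("red", "es")
print(query)
# The endpoint below is a placeholder, not the Terminesp service.
print(request_url("http://example.org/sparql", query))
```

The two selected variables mirror the column headers visible on slide 21 (written representation of the target and target lexicon).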