Slides for a presentation at the FLaReNet Forum (http://www.flarenet.eu/?q=FLaReNet_Forum_2011), where I presented Europeana's work with multi-lingual access.
Europeana and multi-lingual access, challenges and possibilities
1. Europeana and multi-lingual access – challenges and possibilities. FLaReNet Forum 2011. David Haskiya, Product Developer & Project Coordinator, Europeana, www.europeana.eu
Europeana and multi-lingual access – challenges and possibilities. First, an intro to Europeana: what we are and what we are not (www.europeana.eu). Then, to identify the major challenges facing Europeana concerning multi-lingual access and to sketch possible solutions. Challenge: Ontologies and multi-lingual labelling of metadata. Challenge: Query translation. Challenge: Results translation of metadata. Challenge: Localisation of the Europeana portal. Painter: Lucas van Valckenborch. File: http://commons.wikimedia.org/wiki/File:Valckenborch_tower-babel.jpg
Challenge: Ontologies. Currently we use multi-lingual ontologies to create multi-lingual labels and index them for search. This is probably our main route forward. However, it is difficult to find ontologies and authority files that cover all Europeana languages (the languages of the EU 27). Operational ontologies in Europeana: DBpedia, GEMET, GeoNames. Other ontologies we're looking at: VIAF, LCSH. We prefer openly licensed resources. We prefer resources modelled in SKOS.
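To illustrate the idea on this slide, here is a minimal sketch of how multi-lingual labels attached to one concept could be indexed so that a search term in any language reaches the same concept. It is not Europeana's implementation: the concept URI, the labels, and the function names `build_label_index` and `expand_query` are all hypothetical, and the SKOS-style concept is modelled as a plain dictionary rather than parsed from RDF.

```python
from collections import defaultdict

# Hypothetical SKOS-style data: one concept URI with preferred labels
# keyed by language tag. Real data would come from a vocabulary file.
CONCEPTS = {
    "http://example.org/concept/painting": {
        "en": "painting",
        "fr": "peinture",
        "de": "Malerei",
        "sv": "måleri",
    },
}


def build_label_index(concepts):
    """Map every case-folded label, in any language, to the concept URIs it names."""
    index = defaultdict(set)
    for uri, labels in concepts.items():
        for label in labels.values():
            index[label.casefold()].add(uri)
    return index


def expand_query(term, concepts, index):
    """Return the labels, in all languages, of every concept matching the term."""
    expanded = set()
    for uri in index.get(term.casefold(), ()):
        expanded.update(concepts[uri].values())
    return expanded


index = build_label_index(CONCEPTS)
print(sorted(expand_query("Peinture", CONCEPTS, index)))
```

A term that matches no label yields an empty set, which is where the coverage problem mentioned above shows up in practice: a language missing from the vocabulary simply never matches.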
Challenge: Query translation. Under development. Main efforts are part of EuropeanaConnect Work Packages 1 and 2 (www.europeanaconnect.eu). Basis is language identification. Named entity recognition. Licensed resources: XEROX, CELI. Open resources: Language resources registry (http://europeanalabs.eu/wiki/LinguisticResourceRegister); inventory of vocabularies and language resources (http://europeanalabs.eu/wiki/WP12Vocabularies, http://europeanalabs.eu/wiki/WP2LanguageResources). Google and Bing Translation APIs: very good at to/from English. Evaluation of proprietary vs. open vs. Google/Bing (morphological/dictionaries).
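Since language identification is named as the basis for query translation, a toy sketch may help show what that step does. The stopword lists and the function `identify_language` below are hypothetical and far too small for real use; the tools evaluated in EuropeanaConnect are not shown here, and a production system would rely on a trained identifier.

```python
# Tiny, hypothetical stopword lists used only for illustration.
STOPWORDS = {
    "en": {"the", "of", "and", "in", "a"},
    "fr": {"le", "la", "de", "et", "les"},
    "de": {"der", "die", "und", "das", "von"},
}


def identify_language(query, default="en"):
    """Guess the query language by counting stopword hits per language.

    Falls back to `default` when no stopword matches at all.
    On a tie, the language listed first in STOPWORDS wins.
    """
    tokens = query.casefold().split()
    scores = {
        lang: sum(token in words for token in tokens)
        for lang, words in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default


for query in ("la tour de Babel", "der Turm von Babel", "Babel"):
    print(query, "->", identify_language(query))
```

The last query shows the main weakness of this approach: short queries often contain no function words, which is one reason named entity recognition and dictionary resources matter alongside identification.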
Challenge: Results translation. Already in production in the Europeana portal. Are commercial APIs the only practical option? They cover numerous languages and have easy-to-work-with, well-documented APIs. Problem: they can be shut down, as with the Google Translate API, which is due to be shut down in December 2011. Are there open and free alternatives? Crowdsourced translation is something we're considering. However, even if successful, it will barely make a dent in our c. 20 million metadata records!
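One way to limit the risk of a translation service being shut down is to keep each provider behind the same small interface and try them in order. The sketch below is an assumption about how that could look, not a description of the Europeana portal: `retired_api`, `local_dictionary`, and `translate_with_fallback` are hypothetical names, and no real translation API is called.

```python
def retired_api(text, source, target):
    """Stand-in for a commercial service that is no longer available."""
    raise ConnectionError("service no longer available")


def local_dictionary(text, source, target):
    """Stand-in for an open resource: a tiny hard-coded bilingual dictionary."""
    table = {("en", "fr"): {"tower": "tour"}}
    return table[(source, target)][text.casefold()]


def translate_with_fallback(text, source, target, providers):
    """Try each provider in order; return None if none can translate."""
    for provider in providers:
        try:
            return provider(text, source, target)
        except (ConnectionError, KeyError):
            # Provider is unavailable or has no entry; try the next one.
            continue
    return None


result = translate_with_fallback("tower", "en", "fr", [retired_api, local_dictionary])
print(result)
```

With this shape, replacing a discontinued service means changing the provider list rather than the code that displays translated metadata.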
Challenge: Localisation. Currently we use our own network of Europeana partner institutions and have volunteers there. Problem with scale! Solution? Larger translation communities, e.g. TranslateWiki.
Any questions? This poster by an unknown artist is courtesy of the Municipal Library of Lyon. The work is in the public domain. Slides 2-5 are taken from the Europeana Strategic Plan.