SlideShare una empresa de Scribd logo
1 de 11
Descargar para leer sin conexión
(Semantic) Interoperability 
in the 
CLARIN Infrastructure 
Menzo Windhouwer 
The Language Archive - DANS 
menzo.windhouwer@dans.knaw.nl 
Succeed Interoperability Workshop 
2nd October 2014 
KB, The Hague, Netherlands
CLARIN 
 CLARIN = Common Language Resources and Technology 
Infrastructure = an european ESFRI infrastructure project 
 Aims at providing easy and sustainable access for scholars 
in the humanities and social sciences to digital language 
data (in written, spoken, video or multimodal form) and 
advanced tools to discover, explore, exploit, annotate, 
analyze or combine them, independent of where they are 
located. 
http://www.clarin.eu/
CLARIN & Interoperability 
 CLARIN Centre Committee 
 Sets technical standards for the infrastructure to operate 
 CLARIN Standards Committee 
 Standards/Recommendations for linguistic resources 
 http://clarin.ids-mannheim.de/standards/ 
 Various (national) coordinators 
 Recommendations and best practices for specific topics 
 CMDI and ISOcat 
 My own focus: achieve a level of semantic interoperability
CLARIN and Semantics 
Metadata Concept Registry 
Resources (aka data) 
 (meta)data schema building blocks have semantics 
 Make semantics explicit and share them, i.e., identify common (emerging) concepts 
 The concept registries used by CLARIN are lightweight 
 CLARIN doesn’t try to construct a ‘global’ domain ontology 
 Relationships are on the CLARIN scale suitable for resource discovery, not for formal reasoning 
 Technology 
 Resources are taken as they are, i.e., no CLARIN scale conversion to RDF 
 Metadata is XML-based, but some RDF experiments are underway
Component Metadata Infrastructure 
OAI-PMH 
Data provider 
OAI-PMH 
Service provider 
Local 
metadata 
repository 
Joint 
metadata 
repository 
metadata 
modeler 
metadata 
user 
metadata 
creator 
component 
registry & 
editor 
metadata 
editor 
metadata 
curator 
metadata 
curator 
metadata 
catalogue 
Relation 
Registry 
search & 
semantic 
mapping 
DATA 
ISOcat
CMDI 
Concept 
Registry 
Name 
Age 
Actor Sex (male, female) 
Technical 
Metadata 
Metadata Profile 
Language 
Sample frequency 
Format 
Size 
Language 
Name 
Id (aaa … zzj) 
Location 
Continent 
Country 
Address 
Project 
Name 
Contact
ISOcat Concept Registry 
http://www.isocat.org/
CMDI: Component Editor 
http://catalog.clarin.eu/ds/ComponentRegistry/#
CMDI: Semantic Mapping 
http://clarin.aac.ac.at/smc-browser
Virtual Language Observatory 
http://catalog.clarin.eu/vlo/
Good vs Bad & the future 
+ The CMD Infrastructure is very flexible with regard to metadata 
structures, but also provides an integrated light weight semantic layer to 
achieve semantic interoperability and overcome the structural 
differences  
 But sometimes more context needs to be taken into account 
- Do deal with emerging semantics and specific needs many registries 
are open, which leads to proliferation  
- The ISOcat registry operated together with ISO TC 37 is too complex  
 Move to a simpler model based on SKOS 
 Move to a simpler workflow based on CLARIN recommendations 
 The organisation problem is harder than the technical one! 
 CLARIN-NL experimented with the same kind of semantic 
interoperability for linguistic resources 
 Might be exploited by a higher level of the CLARIN Federated Content 
Search (currently supports full text search) 
 How to make semantic annotation a part of a researcher’s workflow? 
 Embed interaction with the registries in their tools

Más contenido relacionado

Similar a 8. (Semantic Interoperability in the CLARIN infrastructure. Menzo Windhouwer. larin

Semantic Mapping in CLARIN Component Metadata.
Semantic Mapping in CLARIN Component Metadata.Semantic Mapping in CLARIN Component Metadata.
Semantic Mapping in CLARIN Component Metadata.Menzo Windhouwer
 
Digital Libraries
Digital LibrariesDigital Libraries
Digital LibrariesJack Eapen
 
Digital Libraries
Digital LibrariesDigital Libraries
Digital LibrariesJack Eapen
 
How to Find a Needle in the Haystack
How to Find a Needle in the HaystackHow to Find a Needle in the Haystack
How to Find a Needle in the HaystackAdrian Stevenson
 
Choosing the right software for your research study : an overview of leading ...
Choosing the right software for your research study : an overview of leading ...Choosing the right software for your research study : an overview of leading ...
Choosing the right software for your research study : an overview of leading ...Merlien Institute
 
Harmony project - JISC Synthesis meeting 2001
Harmony project - JISC Synthesis meeting 2001Harmony project - JISC Synthesis meeting 2001
Harmony project - JISC Synthesis meeting 2001Dan Brickley
 
Poster: Using Open Source Tools to Improve Access to Oral History Collections
Poster: Using Open Source Tools to Improve Access to Oral History CollectionsPoster: Using Open Source Tools to Improve Access to Oral History Collections
Poster: Using Open Source Tools to Improve Access to Oral History CollectionsBecky Yoose
 
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...Andrea Scharnhorst
 
D4 science scientific data infrastructure promoting interoperability by embra...
D4 science scientific data infrastructure promoting interoperability by embra...D4 science scientific data infrastructure promoting interoperability by embra...
D4 science scientific data infrastructure promoting interoperability by embra...FAO
 
D4Science scientific data infrastructure promoting interoperability by embrac...
D4Science scientific data infrastructure promoting interoperability by embrac...D4Science scientific data infrastructure promoting interoperability by embrac...
D4Science scientific data infrastructure promoting interoperability by embrac...FAO
 
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...vty
 
Presentation of Mediamap @Ebu Production Technology Seminar
Presentation of Mediamap @Ebu Production Technology SeminarPresentation of Mediamap @Ebu Production Technology Seminar
Presentation of Mediamap @Ebu Production Technology SeminarMaarten Verwaest
 
Intro to Digitization Projects
Intro to Digitization ProjectsIntro to Digitization Projects
Intro to Digitization Projectszsrlibrary
 
Amersfoort 2016 koch_wg_v02
Amersfoort 2016 koch_wg_v02Amersfoort 2016 koch_wg_v02
Amersfoort 2016 koch_wg_v02walter koch
 
Corrib.org - OpenSource and Research
Corrib.org - OpenSource and ResearchCorrib.org - OpenSource and Research
Corrib.org - OpenSource and Researchadameq
 
A Distributed Audio Personalization Framework over Android
A Distributed Audio Personalization Framework over AndroidA Distributed Audio Personalization Framework over Android
A Distributed Audio Personalization Framework over AndroidUniversity of Piraeus
 
CLARIN CMDI support in Dataverse
CLARIN CMDI support in Dataverse CLARIN CMDI support in Dataverse
CLARIN CMDI support in Dataverse vty
 
Semantic technologies for the Internet of Things
Semantic technologies for the Internet of Things Semantic technologies for the Internet of Things
Semantic technologies for the Internet of Things PayamBarnaghi
 

Similar a 8. (Semantic Interoperability in the CLARIN infrastructure. Menzo Windhouwer. larin (20)

Semantic Mapping in CLARIN Component Metadata.
Semantic Mapping in CLARIN Component Metadata.Semantic Mapping in CLARIN Component Metadata.
Semantic Mapping in CLARIN Component Metadata.
 
Digital Libraries
Digital LibrariesDigital Libraries
Digital Libraries
 
Digital Libraries
Digital LibrariesDigital Libraries
Digital Libraries
 
NISO/DCMI Webinar: Metadata for Public Sector Administration
NISO/DCMI Webinar: Metadata for Public Sector AdministrationNISO/DCMI Webinar: Metadata for Public Sector Administration
NISO/DCMI Webinar: Metadata for Public Sector Administration
 
How to Find a Needle in the Haystack
How to Find a Needle in the HaystackHow to Find a Needle in the Haystack
How to Find a Needle in the Haystack
 
Choosing the right software for your research study : an overview of leading ...
Choosing the right software for your research study : an overview of leading ...Choosing the right software for your research study : an overview of leading ...
Choosing the right software for your research study : an overview of leading ...
 
Harmony project - JISC Synthesis meeting 2001
Harmony project - JISC Synthesis meeting 2001Harmony project - JISC Synthesis meeting 2001
Harmony project - JISC Synthesis meeting 2001
 
Poster: Using Open Source Tools to Improve Access to Oral History Collections
Poster: Using Open Source Tools to Improve Access to Oral History CollectionsPoster: Using Open Source Tools to Improve Access to Oral History Collections
Poster: Using Open Source Tools to Improve Access to Oral History Collections
 
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and the...
 
D4 science scientific data infrastructure promoting interoperability by embra...
D4 science scientific data infrastructure promoting interoperability by embra...D4 science scientific data infrastructure promoting interoperability by embra...
D4 science scientific data infrastructure promoting interoperability by embra...
 
D4Science scientific data infrastructure promoting interoperability by embrac...
D4Science scientific data infrastructure promoting interoperability by embrac...D4Science scientific data infrastructure promoting interoperability by embrac...
D4Science scientific data infrastructure promoting interoperability by embrac...
 
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...
Flexibility in Metadata Schemes and Standardisation: the Case of CMDI and DAN...
 
Presentation of Mediamap @Ebu Production Technology Seminar
Presentation of Mediamap @Ebu Production Technology SeminarPresentation of Mediamap @Ebu Production Technology Seminar
Presentation of Mediamap @Ebu Production Technology Seminar
 
Intro to Digitization Projects
Intro to Digitization ProjectsIntro to Digitization Projects
Intro to Digitization Projects
 
Amersfoort 2016 koch_wg_v02
Amersfoort 2016 koch_wg_v02Amersfoort 2016 koch_wg_v02
Amersfoort 2016 koch_wg_v02
 
Corrib.org - OpenSource and Research
Corrib.org - OpenSource and ResearchCorrib.org - OpenSource and Research
Corrib.org - OpenSource and Research
 
A Distributed Audio Personalization Framework over Android
A Distributed Audio Personalization Framework over AndroidA Distributed Audio Personalization Framework over Android
A Distributed Audio Personalization Framework over Android
 
It's all semantics! -The premises and promises of the semantic web
It's all semantics! -The premises and promises of the semantic webIt's all semantics! -The premises and promises of the semantic web
It's all semantics! -The premises and promises of the semantic web
 
CLARIN CMDI support in Dataverse
CLARIN CMDI support in Dataverse CLARIN CMDI support in Dataverse
CLARIN CMDI support in Dataverse
 
Semantic technologies for the Internet of Things
Semantic technologies for the Internet of Things Semantic technologies for the Internet of Things
Semantic technologies for the Internet of Things
 

Más de IMPACT Centre of Competence

Más de IMPACT Centre of Competence (20)

Session6 01.helmut schmid
Session6 01.helmut schmidSession6 01.helmut schmid
Session6 01.helmut schmid
 
Session1 03.hsian-an wang
Session1 03.hsian-an wangSession1 03.hsian-an wang
Session1 03.hsian-an wang
 
Session7 03.katrien depuydt
Session7 03.katrien depuydtSession7 03.katrien depuydt
Session7 03.katrien depuydt
 
Session7 02.peter kiraly
Session7 02.peter kiralySession7 02.peter kiraly
Session7 02.peter kiraly
 
Session6 04.giuseppe celano
Session6 04.giuseppe celanoSession6 04.giuseppe celano
Session6 04.giuseppe celano
 
Session6 03.sandra young
Session6 03.sandra youngSession6 03.sandra young
Session6 03.sandra young
 
Session6 02.jeremi ochab
Session6 02.jeremi ochabSession6 02.jeremi ochab
Session6 02.jeremi ochab
 
Session5 04.evangelos varthis
Session5 04.evangelos varthisSession5 04.evangelos varthis
Session5 04.evangelos varthis
 
Session5 03.george rehm
Session5 03.george rehmSession5 03.george rehm
Session5 03.george rehm
 
Session5 02.tom derrick
Session5 02.tom derrickSession5 02.tom derrick
Session5 02.tom derrick
 
Session5 01.rutger vankoert
Session5 01.rutger vankoertSession5 01.rutger vankoert
Session5 01.rutger vankoert
 
Session4 04.senka drobac
Session4 04.senka drobacSession4 04.senka drobac
Session4 04.senka drobac
 
Session3 04.arnau baro
Session3 04.arnau baroSession3 04.arnau baro
Session3 04.arnau baro
 
Session3 03.christian clausner
Session3 03.christian clausnerSession3 03.christian clausner
Session3 03.christian clausner
 
Session3 02.kimmo ketunnen
Session3 02.kimmo ketunnenSession3 02.kimmo ketunnen
Session3 02.kimmo ketunnen
 
Session3 01.clemens neudecker
Session3 01.clemens neudeckerSession3 01.clemens neudecker
Session3 01.clemens neudecker
 
Session2 04.ashkan ashkpour
Session2 04.ashkan ashkpourSession2 04.ashkan ashkpour
Session2 04.ashkan ashkpour
 
Session2 03.juri opitz
Session2 03.juri opitzSession2 03.juri opitz
Session2 03.juri opitz
 
Session2 02.christian reul
Session2 02.christian reulSession2 02.christian reul
Session2 02.christian reul
 
Session2 01.emad mohamed
Session2 01.emad mohamedSession2 01.emad mohamed
Session2 01.emad mohamed
 

Último

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 

Último (20)

Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 

8. (Semantic Interoperability in the CLARIN infrastructure. Menzo Windhouwer. larin

  • 1. (Semantic) Interoperability in the CLARIN Infrastructure Menzo Windhouwer The Language Archive - DANS menzo.windhouwer@dans.knaw.nl Succeed Interoperability Workshop 2nd October 2014 KB, The Hague, Netherlands
  • 2. CLARIN  CLARIN = Common Language Resources and Technology Infrastructure = an european ESFRI infrastructure project  Aims at providing easy and sustainable access for scholars in the humanities and social sciences to digital language data (in written, spoken, video or multimodal form) and advanced tools to discover, explore, exploit, annotate, analyze or combine them, independent of where they are located. http://www.clarin.eu/
  • 3. CLARIN & Interoperability  CLARIN Centre Committee  Sets technical standards for the infrastructure to operate  CLARIN Standards Committee  Standards/Recommendations for linguistic resources  http://clarin.ids-mannheim.de/standards/  Various (national) coordinators  Recommendations and best practices for specific topics  CMDI and ISOcat  My own focus: achieve a level of semantic interoperability
  • 4. CLARIN and Semantics Metadata Concept Registry Resources (aka data)  (meta)data schema building blocks have semantics  Make semantics explicit and share them, i.e., identify common (emerging) concepts  The concept registries used by CLARIN are lightweight  CLARIN doesn’t try to construct a ‘global’ domain ontology  Relationships are on the CLARIN scale suitable for resource discovery, not for formal reasoning  Technology  Resources are taken as they are, i.e., no CLARIN scale conversion to RDF  Metadata is XML-based, but some RDF experiments are underway
  • 5. Component Metadata Infrastructure OAI-PMH Data provider OAI-PMH Service provider Local metadata repository Joint metadata repository metadata modeler metadata user metadata creator component registry & editor metadata editor metadata curator metadata curator metadata catalogue Relation Registry search & semantic mapping DATA ISOcat
  • 6. CMDI Concept Registry Name Age Actor Sex (male, female) Technical Metadata Metadata Profile Language Sample frequency Format Size Language Name Id (aaa … zzj) Location Continent Country Address Project Name Contact
  • 7. ISOcat Concept Registry http://www.isocat.org/
  • 8. CMDI: Component Editor http://catalog.clarin.eu/ds/ComponentRegistry/#
  • 9. CMDI: Semantic Mapping http://clarin.aac.ac.at/smc-browser
  • 10. Virtual Language Observatory http://catalog.clarin.eu/vlo/
  • 11. Good vs Bad & the future + The CMD Infrastructure is very flexible with regard to metadata structures, but also provides an integrated light weight semantic layer to achieve semantic interoperability and overcome the structural differences   But sometimes more context needs to be taken into account - Do deal with emerging semantics and specific needs many registries are open, which leads to proliferation  - The ISOcat registry operated together with ISO TC 37 is too complex   Move to a simpler model based on SKOS  Move to a simpler workflow based on CLARIN recommendations  The organisation problem is harder than the technical one!  CLARIN-NL experimented with the same kind of semantic interoperability for linguistic resources  Might be exploited by a higher level of the CLARIN Federated Content Search (currently supports full text search)  How to make semantic annotation a part of a researcher’s workflow?  Embed interaction with the registries in their tools