SlideShare una empresa de Scribd logo
1 de 19
Descargar para leer sin conexión
Reusing Linguistic Resources
Tasks and Goals for a Linked Data Approach


              Marieke van Erp
              marieke@cs.vu.nl



                    LDL2012
Introduction

• BA, MA & PhD compling/
  information extraction
  @Tilburg University

• Since 2009: SemWeb group
  @VU University Amsterdam
Why Reuse Linguistic
                   Resources?
• Linguistic resources are
    expensive to create
•   ...and difficult to use for
    ‘outsiders’

• How can we reach out to the
    ‘outside world’?



                                 Image Source: http://cyberbrethren.com/wp-content/uploads/2012/02/language1.jp
Make reuse easier!


• Increased visibility
• Social value:
     • stimulates collaboration
     • accelerates innovation
• External quality control




                                  Image Source: http://th02.deviantart.net/fs71/PRE/i/2010/146/b/3/
                                            DON__T_PANIC_by_VigilantMeadow.jpg
What’s holding us back?



• Fear?
• Habit?




             Image Source: h http://mindfulbalance.files.wordpress.com/2011/02/hesitate1.jpg
Practical Constraints

1. Task specificity
2. Formats
3. Different conceptual
   models
4. No machine-readable
   definitions
5. Lack of metadata



                          Image Source: http://bogdankipko.com/wp-content/uploads/2011/12/barriers.jpg
1. Task-specificity


• Resources are often geared
  towards one specific task
  e.g., part-of-speech tagging,
  named entity recognition

• How can we make our
  resources more flexible?



                                  Image Source: http://thelearnersguild.files.wordpress.com/2008/07/the-informal-
                                                                learners-toolkit1.jpg
2. Formats

• XML, inline XML, CSV, one
  word per line, one sentence
  per line, slashtags, ARFF,




                                Image Source: http://www.elec-intro.com/EX/05-13-03/kf_compact_data.jpg
3. Conceptual Models
• An NP is an NP is an NP?
• “President Obama signed the
  National Defense
  Authorization Act after
  months of debate”
  • NE: “President Obama”?
  • NE: “Obama”?

                                Image Source: http://www.w3.org/2001/sw/BestPractices/WNET/wordnet-
                                                        sw-20040713-fig01.png
4. Lack of Machine-
               Readable definitions
• For integration or reuse
  manual effort is needed
  • time consuming
  • difficult to track definitions
  • not scalable




                                   Image Source: http://www.barcode1.co.uk/images/samplejplarge.jpg
5. Lack of Metadata

• Can I trust this data provider?
• How was this data created?
• How many annotators?
  • for the entire data set?
  • per instance?
• If generated automatically,
  what were the parameters?



                                    Image Source: http://darwin-online.org.uk/converted/published/
                                           1859_Origin_F373/1859_Origin_F373_fig02.jpg
A Linked Data Approach
• Linked Data is not a magic
  solution to all problems

• ...but it is better than what
  we’ve got at this moment




                                  Image Source: http://linkeddata.org/static/images/lod-
                                          datasets_2009-07-14_cropped.png
1. Using RDF

• RDF is not inherently better
  than some other formats, but
  it is used by many

• + SPARQL makes it easy to
  retrieve data



                                 Image Source: http://www.247ha.com/images/rdf.jpg
2. Mapping Annotations
• A single conceptual
  model for all linguistic
  resources is not going
  to happen

• ...but can we spot the
  similarities between
  models and utilise
  that?


                             Image Source:http://www.webology.org/2006/v3n3/images/sample.JPG
3. Grounding
• It’s only linked data if you link
  it to other sources

• Added bonus: automatic
  sense disambiguation + access
  to a wealth of extra
  knowledge about your data
  item


                                      Image Source: http://mj-services.com/wallpaper/More_WallPaper/Trees/Giants,
                                        %20Calaveras%20State%20Park%20-%201600x1200%20-%20ID%2015.jpg
4. Define Your Metadata
• Include your data model
• Preferably give each instance’s
  provenance
    • collection
    • annotation/creation
    • previous versions
    • confidence


                                    Image Source: http://www.wineaustralia.com/australia/Portals/2/November%20E-
                                                     news/Wines%20of%20Provenance%20Final.jpg
Conclusions
• Look for similarities between
    resources
•   Say where your resource
    comes from
•   Use standards, or make it
    easy for others to convert
    your data to a standard
•   Link to other data


                                  Image Source: http://efr0702.files.wordpress.com/2012/02/puzzle.jpg
Questions?



marieke@cs.vu.nl
http://www.cs.vu.nl/~marieke        Image Source: http://www.amichelleblakeley.com/storage/question%20marks.jpg?
                                               __SQUARESPACE_CACHEVERSION=1295297003883
Acknowledgment

• This work is funded by
  NWO in the CATCH
  programme, grant
  640.004.801

Más contenido relacionado

La actualidad más candente

Opening What's Closed: Using Open Source Tools to Tear Down [Vendor] Silos
Opening What's Closed: Using Open Source Tools to Tear Down [Vendor] SilosOpening What's Closed: Using Open Source Tools to Tear Down [Vendor] Silos
Opening What's Closed: Using Open Source Tools to Tear Down [Vendor] SilosKen Varnum
 
Troubleshooting Electronic Resources with ILL Data
Troubleshooting Electronic Resources with ILL DataTroubleshooting Electronic Resources with ILL Data
Troubleshooting Electronic Resources with ILL DataNASIG
 
Troubleshooting electronic resources with ILL data
Troubleshooting electronic resources with ILL dataTroubleshooting electronic resources with ILL data
Troubleshooting electronic resources with ILL dataBeth Ashmore
 
Keeping up to date
Keeping up to dateKeeping up to date
Keeping up to dateKara Jones
 
Libraries, OA research and OER: towards symbiosis?
Libraries, OA research and OER: towards symbiosis?Libraries, OA research and OER: towards symbiosis?
Libraries, OA research and OER: towards symbiosis?Nick Sheppard
 
OER for repository managers
OER for repository managersOER for repository managers
OER for repository managersNick Sheppard
 

La actualidad más candente (7)

Escaping Datageddon
Escaping DatageddonEscaping Datageddon
Escaping Datageddon
 
Opening What's Closed: Using Open Source Tools to Tear Down [Vendor] Silos
Opening What's Closed: Using Open Source Tools to Tear Down [Vendor] SilosOpening What's Closed: Using Open Source Tools to Tear Down [Vendor] Silos
Opening What's Closed: Using Open Source Tools to Tear Down [Vendor] Silos
 
Troubleshooting Electronic Resources with ILL Data
Troubleshooting Electronic Resources with ILL DataTroubleshooting Electronic Resources with ILL Data
Troubleshooting Electronic Resources with ILL Data
 
Troubleshooting electronic resources with ILL data
Troubleshooting electronic resources with ILL dataTroubleshooting electronic resources with ILL data
Troubleshooting electronic resources with ILL data
 
Keeping up to date
Keeping up to dateKeeping up to date
Keeping up to date
 
Libraries, OA research and OER: towards symbiosis?
Libraries, OA research and OER: towards symbiosis?Libraries, OA research and OER: towards symbiosis?
Libraries, OA research and OER: towards symbiosis?
 
OER for repository managers
OER for repository managersOER for repository managers
OER for repository managers
 

Destacado

Automatic Heritage Metadata Enrichment with Historic Events
Automatic Heritage Metadata Enrichment with Historic Events Automatic Heritage Metadata Enrichment with Historic Events
Automatic Heritage Metadata Enrichment with Historic Events Marieke van Erp
 
Agora: putting museum objects into their art-historic context
Agora: putting museum objects into their art-historic contextAgora: putting museum objects into their art-historic context
Agora: putting museum objects into their art-historic contextMarieke van Erp
 
NewsReader: Automating detective work
NewsReader: Automating detective workNewsReader: Automating detective work
NewsReader: Automating detective workMarieke van Erp
 
Knowledge and Media 2012 Lecture 10: Research proposal QA
Knowledge and Media 2012 Lecture 10: Research proposal QAKnowledge and Media 2012 Lecture 10: Research proposal QA
Knowledge and Media 2012 Lecture 10: Research proposal QAMarieke van Erp
 
Lecture 5: Mining, Analysis and Visualisation
Lecture 5: Mining, Analysis and VisualisationLecture 5: Mining, Analysis and Visualisation
Lecture 5: Mining, Analysis and VisualisationMarieke van Erp
 
KM2012 Lecture 1: introduction
KM2012 Lecture 1: introductionKM2012 Lecture 1: introduction
KM2012 Lecture 1: introductionMarieke van Erp
 
Automatic Extraction of Soccer Game Event Data from Twitter
Automatic Extraction of Soccer Game Event Data from TwitterAutomatic Extraction of Soccer Game Event Data from Twitter
Automatic Extraction of Soccer Game Event Data from TwitterMarieke van Erp
 

Destacado (13)

Automatic Heritage Metadata Enrichment with Historic Events
Automatic Heritage Metadata Enrichment with Historic Events Automatic Heritage Metadata Enrichment with Historic Events
Automatic Heritage Metadata Enrichment with Historic Events
 
Agora: putting museum objects into their art-historic context
Agora: putting museum objects into their art-historic contextAgora: putting museum objects into their art-historic context
Agora: putting museum objects into their art-historic context
 
KM Lecture 7 LOD
KM Lecture 7 LODKM Lecture 7 LOD
KM Lecture 7 LOD
 
Agora User Interviews
Agora User InterviewsAgora User Interviews
Agora User Interviews
 
Richness oftheworld2012
Richness oftheworld2012Richness oftheworld2012
Richness oftheworld2012
 
NewsReader: Automating detective work
NewsReader: Automating detective workNewsReader: Automating detective work
NewsReader: Automating detective work
 
Knowledge and Media 2012 Lecture 10: Research proposal QA
Knowledge and Media 2012 Lecture 10: Research proposal QAKnowledge and Media 2012 Lecture 10: Research proposal QA
Knowledge and Media 2012 Lecture 10: Research proposal QA
 
DeRiVE opening
DeRiVE openingDeRiVE opening
DeRiVE opening
 
Lecture 5: Mining, Analysis and Visualisation
Lecture 5: Mining, Analysis and VisualisationLecture 5: Mining, Analysis and Visualisation
Lecture 5: Mining, Analysis and Visualisation
 
KM2012 Lecture 1: introduction
KM2012 Lecture 1: introductionKM2012 Lecture 1: introduction
KM2012 Lecture 1: introduction
 
2 ontologies I
2 ontologies I2 ontologies I
2 ontologies I
 
KM Lecture11 nlp/nif
KM Lecture11 nlp/nifKM Lecture11 nlp/nif
KM Lecture11 nlp/nif
 
Automatic Extraction of Soccer Game Event Data from Twitter
Automatic Extraction of Soccer Game Event Data from TwitterAutomatic Extraction of Soccer Game Event Data from Twitter
Automatic Extraction of Soccer Game Event Data from Twitter
 

Similar a Ldl2012

How to build a better mousetrap final
How to build a better mousetrap finalHow to build a better mousetrap final
How to build a better mousetrap finalJeannie Castro
 
The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...Projeto RCAAP
 
A tour of the library of the future
A tour of the library of the futureA tour of the library of the future
A tour of the library of the futureBethan Ruddock
 
Exploring the Semantic Web
Exploring the Semantic WebExploring the Semantic Web
Exploring the Semantic WebRoberto García
 
Acpet Moodle from Scratch Version 2
Acpet Moodle from Scratch Version 2Acpet Moodle from Scratch Version 2
Acpet Moodle from Scratch Version 2Yum Studio
 
Introduction to APIs and Linked Data
Introduction to APIs and Linked DataIntroduction to APIs and Linked Data
Introduction to APIs and Linked DataAdrian Stevenson
 
Preventing data loss
Preventing data lossPreventing data loss
Preventing data lossIUPUI
 
Putting Linked Data to Use in a Large Higher-Education Organisation
Putting Linked Data to Use in a Large Higher-Education OrganisationPutting Linked Data to Use in a Large Higher-Education Organisation
Putting Linked Data to Use in a Large Higher-Education OrganisationMathieu d'Aquin
 
Large scale computing
Large scale computing Large scale computing
Large scale computing Bhupesh Bansal
 
Rapid eLearning
Rapid eLearning Rapid eLearning
Rapid eLearning Yum Studio
 
UCISA Learning Anaytics Pre-Conference Workshop
UCISA Learning Anaytics Pre-Conference WorkshopUCISA Learning Anaytics Pre-Conference Workshop
UCISA Learning Anaytics Pre-Conference WorkshopMike Moore
 
Provenance Management to Enable Data Sharing
Provenance Management to Enable Data SharingProvenance Management to Enable Data Sharing
Provenance Management to Enable Data SharingUniversity of Arizona
 
Linked Energy Data Generation
Linked Energy Data GenerationLinked Energy Data Generation
Linked Energy Data GenerationFilip Radulovic
 
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Lucidworks
 
Lucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine LearningLucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine LearningJoaquin Delgado PhD.
 

Similar a Ldl2012 (20)

How to build a better mousetrap final
How to build a better mousetrap finalHow to build a better mousetrap final
How to build a better mousetrap final
 
The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...The state of global research data initiatives: observations from a life on th...
The state of global research data initiatives: observations from a life on th...
 
DataUp at ACRL 2013
DataUp at ACRL 2013DataUp at ACRL 2013
DataUp at ACRL 2013
 
kaggle_meet_up
kaggle_meet_upkaggle_meet_up
kaggle_meet_up
 
NISO Webinar: Library Linked Data: From Vision to Reality
NISO Webinar: Library Linked Data: From Vision to RealityNISO Webinar: Library Linked Data: From Vision to Reality
NISO Webinar: Library Linked Data: From Vision to Reality
 
A tour of the library of the future
A tour of the library of the futureA tour of the library of the future
A tour of the library of the future
 
Exploring the Semantic Web
Exploring the Semantic WebExploring the Semantic Web
Exploring the Semantic Web
 
Acpet Moodle from Scratch Version 2
Acpet Moodle from Scratch Version 2Acpet Moodle from Scratch Version 2
Acpet Moodle from Scratch Version 2
 
Designing e-Learning Objects
Designing e-Learning ObjectsDesigning e-Learning Objects
Designing e-Learning Objects
 
Introduction to APIs and Linked Data
Introduction to APIs and Linked DataIntroduction to APIs and Linked Data
Introduction to APIs and Linked Data
 
Preventing data loss
Preventing data lossPreventing data loss
Preventing data loss
 
Putting Linked Data to Use in a Large Higher-Education Organisation
Putting Linked Data to Use in a Large Higher-Education OrganisationPutting Linked Data to Use in a Large Higher-Education Organisation
Putting Linked Data to Use in a Large Higher-Education Organisation
 
Infographics - 2012 E3 Conestoga
Infographics - 2012 E3 ConestogaInfographics - 2012 E3 Conestoga
Infographics - 2012 E3 Conestoga
 
Large scale computing
Large scale computing Large scale computing
Large scale computing
 
Rapid eLearning
Rapid eLearning Rapid eLearning
Rapid eLearning
 
UCISA Learning Anaytics Pre-Conference Workshop
UCISA Learning Anaytics Pre-Conference WorkshopUCISA Learning Anaytics Pre-Conference Workshop
UCISA Learning Anaytics Pre-Conference Workshop
 
Provenance Management to Enable Data Sharing
Provenance Management to Enable Data SharingProvenance Management to Enable Data Sharing
Provenance Management to Enable Data Sharing
 
Linked Energy Data Generation
Linked Energy Data GenerationLinked Energy Data Generation
Linked Energy Data Generation
 
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado,...
 
Lucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine LearningLucene/Solr Revolution 2015: Where Search Meets Machine Learning
Lucene/Solr Revolution 2015: Where Search Meets Machine Learning
 

Más de Marieke van Erp

Towards Culturally Aware AI Systems - TSDH Symposium
Towards Culturally Aware AI Systems - TSDH SymposiumTowards Culturally Aware AI Systems - TSDH Symposium
Towards Culturally Aware AI Systems - TSDH SymposiumMarieke van Erp
 
A Polyvocal and Contextualised Semantic Web
A Polyvocal and Contextualised Semantic WebA Polyvocal and Contextualised Semantic Web
A Polyvocal and Contextualised Semantic WebMarieke van Erp
 
AI x Digital Humanities = > Inclusiviteit
AI x Digital Humanities = > Inclusiviteit AI x Digital Humanities = > Inclusiviteit
AI x Digital Humanities = > Inclusiviteit Marieke van Erp
 
Computationally Tracing Concepts Through Time and Space
Computationally Tracing Concepts Through Time and SpaceComputationally Tracing Concepts Through Time and Space
Computationally Tracing Concepts Through Time and SpaceMarieke van Erp
 
The Hitchhiker's Guide to the Future of Digital Humanities
The Hitchhiker's Guide to the Future of Digital HumanitiesThe Hitchhiker's Guide to the Future of Digital Humanities
The Hitchhiker's Guide to the Future of Digital HumanitiesMarieke van Erp
 
Why language technology can’t handle Game of Thrones (yet)
Why language technology can’t handle Game of Thrones (yet)Why language technology can’t handle Game of Thrones (yet)
Why language technology can’t handle Game of Thrones (yet)Marieke van Erp
 
(Beyond) Combining Text and Tables for qualitative and quantitative research
(Beyond) Combining Text and Tables for qualitative and quantitative research (Beyond) Combining Text and Tables for qualitative and quantitative research
(Beyond) Combining Text and Tables for qualitative and quantitative research Marieke van Erp
 
Finding common ground between text, maps, and tables for quantitative and qua...
Finding common ground between text, maps, and tables for quantitative and qua...Finding common ground between text, maps, and tables for quantitative and qua...
Finding common ground between text, maps, and tables for quantitative and qua...Marieke van Erp
 
Slicing and Dicing a Newspaper Corpus for Historical Ecology Research
Slicing and Dicing a Newspaper Corpus for Historical Ecology ResearchSlicing and Dicing a Newspaper Corpus for Historical Ecology Research
Slicing and Dicing a Newspaper Corpus for Historical Ecology ResearchMarieke van Erp
 
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...Marieke van Erp
 
Good Lynx, bad Lynx: Document enrichment for historical ecologists
Good Lynx, bad Lynx: Document enrichment for historical ecologistsGood Lynx, bad Lynx: Document enrichment for historical ecologists
Good Lynx, bad Lynx: Document enrichment for historical ecologistsMarieke van Erp
 
Towards Semantic Enrichment of Newspapers: a historical ecology use case
Towards Semantic Enrichment of Newspapers: a historical ecology use case Towards Semantic Enrichment of Newspapers: a historical ecology use case
Towards Semantic Enrichment of Newspapers: a historical ecology use case Marieke van Erp
 
Natural Language Processing en Named Entity Recognition
Natural Language Processing en Named Entity Recognition Natural Language Processing en Named Entity Recognition
Natural Language Processing en Named Entity Recognition Marieke van Erp
 
HuC lecture - Digital and Humanities: Continuing the Conversation
HuC lecture - Digital and Humanities: Continuing the ConversationHuC lecture - Digital and Humanities: Continuing the Conversation
HuC lecture - Digital and Humanities: Continuing the ConversationMarieke van Erp
 
Multilingual Fine-grained Entity Typing
Multilingual Fine-grained Entity Typing Multilingual Fine-grained Entity Typing
Multilingual Fine-grained Entity Typing Marieke van Erp
 
Entity Typing Using Distributional Semantics and DBpedia
Entity Typing Using Distributional Semantics and DBpedia Entity Typing Using Distributional Semantics and DBpedia
Entity Typing Using Distributional Semantics and DBpedia Marieke van Erp
 
Entity Typing and Event Extraction
Entity Typing and Event Extraction Entity Typing and Event Extraction
Entity Typing and Event Extraction Marieke van Erp
 
The domain as unifier, how focusing on social history can bring technical fie...
The domain as unifier, how focusing on social history can bring technical fie...The domain as unifier, how focusing on social history can bring technical fie...
The domain as unifier, how focusing on social history can bring technical fie...Marieke van Erp
 
Evaluating entity linking an analysis of current benchmark datasets and a ro...
Evaluating entity linking  an analysis of current benchmark datasets and a ro...Evaluating entity linking  an analysis of current benchmark datasets and a ro...
Evaluating entity linking an analysis of current benchmark datasets and a ro...Marieke van Erp
 
Finding Stories in 1,784,532 Events: Scaling up computational models of narr...
Finding Stories in 1,784,532 Events:  Scaling up computational models of narr...Finding Stories in 1,784,532 Events:  Scaling up computational models of narr...
Finding Stories in 1,784,532 Events: Scaling up computational models of narr...Marieke van Erp
 

Más de Marieke van Erp (20)

Towards Culturally Aware AI Systems - TSDH Symposium
Towards Culturally Aware AI Systems - TSDH SymposiumTowards Culturally Aware AI Systems - TSDH Symposium
Towards Culturally Aware AI Systems - TSDH Symposium
 
A Polyvocal and Contextualised Semantic Web
A Polyvocal and Contextualised Semantic WebA Polyvocal and Contextualised Semantic Web
A Polyvocal and Contextualised Semantic Web
 
AI x Digital Humanities = > Inclusiviteit
AI x Digital Humanities = > Inclusiviteit AI x Digital Humanities = > Inclusiviteit
AI x Digital Humanities = > Inclusiviteit
 
Computationally Tracing Concepts Through Time and Space
Computationally Tracing Concepts Through Time and SpaceComputationally Tracing Concepts Through Time and Space
Computationally Tracing Concepts Through Time and Space
 
The Hitchhiker's Guide to the Future of Digital Humanities
The Hitchhiker's Guide to the Future of Digital HumanitiesThe Hitchhiker's Guide to the Future of Digital Humanities
The Hitchhiker's Guide to the Future of Digital Humanities
 
Why language technology can’t handle Game of Thrones (yet)
Why language technology can’t handle Game of Thrones (yet)Why language technology can’t handle Game of Thrones (yet)
Why language technology can’t handle Game of Thrones (yet)
 
(Beyond) Combining Text and Tables for qualitative and quantitative research
(Beyond) Combining Text and Tables for qualitative and quantitative research (Beyond) Combining Text and Tables for qualitative and quantitative research
(Beyond) Combining Text and Tables for qualitative and quantitative research
 
Finding common ground between text, maps, and tables for quantitative and qua...
Finding common ground between text, maps, and tables for quantitative and qua...Finding common ground between text, maps, and tables for quantitative and qua...
Finding common ground between text, maps, and tables for quantitative and qua...
 
Slicing and Dicing a Newspaper Corpus for Historical Ecology Research
Slicing and Dicing a Newspaper Corpus for Historical Ecology ResearchSlicing and Dicing a Newspaper Corpus for Historical Ecology Research
Slicing and Dicing a Newspaper Corpus for Historical Ecology Research
 
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...
Lessons Learnt from the Named Entity rEcognition and Linking (NEEL) Challenge...
 
Good Lynx, bad Lynx: Document enrichment for historical ecologists
Good Lynx, bad Lynx: Document enrichment for historical ecologistsGood Lynx, bad Lynx: Document enrichment for historical ecologists
Good Lynx, bad Lynx: Document enrichment for historical ecologists
 
Towards Semantic Enrichment of Newspapers: a historical ecology use case
Towards Semantic Enrichment of Newspapers: a historical ecology use case Towards Semantic Enrichment of Newspapers: a historical ecology use case
Towards Semantic Enrichment of Newspapers: a historical ecology use case
 
Natural Language Processing en Named Entity Recognition
Natural Language Processing en Named Entity Recognition Natural Language Processing en Named Entity Recognition
Natural Language Processing en Named Entity Recognition
 
HuC lecture - Digital and Humanities: Continuing the Conversation
HuC lecture - Digital and Humanities: Continuing the ConversationHuC lecture - Digital and Humanities: Continuing the Conversation
HuC lecture - Digital and Humanities: Continuing the Conversation
 
Multilingual Fine-grained Entity Typing
Multilingual Fine-grained Entity Typing Multilingual Fine-grained Entity Typing
Multilingual Fine-grained Entity Typing
 
Entity Typing Using Distributional Semantics and DBpedia
Entity Typing Using Distributional Semantics and DBpedia Entity Typing Using Distributional Semantics and DBpedia
Entity Typing Using Distributional Semantics and DBpedia
 
Entity Typing and Event Extraction
Entity Typing and Event Extraction Entity Typing and Event Extraction
Entity Typing and Event Extraction
 
The domain as unifier, how focusing on social history can bring technical fie...
The domain as unifier, how focusing on social history can bring technical fie...The domain as unifier, how focusing on social history can bring technical fie...
The domain as unifier, how focusing on social history can bring technical fie...
 
Evaluating entity linking an analysis of current benchmark datasets and a ro...
Evaluating entity linking  an analysis of current benchmark datasets and a ro...Evaluating entity linking  an analysis of current benchmark datasets and a ro...
Evaluating entity linking an analysis of current benchmark datasets and a ro...
 
Finding Stories in 1,784,532 Events: Scaling up computational models of narr...
Finding Stories in 1,784,532 Events:  Scaling up computational models of narr...Finding Stories in 1,784,532 Events:  Scaling up computational models of narr...
Finding Stories in 1,784,532 Events: Scaling up computational models of narr...
 

Último

Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 

Último (20)

Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

Ldl2012

  • 1. Reusing Linguistic Resources Tasks and Goals for a Linked Data Approach Marieke van Erp marieke@cs.vu.nl LDL2012
  • 2. Introduction • BA, MA & PhD compling/ information extraction @Tilburg University • Since 2009: SemWeb group @VU University Amsterdam
  • 3. Why Reuse Linguistic Resources? • Linguistic resources are expensive to create • ...and difficult to use for ‘outsiders’ • How can we reach out to the ‘outside world’? Image Source: http://cyberbrethren.com/wp-content/uploads/2012/02/language1.jp
  • 4. Make reuse easier! • Increased visibility • Social value: • stimulates collaboration • accelerates innovation • External quality control Image Source: http://th02.deviantart.net/fs71/PRE/i/2010/146/b/3/ DON__T_PANIC_by_VigilantMeadow.jpg
  • 5. What’s holding us back? • Fear? • Habit? Image Source: h http://mindfulbalance.files.wordpress.com/2011/02/hesitate1.jpg
  • 6. Practical Constraints 1. Task specificity 2. Formats 3. Different conceptual models 4. No machine-readable definitions 5. Lack of metadata Image Source: http://bogdankipko.com/wp-content/uploads/2011/12/barriers.jpg
  • 7. 1. Task-specificity • Resources are often geared towards one specific task e.g., part-of-speech tagging, named entity recognition • How can we make our resources more flexible? Image Source: http://thelearnersguild.files.wordpress.com/2008/07/the-informal- learners-toolkit1.jpg
  • 8. 2. Formats • XML, inline XML, CSV, one word per line, one sentence per line, slashtags, ARFF, Image Source: http://www.elec-intro.com/EX/05-13-03/kf_compact_data.jpg
  • 9. 3. Conceptual Models • An NP is an NP is an NP? • “President Obama signed the National Defense Authorization Act after months of debate” • NE: “President Obama”? • NE: “Obama”? Image Source: http://www.w3.org/2001/sw/BestPractices/WNET/wordnet- sw-20040713-fig01.png
  • 10. 4. Lack of Machine- Readable definitions • For integration or reuse manual effort is needed • time consuming • difficult to track definitions • not scalable Image Source: http://www.barcode1.co.uk/images/samplejplarge.jpg
  • 11. 5. Lack of Metadata • Can I trust this data provider? • How was this data created? • How many annotators? • for the entire data set? • per instance? • If generated automatically, what were the parameters? Image Source: http://darwin-online.org.uk/converted/published/ 1859_Origin_F373/1859_Origin_F373_fig02.jpg
  • 12. A Linked Data Approach • Linked Data is not a magic solution to all problems • ...but it is better than what we’ve got at this moment Image Source: http://linkeddata.org/static/images/lod- datasets_2009-07-14_cropped.png
  • 13. 1. Using RDF • RDF is not inherently better than some other formats, but it is used by many • + SPARQL makes it easy to retrieve data Image Source: http://www.247ha.com/images/rdf.jpg
  • 14. 2. Mapping Annotations • A single conceptual model for all linguistic resources is not going to happen • ...but can we spot the similarities between models and utilise that? Image Source:http://www.webology.org/2006/v3n3/images/sample.JPG
  • 15. 3. Grounding • It’s only linked data if you link it to other sources • Added bonus: automatic sense disambiguation + access to a wealth of extra knowledge about your data item Image Source: http://mj-services.com/wallpaper/More_WallPaper/Trees/Giants, %20Calaveras%20State%20Park%20-%201600x1200%20-%20ID%2015.jpg
  • 16. 4. Define Your Metadata • Include your data model • Preferably give each instance’s provenance • collection • annotation/creation • previous versions • confidence Image Source: http://www.wineaustralia.com/australia/Portals/2/November%20E- news/Wines%20of%20Provenance%20Final.jpg
  • 17. Conclusions • Look for similarities between resources • Say where your resource comes from • Use standards, or make it easy for others to convert your data to a standard • Link to other data Image Source: http://efr0702.files.wordpress.com/2012/02/puzzle.jpg
  • 18. Questions? marieke@cs.vu.nl http://www.cs.vu.nl/~marieke Image Source: http://www.amichelleblakeley.com/storage/question%20marks.jpg? __SQUARESPACE_CACHEVERSION=1295297003883
  • 19. Acknowledgment • This work is funded by NWO in the CATCH programme, grant 640.004.801