University of Economics, Prague                              Czech Technical University in Prague



           Recognizing, Classifying and Linking
           Entities with Wikipedia and DBpedia

                                                  Milan Dojchinovski1, Tomas Kliegr2
1 Faculty of Information Technology, Czech Technical University in Prague
2 Faculty of Informatics and Statistics, University of Economics, Prague


                                                                Milan Dojchinovski
                              milan.dojchinovski@fit.cvut.cz - @m1ci - http://dojchinovski.mk



                                            The 7th Workshop on Intelligent and Knowledge Oriented Technologies (WIKT 2012)
                                                                                        November 22-23, 2012, Smolenice, SK

 Except where otherwise noted, the content of this presentation is licensed under
 Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported
Overview

 ‣   Introduction

 ‣   Entity Recognition, Classification and Publication

 ‣   Experiments

 ‣   Conclusion and Future Work




Recognizing, Classifying and Linking Entities with Wikipedia and DBpedia - @m1ci - http://dojchinovski.mk   2
Introduction

 ‣    Unsupervised and fully automated:
  -    entity recognition - rule-based lexico-syntactic patterns
  -    entity classification by extraction of hypernyms - targeted hypernym extraction
  -    entity linking to DBpedia concepts

 ‣    Publication as Linked Data
  -    results in NLP Interchange Format (NIF)




Tool Architecture

 ‣   Available as a Web 2.0 application at: http://ner.vse.cz/thd

 ‣   Web API available at: http://ner.vse.cz/thd/docs




                                                          Fig 1. Architecture overview




Entity Recognition and Classification

 ‣    Entity Recognition
  -    2 JAPE grammars: 1) NNP+ 2) JJ* NN+
  -    input: free text
  -    output: Named Entities (e.g., Diego Maradona) or Common Entities (e.g., hockey player)
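
The two grammars can be read as patterns over part-of-speech tags: NNP+ is one or more proper nouns, and JJ* NN+ is zero or more adjectives followed by one or more common nouns. A minimal Python sketch of that idea, assuming the text has already been tokenized and tagged with Penn Treebank tags. The function name find_entities and the tagged sample sentence are illustrative and not taken from the tool, which uses JAPE grammars rather than Python.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (word, Penn Treebank POS tag)


def find_entities(tokens: List[Token]) -> List[Tuple[str, str]]:
    """Greedy left-to-right scan for NNP+ (named) and JJ* NN+ (common)."""
    entities = []
    i, n = 0, len(tokens)
    while i < n:
        # Pattern 1: NNP+ -> named entity candidate
        j = i
        while j < n and tokens[j][1] == "NNP":
            j += 1
        if j > i:
            entities.append(("named", " ".join(w for w, _ in tokens[i:j])))
            i = j
            continue
        # Pattern 2: JJ* NN+ -> common entity candidate
        j = i
        while j < n and tokens[j][1] == "JJ":
            j += 1
        k = j
        while k < n and tokens[k][1] == "NN":
            k += 1
        if k > j:
            entities.append(("common", " ".join(w for w, _ in tokens[i:k])))
            i = k
            continue
        # No pattern starts here: move on to the next token
        i += 1
    return entities


# Hypothetical pre-tagged input
tagged = [("Diego", "NNP"), ("Maradona", "NNP"), ("was", "VBD"),
          ("a", "DT"), ("famous", "JJ"), ("football", "NN"), ("player", "NN")]
print(find_entities(tagged))
```

Adjectives that are not followed by a noun are skipped, because JJ* NN+ requires at least one NN.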

 ‣    Entity Classification
  -    supported by the Targeted Hypernym Discovery algorithm
  -    lexico-syntactic patterns, e.g. _x_ is a _y_
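
The "x is a y" pattern can be illustrated with a regular expression over a single defining sentence. This is a deliberately simplified sketch and not the authors' Targeted Hypernym Discovery implementation: the function name extract_hypernym, the stop-word list, and the rule "take the last word before the first stop word" are all assumptions made for illustration.

```python
import re
from typing import Optional

# Assumed cut-off words: the noun phrase is taken to end before any of these.
STOP_WORDS = {"who", "that", "which", "from", "in", "of", "for", "and", "with"}


def extract_hypernym(entity: str, sentence: str) -> Optional[str]:
    """Return a hypernym for `entity` using an '<x> is a <y>' pattern, or None."""
    pattern = rf"{re.escape(entity)}\b.*?\b(?:is|was)\s+(?:a|an|the)\s+([^.,;]+)"
    match = re.search(pattern, sentence)
    if match is None:
        return None
    words = []
    for word in match.group(1).split():
        if word.lower() in STOP_WORDS:
            break
        words.append(word)
    # Treat the last word of the noun phrase as its head, i.e. the hypernym.
    return words[-1] if words else None


sentence = "Diego Maradona is an Argentine football player who played as a midfielder."
print(extract_hypernym("Diego Maradona", sentence))  # player
```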




Entity Linking and Publication

 ‣    Entity Linking
  -    linking with concepts from DBpedia
  -    uses the Wikipedia Search API
  -    mapping Wikipedia article URL to its DBpedia representation
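
The mapping step can be sketched in a few lines. This example assumes English Wikipedia article URLs and the usual DBpedia convention that the resource URI reuses the article title under http://dbpedia.org/resource/. The function name wikipedia_to_dbpedia is hypothetical, and the slides do not say how the tool performs this step internally.

```python
from urllib.parse import urlparse

DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"
WIKI_PATH_PREFIX = "/wiki/"


def wikipedia_to_dbpedia(article_url: str) -> str:
    """Map an English Wikipedia article URL to the corresponding DBpedia URI."""
    parsed = urlparse(article_url)
    if parsed.netloc != "en.wikipedia.org" or not parsed.path.startswith(WIKI_PATH_PREFIX):
        raise ValueError(f"not an English Wikipedia article URL: {article_url}")
    # The article title (with underscores) is reused as the resource name.
    title = parsed.path[len(WIKI_PATH_PREFIX):]
    return DBPEDIA_RESOURCE + title


print(wikipedia_to_dbpedia("https://en.wikipedia.org/wiki/Diego_Maradona"))
# http://dbpedia.org/resource/Diego_Maradona
```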

 ‣    Publication in NIF
  -    NLP Interchange Format (RDF-based representation)
  -    each processed document (context) has a unique identifier
  -    each entity and hypernym is represented as an offset-based string
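
An offset-based string identifies a span of the context by its character positions. A minimal sketch of how such an identifier could be built, assuming a hypothetical context URI and a fragment of the form #offset_begin_end. The exact URI scheme depends on the NIF version and is not given in the slides, so the format here is only an illustration.

```python
from typing import Tuple


def offset_uri(context_uri: str, context: str, mention: str) -> Tuple[str, int, int]:
    """Build an offset-based identifier for the first occurrence of `mention`."""
    begin = context.find(mention)
    if begin < 0:
        raise ValueError(f"mention not found in context: {mention!r}")
    end = begin + len(mention)  # end offset is exclusive
    # Assumed fragment format; real NIF URI recipes may differ.
    return f"{context_uri}#offset_{begin}_{end}", begin, end


context = "Diego Maradona was a football player."
doc = "http://example.org/doc1"  # hypothetical document (context) identifier

print(offset_uri(doc, context, "Diego Maradona"))
print(offset_uri(doc, context, "football player"))
```

Because the identifier is derived from the offsets, the same span in the same document always yields the same URI.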




Experiments

 ‣   Question addressed
     -   How well does our tool recognize, classify and link Named and Common Entities?
 ‣   Experiment setup
     -   manually created dataset, Czech Traveler Dataset
     -   101 Named Entities, 85 Common Entities
     -   comparison with 3 other systems: DBpedia Spotlight, Open Calais, Alchemy API
 ‣   Results
     -   Named Entities
         •   f-score: recognition 0.66, classification 0.66, linking 0.58

     -   Common Entities
         •   f-score: recognition 0.60, classification 0.51, linking 0.61

     -   better results than the compared systems in all tasks but one
         •   exception: DBpedia Spotlight scored higher on linking of Common Entities (f-score 0.69)
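
For reference, an f-score combines precision and recall into one number. The sketch below assumes the balanced F1 variant, which the slides do not state explicitly, and uses made-up counts that are not taken from the experiment.

```python
def f_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Balanced F1: harmonic mean of precision and recall."""
    predicted = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for illustration only
print(round(f_score(true_pos=60, false_pos=25, false_neg=41), 2))
```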


Conclusion and Future Work

 ‣   Tool for Entity Recognition, Classification and Publication

 ‣   Future directions
     -   multilingual support - Dutch, German and Czech
     -   grammar improvements
     -   evaluation on a standard benchmark




Feedback




                                                               Thank you!
                                             Questions, comments, ideas?


                                          demo at: http://ner.vse.cz/thd

                            Milan Dojchinovski                                       @m1ci
                            milan.dojchinovski@fit.cvut.cz                            http://dojchinovski.mk

