Creating Knowledge out of Interlinked Data
http://lod2.eu

ISWC – 2013/10/23 – Page 1

Integrating NLP using Linked Data
Sebastian Hellmann, Jens Lehmann, Sören Auer and Martin Brümmer

http://slideshare.net/kurzum
http://nlp2rdf.org
http://lod2.eu

AKSW, Universität Leipzig

Introduction

Core problems in integrating NLP:
1. Too much heterogeneity
2. Almost no open standards available
3. Lack of open collaboration
4. Difficult and large domain

Problem analysis
Hardly any reusability in NLP
• Free software (as in free beer), but no open licenses
• Few standards and few mappings
• Integration is hard-wired (you have to write software)
– for each tool, for each framework
Main benefits of using RDF, OWL and Linked Data are:
• lower entry barrier (as a client / user)
• easy data integration (linking, mapping)
• reusability of tools and conceptualisations (ontologies)
• off-the-shelf solutions for common tasks

The Semantic Gap

NLP2RDF project
NLP2RDF (http://nlp2rdf.org)
- community project bootstrapped by LOD2
- develops NLP Interchange Format (NIF)
- umbrella project to combine (and consolidate) existing work

NIF Overview
The NLP Interchange Format (NIF) is an RDF/OWL-based format that aims to
achieve interoperability between Natural Language Processing (NLP) tools,
language resources and annotations.
→ to create an ecosystem of interoperable web services
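
To make the idea of an RDF-based annotation format concrete, the following is a minimal sketch of what a NIF-style description of a text and one substring could look like when written out as Turtle. It is an illustration under assumptions, not the reference serializer: the function names `turtle_literal` and `nif_turtle` and the `http://example.org/doc` URI are hypothetical, and JSON string escaping is used as a shortcut because every escape `json.dumps` emits is also valid inside a double-quoted Turtle literal.

```python
import json

# Namespace of the NIF 2.0 core ontology and of XML Schema datatypes.
NIF = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"
XSD = "http://www.w3.org/2001/XMLSchema#"


def turtle_literal(value: str) -> str:
    """Quote a string for Turtle (shortcut: reuse JSON string escaping)."""
    return json.dumps(value, ensure_ascii=False)


def nif_turtle(doc_uri: str, text: str, begin: int, end: int) -> str:
    """Describe a whole text and one substring of it as NIF-style Turtle."""
    if not 0 <= begin <= end <= len(text):
        raise ValueError("offsets must lie inside the text")
    # Both resources are identified by character offsets into the text.
    context = f"<{doc_uri}#char=0,{len(text)}>"
    span = f"<{doc_uri}#char={begin},{end}>"
    lines = [
        f"@prefix nif: <{NIF}> .",
        f"@prefix xsd: <{XSD}> .",
        "",
        f"{context} a nif:Context ;",
        f"    nif:isString {turtle_literal(text)} .",
        "",
        f"{span} a nif:String ;",
        f"    nif:referenceContext {context} ;",
        f"    nif:anchorOf {turtle_literal(text[begin:end])} ;",
        f'    nif:beginIndex "{begin}"^^xsd:nonNegativeInteger ;',
        f'    nif:endIndex "{end}"^^xsd:nonNegativeInteger .',
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(nif_turtle("http://example.org/doc", "Leipzig is a city.", 0, 7))
```

Because the substring is a resource with its own URI, any tool can attach further properties to it without knowing which tool created it.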

• Reuse of existing standards such as RDF, OWL2, the PROV Ontology, LAF (ISO 24612), Unicode and RFC 5147
• Standardize access parameters, annotations (e.g. tokenization), validation and log messages
• Reuse of existing ontologies:

Example NIF Workflow

A NIF workflow, however, cannot provide better performance (F-measure, speed) than a properly configured UIMA or GATE pipeline with the same components.
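
One way to picture such a workflow is as a merge of annotations that independent tools attach to the same string URI. The sketch below assumes that each tool returns a mapping from offset-based URIs to property values; the function `merge_annotations`, the property keys `pos` and `link`, and the example outputs are hypothetical and only illustrate the integration step, not any tool's real output.

```python
from collections import defaultdict


def merge_annotations(*tool_outputs):
    """Union property values from several tools, keyed by string URI."""
    merged = defaultdict(lambda: defaultdict(set))
    for output in tool_outputs:
        for uri, properties in output.items():
            for prop, value in properties.items():
                # Properties are multi-valued, as in RDF: nothing is overwritten.
                merged[uri][prop].add(value)
    # Convert to plain dicts so the result prints and compares cleanly.
    return {uri: dict(props) for uri, props in merged.items()}


if __name__ == "__main__":
    span = "http://example.org/doc#char=0,7"
    pos_tagger = {span: {"pos": "NNP"}}  # hypothetical tagger output
    entity_linker = {span: {"link": "http://dbpedia.org/resource/Leipzig"}}
    print(merge_annotations(pos_tagger, entity_linker))
```

The merge needs no tool-specific adapter, which is where the cost saving comes from; the quality of each annotation is still that of the underlying component.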

Use Cases
• Internationalisation Tagset 2.0
• Part of Speech Tagging
• Wikifier API access via RDFaCE (Entity Linking)

UC1 - Internationalisation Tagset 2.0

• NIF will be the recommended RDF conversion of the Internationalisation Tagset 2.0 of W3C (ITS 2.0) - http://www.w3.org/TR/its20/
• NIF turns out to have a unique selling proposition regarding NLP and RDF
• No suitable alternative RDF vocabulary was available for this conversion.

ITS 2.0

Source: http://www.w3.org/TR/its20/#EX-HTML-whitespace-normalization

RDFa parsers lose all provenance information:
<http://examples.com/books/wikinomics> dc:title "Wikinomics" .

Source: https://en.wikipedia.org/wiki/RDFa

UC1 - Internationalisation Tagset 2.0

String offset based on:
- Unicode NFC, code points
- ISO 24612
- RFC 5147

UC2 – Part of Speech Tagging

Please see the paper:

http://purl.org/olia

UC3 – Wikifier API access via RDFaCE

https://github.com/dbpedia-spotlight/dbpedia-spotlight/wiki

UC3 - Wikifier API access via RDFaCE
http://rdface.aksw.org/

Evaluation
Please see the paper!
1) Quantitative Analysis with Google Wikilinks Corpus as NIF RDF
• Crawl of 3 million web sites, 40 million Wikipedia links
• ~ 477 million triples in NIF
2) Questionnaire and Developers Study for NIF 1.0
• NIF 1.0 was released in September 2009
• Over 30 known implementations (22 not from authors)
• 14 developers participated in the study
• Minimal NIF implementation requires less than 500 LoC
3) Qualitative Comparison with other Frameworks and Formats

State of NIF 2.0
Corpora as Linked Data
• Wikilinks corpus - http://wiki-link.nlp2rdf.org
• KORE 50 - http://www.yovisto.com/labs/ner-benchmarks/
• DBpedia Spotlight dataset
Tools
• entityclassifier.eu – http://entityclassifier.eu
• Spotlight - http://spotlight.dbpedia.org
• OpenNLP
• Stanford CoreNLP - https://github.com/NLP2RDF/software
• Validator - https://github.com/NLP2RDF/software

State of NIF 2.0
• Rollout is in progress
• Distributed implementation at different speed and quality
• Software lifecycle:
  • Implementation
  • Testing/Validation
  • Integration in the main software
  • Deployment as a web service
• Hosted web services are often not up to date even though the code base is

How to join - http://nlp2rdf.org

For ontology creators
NLP2RDF provides infrastructure for your NLP ontologies

• Redundant, persistent hosting
• Maven packages
• Code and documentation generation
• Continuous Integration (planned)
• Indexing
• Validation of instance data

Please write to me or the mailing list
nlp2rdf@lists.informatik.uni-leipzig.de

Take home message
• Early industrial uptake
  • OpenLink, Vistatech.ie, Zemanta, Tenforce, Unister
  • ITS 2.0 W3C standard was driven by the localization industry
• NIF is open and free (CC0 planned)
• NIF is designed to be a cost-saver

Not primarily aimed at increasing features or performance (F-Measure)

Thanks for your attention
Open Community – All feedback is welcome!
http://slideshare.net/kurzum
Websites:
http://nlp2rdf.org
http://lod2.eu

Annotations

Scalability - Salzburg Research KMT

https://bitbucket.org/srfgkmt/stanbol-nlp

Unicode Normal Form C

• Recommendation for RDF Literals
• http://unicode.org/reports/tr15/#Norm_Forms
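
Why the normal form and the counting unit need to be fixed can be seen by measuring one string in several ways. The helper below is a small illustration with a hypothetical name, `code_point_counts`; it relies only on Python's standard `unicodedata` module.

```python
import unicodedata


def code_point_counts(s: str) -> dict:
    """Measure a string in different normal forms and encodings."""
    nfc = unicodedata.normalize("NFC", s)
    nfd = unicodedata.normalize("NFD", s)
    return {
        "nfc": len(nfc),  # code points, composed form
        "nfd": len(nfd),  # code points, decomposed form
        "utf16_units": len(nfc.encode("utf-16-le")) // 2,
        "utf8_bytes": len(nfc.encode("utf-8")),
    }


if __name__ == "__main__":
    print(code_point_counts("S\u00f6ren"))   # accented Latin letter
    print(code_point_counts("\U0001F600"))   # character outside the BMP
```

The four numbers disagree for both inputs, so offsets are only comparable across tools when everyone agrees on NFC and on counting code points.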

Tokenization

Christian Chiarcos, Julia Ritz, Manfred Stede: By all these lovely tokens... Merging conflicting tokenizations.
Language Resources and Evaluation 46(1): 53-74 (2012)

Validation over specification

• SPARQL queries produce (find) errors
• http://persistence.uni-leipzig.org/nlp2rdf/ontologies/testcase/lib/nif-2.0-suite.t
• RLOG – An RDF Logging Ontology
• ./validate.jar -i nif-erroneous-model.ttl -t file
• Demo → character count
• Demo → all errors

ALL DEMOS ARE AVAILABLE AT:
http://nlp2rdf.org/leipzig-24-9-2013
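
The kind of error such a check can find, for instance a wrong character count, can be illustrated without a triple store. The sketch below is a plain-Python analogue and not the SPARQL test suite or the validator itself: the function `find_offset_errors` and the dictionary layout are hypothetical, with keys named after the NIF properties they stand for.

```python
def find_offset_errors(context: str, annotations: dict) -> list:
    """Return (uri, message) pairs for annotations that contradict the text."""
    errors = []
    for uri, ann in annotations.items():
        begin, end, anchor = ann["beginIndex"], ann["endIndex"], ann["anchorOf"]
        if not 0 <= begin <= end <= len(context):
            errors.append((uri, "offsets outside the context string"))
        elif end - begin != len(anchor):
            errors.append((uri, "character count does not match offsets"))
        elif context[begin:end] != anchor:
            errors.append((uri, "anchorOf differs from the context substring"))
    return errors


if __name__ == "__main__":
    text = "Leipzig is a city."
    annotations = {
        "doc#char=0,7": {"beginIndex": 0, "endIndex": 7, "anchorOf": "Leipzig"},
        "doc#char=13,18": {"beginIndex": 13, "endIndex": 18, "anchorOf": "city"},
    }
    for uri, message in find_offset_errors(text, annotations):
        print(uri, message)
```

Each rule is a query for violations rather than a statement in a specification, so adding a new check does not require changing the format itself.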

NIF

Demo:
http://nlp2rdf.lod2.eu/demo.php

OLiA

http://purl.org/olia
