Entity Typing and Event Extraction
Marieke van Erp

http://mariekevanerp.com
VU WP3 team
Isa Maks, Antske Fokkens, Marieke van Erp, Piek Vossen
Entity typing
What is entity typing?
• Entity typing is the task of assigning a type to an entity mention
• An entity mention is a recognised name in a text that refers to a real-world person, location, organisation or other interesting ‘thing’
What is the added value of entity typing?
• It allows you to query for fine-grained entity types: give
me all electricians in the dataset, give me all historic
buildings
• Entity typing often includes linking an entity to
background knowledge
• The background knowledge provides additional filters:
give me all politicians born after 1900 in the dataset
• Caveat: the background knowledge is not complete
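
The two example queries above can be sketched in Python over a small in-memory list of typed entities. This is only an illustration: the records, the field names (`types`, `birth_year`) and the helper `query` are hypothetical and are not part of the tools presented here. A missing `birth_year` stands in for the caveat that background knowledge is incomplete.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Entity:
    mention: str
    types: Set[str] = field(default_factory=set)
    # None means the background knowledge has no birth year for this entity.
    birth_year: Optional[int] = None


# Invented toy records, for illustration only.
ENTITIES = [
    Entity("Jan Jansen", {"Person", "Electrician"}, 1875),
    Entity("Piet de Vries", {"Person", "Politician"}, 1912),
    Entity("Klaas Bakker", {"Person", "Politician"}, None),
    Entity("Anna Smit", {"Person", "Politician"}, 1890),
]


def query(entities, entity_type, born_after=None):
    """Return mentions of the given type, optionally filtered on birth year."""
    hits = []
    for entity in entities:
        if entity_type not in entity.types:
            continue
        if born_after is not None:
            # Entities without a known birth year cannot pass the filter.
            if entity.birth_year is None or entity.birth_year <= born_after:
                continue
        hits.append(entity.mention)
    return hits
```

Note that the politician without a birth year is silently dropped by the second kind of query, which is exactly the risk the caveat points at.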
New synonym/concept lists are easy to plug in
Brouwers:
concept-100350 (ponstypiste)
isRelatedTo class-Schrijfkunst
concept-100343 (tachygraaf)
isRelatedTo class-Schrijfkunst
concept-100313 (schrijver)
isRelatedTo class-Schrijfkunst
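
A minimal sketch of what "easy to plug in" could look like: each concept list is a plain mapping from a term to its concept identifier and related class, and the tagger accepts any number of such lists. The three entries are copied from the slide; the dictionary layout and the function `tag_concepts` are assumptions made for illustration, not the actual CLARIAH code.

```python
# Entries taken from the Brouwers example; the dict layout is an assumption.
BROUWERS = {
    "ponstypiste": ("concept-100350", "class-Schrijfkunst"),
    "tachygraaf": ("concept-100343", "class-Schrijfkunst"),
    "schrijver": ("concept-100313", "class-Schrijfkunst"),
}


def tag_concepts(text, *lexicons):
    """Return (token, concept_id, related_class) for tokens found in any lexicon."""
    hits = []
    for raw_token in text.lower().split():
        token = raw_token.strip(".,;:!?()")
        for lexicon in lexicons:
            if token in lexicon:
                concept_id, related_class = lexicon[token]
                hits.append((token, concept_id, related_class))
    return hits
```

Plugging in another list (for example one derived from HISCO) would then amount to passing one more dictionary to `tag_concepts`.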
Named Entity Recognition & Linking
• We are creating links between HISCO and Brouwers
• We are building on entity and concept linkers that can recognise
concepts from HISCO and Brouwers in texts
• We are developing a new general-purpose entity linker that allows for
use of datasets other than DBpedia and is less sensitive to general
entity popularity
• Discovering more about Dark and NIL entities is also ongoing work
(cf. Van Erp & Vossen (2016) Entity Typing using Distributional
Semantics and DBpedia. To appear in: Proceedings of the 4th
NLP&DBpedia workshop. Kobe, Japan 18 October 2016)
Event Extraction
• Event Extraction is the task of recognising and classifying
mentions of ‘things that happen’ in text
• Events are multifaceted: they take place at a certain time
and place and have participants involved
• By recognising participants, times and places, we can
generate event descriptions and compare events
From words to concepts
• Linking terms to synonyms to obtain a higher level of
abstraction
• Word-sense disambiguation + WordNet + Multilingual
Central Repository + FrameNet + PropBank
• Stop, quit, leave, relinquish, bow out -> all linked to the
concept wn:leave_office
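
The abstraction step can be sketched as a lookup from surface terms to a shared concept. The terms and the concept label `wn:leave_office` come from the slide; the table `CONCEPT_OF` and the function `to_concept` are hypothetical. A real pipeline would first run word-sense disambiguation, since words such as "stop" and "leave" have other senses as well; this sketch skips that step.

```python
# Term-to-concept table built from the slide's example (layout is an assumption).
CONCEPT_OF = {
    "stop": "wn:leave_office",
    "quit": "wn:leave_office",
    "leave": "wn:leave_office",
    "relinquish": "wn:leave_office",
    "bow out": "wn:leave_office",
}


def to_concept(term):
    """Map a (possibly multi-word) term to its concept, or None if unknown."""
    normalised = " ".join(term.lower().split())
    return CONCEPT_OF.get(normalised)
```

Searching on the concept rather than on one verb then retrieves all five variants at once.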
Why link to WordNet/ConceptNet/etc?
• It allows you to query for types rather than instances: give
me all lawsuits in the dataset
• In the context of CLARIAH, we are converting various
diachronic lexicons to Linked Data to:
• integrate resources
• tag interesting concepts in text
• query expansion
Semantic Role Labelling
• Detecting the agent, patient, recipient and theme of the
predicate in a sentence
• Mary sold the book to John
• Agent: Mary
• Recipient: John
• Theme: the book
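
To make the output format concrete, the sketch below produces the same role assignment for the example sentence. It is a toy: a single regular expression for the pattern "X sold Y to Z", not a semantic role labeller, and the names `PATTERN` and `label_roles` are hypothetical.

```python
import re

# Toy pattern for "<agent> sold <theme> to <recipient>"; real SRL systems
# use syntactic analysis and trained models instead of one fixed pattern.
PATTERN = re.compile(
    r"^(?P<agent>.+?) sold (?P<theme>.+?) to (?P<recipient>.+?)\.?$"
)


def label_roles(sentence):
    """Return a role dictionary for the toy pattern, or None if it does not match."""
    match = PATTERN.match(sentence.strip())
    if match is None:
        return None
    return {
        "Agent": match.group("agent"),
        "Theme": match.group("theme"),
        "Recipient": match.group("recipient"),
    }
```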
Figure: one event instance built from two news reports
• Event12: buy/sell, type fn:Commerce_money_transfer
• Roles: fn:Seller, fn:Buyer, fn:Goods, fn:Money
• Fillers: dbp:QatarHolding, dbp:Porsche_family, 10% stake, ?Entity23
• sem:hasTime: 2013-06-17
• Source (http://english.alarabiya.net): Qatar Holding sells 10% stake in Porsche to founding families
• Source (http://www.telegraph.co.uk): Porsche family buys back 10pc stake from Qatar
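
The idea behind the figure, two reports that describe one event from different perspectives, can be sketched as grouping event mentions that agree on frame, participants and time. The labels are taken from the figure; the assignment of fillers to roles is a reading of the two headlines, and the class `EventMention` and the functions `same_event` and `group_events` are hypothetical, not the NewsReader implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventMention:
    source: str
    frame: str
    seller: str
    buyer: str
    goods: str  # surface form, may differ between sources
    time: str


# Role assignment below is an interpretation of the two headlines.
MENTIONS = [
    EventMention("http://english.alarabiya.net", "fn:Commerce_money_transfer",
                 "dbp:QatarHolding", "dbp:Porsche_family", "10% stake", "2013-06-17"),
    EventMention("http://www.telegraph.co.uk", "fn:Commerce_money_transfer",
                 "dbp:QatarHolding", "dbp:Porsche_family", "10pc stake", "2013-06-17"),
]


def same_event(a, b):
    """Two mentions co-refer if frame, linked participants and time agree."""
    return (a.frame, a.seller, a.buyer, a.time) == (b.frame, b.seller, b.buyer, b.time)


def group_events(mentions):
    """Greedily group mentions that describe the same event."""
    groups = []
    for mention in mentions:
        for group in groups:
            if same_event(group[0], mention):
                group.append(mention)
                break
        else:
            groups.append([mention])
    return groups
```

The comparison deliberately ignores the surface form of the goods ("10% stake" versus "10pc stake") and relies on the linked participants and the date instead.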
Event abstractions
• Enable searches such as: Give me all lawsuits in which a
politician was involved between 1990 and 2000.
• Current developments: expand resources to the historic
domain, devise new crystallisation strategies for
aggregating event information
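
The example search combines three abstractions: an event type, a participant type and a time range. A minimal sketch over invented toy events follows; the class `Event`, its fields and the function `find_events` are hypothetical names used only to show how the three filters combine.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet


@dataclass(frozen=True)
class Event:
    event_id: str
    event_type: str
    participant_types: FrozenSet[str]
    when: date


# Invented toy events, for illustration only.
EVENTS = [
    Event("ev1", "Lawsuit", frozenset({"Politician", "Company"}), date(1994, 3, 1)),
    Event("ev2", "Lawsuit", frozenset({"Company"}), date(1996, 7, 15)),
    Event("ev3", "Lawsuit", frozenset({"Politician"}), date(2004, 1, 9)),
    Event("ev4", "Election", frozenset({"Politician"}), date(1998, 5, 6)),
]


def find_events(events, event_type, participant_type, start_year, end_year):
    """Return ids of events matching type, participant type and year range (inclusive)."""
    return [
        e.event_id
        for e in events
        if e.event_type == event_type
        and participant_type in e.participant_types
        and start_year <= e.when.year <= end_year
    ]
```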
Find out more
• All modules and evaluations are described in: http://kyoto.let.vu.nl/newsreader_deliverables/NWR-D4-2-3.pdf (158 pages!)
• Selection to be adapted within CLARIAH: https://github.com/CLARIAH/wp3-semantic-parsing-Dutch
• New developments: http://www.clariah.nl & https://github.com/clariah
Discussion
• It’s research software (no fancy interface)
• Currently not adapted to deal with old spelling variants, OCR errors, etc.
• NLP isn’t perfect (but humans don’t always agree either!)
• What would it take for you to start using such tools?
• What types of analyses are most interesting to the community?
• What use cases are most useful to the community at this point
in time?
