Mapping Tweets to Conference Talks: A
Goldmine for Semantics
Milan Stankovic, Hypios, Paris-Sorbonne, FR & Matthew Rowe, KMI, Open University, UK
On Conference We Tweet
Is there a Correspondence?
Why?
[Diagram 1: a tweet "is about" a talk.]
[Diagram 2: a user made a tweet; the tweet is about a talk; the talk has topic Topic 1, Topic 2 and Topic 3.]
[Diagram 3: as in diagram 2, with the added question "interest ?" for the user.]
[Diagram 4: two users each made a tweet about the same talk; were they at the same talk?]
Potential Benefits
• Digital memory
• Conference feedback
– number of tweets for a talk
– conversational aspects
– sentiment analysis
• User profiling and expert finding
• Trending topics
Rich Activity Twitter Event Data
• We take Twitter archives from
TwapperKeeper
• We enrich Tweets with relevant DBPedia
concepts using Zemanta
• We rely on existing Linked Data about talks to
perform the mappings.
ESWC Dataset
• Collected during the Extended Semantic Web
Conference 2010
– Any tweets tagged with “eswc”
• 1082 tweets
• 213 tweets enriched with concepts
Aligning Tweets with Talks
• Goal: Label tweets with talks
• Method:
– Induce a labelling function to perform alignment
– Labelled data = events from Web of Data: $\{(x_i, y_i)\}_{i=1}^{L}$
– Unlabelled data = tweets: $\{x_i\}_{i=1}^{U}$
– Labelling function: $f : X \to Y$
Aligning Tweets with Talks
1. Feature Extraction:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix swrc: <http://swrc.ontoware.org/ontology#> .
@prefix swc: <http://data.semanticweb.org/ns/swc/ontology#> .
@prefix dog: <http://data.semanticweb.org> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

# The talk (paper) resource
<http://data.semanticweb.org/conference/eswc/2010/paper/phd_symposium/23> rdf:type swrc:InProceedings ;
    dc:subject "Knowledge Acquisition" ;
    dc:subject "Semantic Analysis" ;
    dc:subject "Social Web" ;
    dc:subject "Microblogs" ;
    dc:title "Exploring the Wisdom of the Tweets: Knowledge Acquisition from Social Awareness Streams" ;
    swrc:abstract "Although one might argue that little wisdom can be conveyed in messages of 140 ..." ;
    swrc:author <http://data.semanticweb.org/person/claudia-wagner> .

# The author resource, one step away from the talk
<http://data.semanticweb.org/person/claudia-wagner> rdf:type foaf:Person ;
    foaf:name "Claudia Wagner" ;
    swrc:affiliation <http://data.semanticweb.org/organization/joanneum-research> ;
    foaf:based_near <http://dbpedia.org/resource/Austria> .
Aligning Tweets with Talks
1. Feature Extraction: F1 - Immediate Resource
Leaves
Knowledge Acquisition
Semantic Analysis
Social Web
Microblogs
Exploring the Wisdom of the Tweets: Knowledge Acquisition from Social Awareness Streams
Although one might argue that little wisdom can be conveyed in messages of 140 ...
http://data.semanticweb.org/person/claudia-wagner
Aligning Tweets with Talks
1. Feature Extraction: F2 – 1-step Resource
Leaves
http://data.semanticweb.org/person/claudia-wagner
Claudia Wagner
http://data.semanticweb.org/organization/joanneum-research
http://dbpedia.org/resource/Austria
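The F1 and F2 feature sets can be illustrated with a minimal sketch over plain triples. The triples are transcribed from the Turtle example of the talk; the function names `immediate_leaves` and `one_step_leaves` are hypothetical, and skipping `rdf:type` is an assumption inferred from the leaves listed on the slides, not a rule stated by the authors.

```python
# Triples transcribed from the talk's RDF description (abstract truncated as in the source).
PAPER = "http://data.semanticweb.org/conference/eswc/2010/paper/phd_symposium/23"
AUTHOR = "http://data.semanticweb.org/person/claudia-wagner"
ORG = "http://data.semanticweb.org/organization/joanneum-research"
AUSTRIA = "http://dbpedia.org/resource/Austria"

TRIPLES = [
    (PAPER, "rdf:type", "swrc:InProceedings"),
    (PAPER, "dc:subject", "Knowledge Acquisition"),
    (PAPER, "dc:subject", "Semantic Analysis"),
    (PAPER, "dc:subject", "Social Web"),
    (PAPER, "dc:subject", "Microblogs"),
    (PAPER, "dc:title", "Exploring the Wisdom of the Tweets: "
                        "Knowledge Acquisition from Social Awareness Streams"),
    (PAPER, "swrc:abstract", "Although one might argue that little wisdom "
                             "can be conveyed in messages of 140 ..."),
    (PAPER, "swrc:author", AUTHOR),
    (AUTHOR, "rdf:type", "foaf:Person"),
    (AUTHOR, "foaf:name", "Claudia Wagner"),
    (AUTHOR, "swrc:affiliation", ORG),
    (AUTHOR, "foaf:based_near", AUSTRIA),
]


def immediate_leaves(triples, resource, skip=("rdf:type",)):
    """F1 (sketch): objects attached directly to the resource.

    Assumption: rdf:type statements are skipped, as on the slides.
    """
    return [o for s, p, o in triples if s == resource and p not in skip]


def one_step_leaves(triples, resource, skip=("rdf:type",)):
    """F2 (sketch): each linked resource plus the leaves one step beyond it."""
    subjects = {s for s, _, _ in triples}
    leaves = []
    for obj in immediate_leaves(triples, resource, skip):
        if obj in subjects:  # the object is itself a described resource
            leaves.append(obj)
            leaves.extend(immediate_leaves(triples, obj, skip))
    return leaves


f1 = immediate_leaves(TRIPLES, PAPER)
f2 = one_step_leaves(TRIPLES, PAPER)
```

Here `f1` holds the four subjects, the title, the abstract and the author URI, while `f2` holds the author URI followed by the author's name, affiliation and location.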
Aligning Tweets with Talks
1. Feature Extraction: F3 – DBPedia Concepts
http://dbpedia.org/resource/Twitter
http://dbpedia.org/resource/Social_Web
http://dbpedia.org/resource/Microblogs
Aligning Tweets with Talks
2. Feature Vector Composition
Input text (F1 leaves):
Knowledge Acquisition Semantic Analysis Social Web Microblogs Exploring the Wisdom of the Tweets: Knowledge Acquisition from Social Awareness Streams Although one might argue that little wisdom can be conveyed in messages of 140 http://data.semanticweb.org/person/claudia-wagner
Extracted terms:
knowledge, acquisition, semantic, analysis, social, web, microblogs, exploring, wisdom, tweets, knowledge, acquisition, social, awareness, streams, wisdom, messages
Indexer output (term, count):
knowledge 2
acquisition 2
semantic 1
analysis 1
social 2
web 1
microblogs 1
exploring 1
wisdom 2
tweets 1
awareness 1
streams 1
messages 1
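The indexing step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tokenizer (lowercase alphabetic runs) and the tiny stop list are hypothetical choices, and the input is limited to the subjects and title of the example talk, so the counts differ slightly from the full slide example.

```python
import re
from collections import Counter

# Hypothetical stop list; the slides do not say which terms the indexer drops.
STOP_WORDS = {"the", "of", "from"}


def term_counts(texts, stop_words=STOP_WORDS):
    """Count lowercased alphabetic tokens across all texts, skipping stop words."""
    counts = Counter()
    for text in texts:
        for token in re.findall(r"[a-z]+", text.lower()):
            if token not in stop_words:
                counts[token] += 1
    return counts


def to_vector(counts, vocabulary):
    """Project term counts onto a fixed, ordered vocabulary."""
    return [counts.get(term, 0) for term in vocabulary]


leaves = [
    "Knowledge Acquisition",
    "Semantic Analysis",
    "Social Web",
    "Microblogs",
    "Exploring the Wisdom of the Tweets: "
    "Knowledge Acquisition from Social Awareness Streams",
]
counts = term_counts(leaves)
vocabulary = sorted(counts)
vector = to_vector(counts, vocabulary)
```

Tweets and talks must be projected onto the same vocabulary so that their vectors are comparable position by position.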
Aligning Tweets with Talks
3. Inducing the Labelling Function
– Both tweets and events are provided as feature vectors
– Induce a labelling function $f : X \to Y$: choose the most likely event (y) given the tweet (x)
Aligning Tweets with Talks
3. Inducing the Labelling Function: Proximity-
based Clustering
– Build a centroid vector for each event
• From event feature vectors
– Compare each tweet vector with each centroid
• Choose event (y) which is closest
$$\hat{y} = \arg\min_{y \in Y} d(x, \mu_y)$$
$$\mathrm{manhat}(x, \mu) = \sum_{i=1}^{n} |x_i - \mu_i|$$
$$\mathrm{eucl}(x, \mu) = \sqrt{\sum_{i=1}^{n} (x_i - \mu_i)^2}$$
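A minimal sketch of the proximity-based approach, assuming tweets and events are already term-count vectors over a shared vocabulary. The vectors and the labels `talk_a` and `talk_b` are made up for illustration and are not taken from the ESWC data.

```python
import math


def manhattan(x, mu):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(x, mu))


def euclidean(x, mu):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, mu)))


def centroid(vectors):
    """Component-wise mean of an event's feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def nearest_event(x, centroids, distance):
    """Return the event label whose centroid is closest to the tweet vector x."""
    return min(centroids, key=lambda y: distance(x, centroids[y]))


# Hypothetical event feature vectors (three-term vocabulary).
event_vectors = {
    "talk_a": [[2, 0, 1], [2, 2, 1]],
    "talk_b": [[0, 3, 0]],
}
centroids = {label: centroid(vs) for label, vs in event_vectors.items()}

tweet = [1, 0, 1]
by_manhattan = nearest_event(tweet, centroids, manhattan)
by_euclidean = nearest_event(tweet, centroids, euclidean)
```

Passing the distance function as a parameter keeps the comparison of Manhattan and Euclidean distance to a one-argument change.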
Aligning Tweets with Talks
3. Inducing the Labelling Function: Naive Bayes
Classification
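– Assigns the most probable event label given the tweet features:
$$\hat{y} = \arg\max_{y \in Y} P(y \mid x_1, x_2, \ldots, x_n)$$
– Using Bayes' Theorem, and assuming the features are conditionally independent given the label, we write this as:
$$\begin{aligned}
\hat{y} &= \arg\max_{y \in Y} \frac{P(x_1, x_2, \ldots, x_n \mid y)\, P(y)}{P(x_1, x_2, \ldots, x_n)} \\
        &= \arg\max_{y \in Y} P(x_1, x_2, \ldots, x_n \mid y)\, P(y) \\
        &= \arg\max_{y \in Y} P(y) \prod_{i} P(x_i \mid y)
\end{aligned}$$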
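The Naive Bayes decision rule can be sketched as follows. This is a multinomial variant with add-one (Laplace) smoothing and log-probabilities; both choices are assumptions made for the sketch, since the slides do not state how the probabilities were estimated. The training examples and labels are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train(labelled):
    """Collect per-label document and term counts from (tokens, label) pairs."""
    label_counts = Counter()
    term_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, label in labelled:
        label_counts[label] += 1
        term_counts[label].update(tokens)
        vocabulary.update(tokens)
    return label_counts, term_counts, vocabulary


def classify(tokens, model):
    """Return argmax_y of log P(y) + sum_i log P(x_i | y)."""
    label_counts, term_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_docs)
        # Add-one smoothing (assumption): unseen terms keep a non-zero probability.
        denominator = sum(term_counts[label].values()) + len(vocabulary)
        for token in tokens:
            if token in vocabulary:  # ignore terms never seen in training
                score += math.log((term_counts[label][token] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labelled events.
model = train([
    (["knowledge", "acquisition", "microblogs", "tweets"], "talk_a"),
    (["ontology", "reasoning", "owl"], "talk_b"),
])
label_1 = classify(["tweets", "microblogs"], model)
label_2 = classify(["owl"], model)
```

Working in log space avoids numeric underflow when many small probabilities are multiplied.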
Experiments
• Dataset
– Corpus of Tweets collected during ESWC 2010
• Gold Standard Construction
– Used 3 raters to label a portion of the tweet corpus
• 200 tweets labelled
– Measured inter-rater agreement
• Using the Kappa statistic
– Initial agreement was too low: 0.328
– Used the Delphi method to improve agreement
– Second round of labelling produced: 0.820
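The slides say only that a Kappa statistic was used. As an illustration of how agreement among three raters might be computed, the sketch below implements Fleiss' kappa; the choice of Fleiss' variant and the toy ratings are assumptions, not details from the study.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list with one label per rater.

    Assumes every item has the same number of raters and that at least
    two different labels occur overall.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    category_totals = Counter()
    agreement_sum = 0.0
    for item in ratings:
        counts = Counter(item)
        category_totals.update(counts)
        # Proportion of agreeing rater pairs for this item.
        agreement_sum += (sum(c * c for c in counts.values()) - n_raters) / (
            n_raters * (n_raters - 1)
        )
    observed = agreement_sum / n_items
    expected = sum(
        (total / (n_items * n_raters)) ** 2 for total in category_totals.values()
    )
    return (observed - expected) / (1 - expected)


# Hypothetical ratings: each inner list holds the labels given by 3 raters.
perfect = fleiss_kappa([["talk_a"] * 3, ["talk_b"] * 3])
mixed = fleiss_kappa([["talk_a", "talk_a", "talk_b"],
                      ["talk_b", "talk_b", "talk_a"]])
```

A value of 1 indicates perfect agreement, and values at or below 0 indicate agreement no better than chance.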
Experiments
• Evaluation Measures
– Precision: proportion of event tweets correctly labelled
– Recall: proportion of tweets successfully returned for an event
– F-measure: weighted harmonic mean of precision and recall
• Placed emphasis on precision over recall
$$F_\beta = \frac{(1 + \beta^2) \times P \times R}{\beta^2 \times P + R}, \qquad \beta \in \{0.2, 0.5, 1\}$$
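To see how the choice of β shifts the balance toward precision, here is a small sketch of the F-measure. The precision and recall values are made up for illustration and are not results from the study.

```python
def f_measure(precision, recall, beta):
    """Weighted harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Hypothetical scores: high precision, lower recall.
precision, recall = 0.8, 0.4
scores = {beta: f_measure(precision, recall, beta) for beta in (0.2, 0.5, 1)}
```

With β below 1 the score moves closer to the precision value, which matches the stated emphasis on precision over recall.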
Results
Imagine…
Imagine user profiling
[Figure: ESWC dataset, user Matthew Rowe]
Imagine conference feedback
[Figure: ESWC dataset; panels: directly from Tweets, from mappings (Talks)]
We Challenge You
We Challenge You!
• Beat us in mappings!
• We provide the human-generated gold standard mappings
• Can you find a more precise way to do tweet-
talk mappings?
• Can you find other uses? Let us know!
We Challenge You!
• You can find the gold standard data here:
http://research.hypios.com/?page_id=131
• You can find all the data (and automated mappings) here:
http://data.hypios.com/tweets/sparql
We Challenge You!
http://data.hypios.com/tweets/sparql
SELECT ?tweet ?talk WHERE {
?tweet <http://linkedevents.org/ontology/illustrate> ?talk.
}
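The query above could be sent to the endpoint with a plain HTTP GET, following the standard SPARQL protocol convention of a `query` parameter. The sketch below only builds the request URL and does not send it; whether the endpoint is still online, and which result formats it returns, are not known from the slides.

```python
from urllib.parse import urlencode

ENDPOINT = "http://data.hypios.com/tweets/sparql"

QUERY = """SELECT ?tweet ?talk WHERE {
  ?tweet <http://linkedevents.org/ontology/illustrate> ?talk .
}"""


def sparql_get_url(endpoint, query):
    """Build a GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_get_url(ENDPOINT, QUERY)
```

The resulting URL can then be fetched with any HTTP client, for example `urllib.request.urlopen(url)`.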
brought to you by
milan.stankovic@hypios.com & M.C.Rowe@open.ac.uk
November 2010, Shanghaï, China

More Related Content

What's hot

What's hot (9)

Threat Hunting with Splunk
Threat Hunting with Splunk Threat Hunting with Splunk
Threat Hunting with Splunk
 
Supraja_SMS_presentation
Supraja_SMS_presentationSupraja_SMS_presentation
Supraja_SMS_presentation
 
[HITCON 2020 CTI Village] Threat Hunting and Campaign Tracking Workshop.pptx
[HITCON 2020 CTI Village] Threat Hunting and Campaign Tracking Workshop.pptx[HITCON 2020 CTI Village] Threat Hunting and Campaign Tracking Workshop.pptx
[HITCON 2020 CTI Village] Threat Hunting and Campaign Tracking Workshop.pptx
 
Infrastructure Tracking with Passive Monitoring and Active Probing: ShmooCon ...
Infrastructure Tracking with Passive Monitoring and Active Probing: ShmooCon ...Infrastructure Tracking with Passive Monitoring and Active Probing: ShmooCon ...
Infrastructure Tracking with Passive Monitoring and Active Probing: ShmooCon ...
 
Tcpdump hunter
Tcpdump hunterTcpdump hunter
Tcpdump hunter
 
OSINT tools for security auditing with python
OSINT tools for security auditing with pythonOSINT tools for security auditing with python
OSINT tools for security auditing with python
 
BDACA1516s2 - Lecture5
BDACA1516s2 - Lecture5BDACA1516s2 - Lecture5
BDACA1516s2 - Lecture5
 
Malware Static Analysis
Malware Static AnalysisMalware Static Analysis
Malware Static Analysis
 
Early Detection of Malicious Flux Networks via Large Scale Passive DNS Traffi...
Early Detection of Malicious Flux Networks via Large Scale Passive DNS Traffi...Early Detection of Malicious Flux Networks via Large Scale Passive DNS Traffi...
Early Detection of Malicious Flux Networks via Large Scale Passive DNS Traffi...
 

Viewers also liked

Envoy Presentation Full Length Plus Retail Process
Envoy Presentation Full Length Plus Retail ProcessEnvoy Presentation Full Length Plus Retail Process
Envoy Presentation Full Length Plus Retail Process
wijrwsr
 
Rabies Virus
Rabies VirusRabies Virus
Rabies Virus
Dikshan
 
MAD ./ \. DOG
MAD ./ \. DOGMAD ./ \. DOG
MAD ./ \. DOG
Dikshan
 
Evaluation Slide Show
Evaluation Slide ShowEvaluation Slide Show
Evaluation Slide Show
gemmibearrox
 
rabies 2
rabies 2rabies 2
rabies 2
Dikshan
 
Envoy Presentation Full Length Plus Retail Process
Envoy Presentation Full Length Plus Retail ProcessEnvoy Presentation Full Length Plus Retail Process
Envoy Presentation Full Length Plus Retail Process
wijrwsr
 
FOAF+SSL: More then Authentication
FOAF+SSL: More then AuthenticationFOAF+SSL: More then Authentication
FOAF+SSL: More then Authentication
Milan Stankovic
 
Finding Co-solvers on Twitter, with the Little Help from Linked Data
Finding Co-solvers on Twitter, with the Little Help from Linked DataFinding Co-solvers on Twitter, with the Little Help from Linked Data
Finding Co-solvers on Twitter, with the Little Help from Linked Data
Milan Stankovic
 
Translate Subtitles
Translate SubtitlesTranslate Subtitles
Translate Subtitles
guest78ba8c
 
How does your media project represent particualr social groups?
How does your media project represent particualr social groups?How does your media project represent particualr social groups?
How does your media project represent particualr social groups?
gemmibearrox
 
Accessibility U 1237927961698 S U
Accessibility U 1237927961698 S UAccessibility U 1237927961698 S U
Accessibility U 1237927961698 S U
guest45d56
 
Gtdessentialsles320090324.Key U 1237929710943 B U
Gtdessentialsles320090324.Key U 1237929710943 B UGtdessentialsles320090324.Key U 1237929710943 B U
Gtdessentialsles320090324.Key U 1237929710943 B U
guest45d56
 

Viewers also liked (20)

gs0703
gs0703gs0703
gs0703
 
Istc 655 Chapter 7 Ppt
Istc 655 Chapter 7 PptIstc 655 Chapter 7 Ppt
Istc 655 Chapter 7 Ppt
 
Envoy Presentation Full Length Plus Retail Process
Envoy Presentation Full Length Plus Retail ProcessEnvoy Presentation Full Length Plus Retail Process
Envoy Presentation Full Length Plus Retail Process
 
Rabies Virus
Rabies VirusRabies Virus
Rabies Virus
 
MAD ./ \. DOG
MAD ./ \. DOGMAD ./ \. DOG
MAD ./ \. DOG
 
Evaluation Slide Show
Evaluation Slide ShowEvaluation Slide Show
Evaluation Slide Show
 
rabies
rabiesrabies
rabies
 
rabies 2
rabies 2rabies 2
rabies 2
 
Envoy Presentation Full Length Plus Retail Process
Envoy Presentation Full Length Plus Retail ProcessEnvoy Presentation Full Length Plus Retail Process
Envoy Presentation Full Length Plus Retail Process
 
FOAF+SSL: More then Authentication
FOAF+SSL: More then AuthenticationFOAF+SSL: More then Authentication
FOAF+SSL: More then Authentication
 
Finding Co-solvers on Twitter, with the Little Help from Linked Data
Finding Co-solvers on Twitter, with the Little Help from Linked DataFinding Co-solvers on Twitter, with the Little Help from Linked Data
Finding Co-solvers on Twitter, with the Little Help from Linked Data
 
Open Innovation and Semantic Web
Open Innovation and Semantic WebOpen Innovation and Semantic Web
Open Innovation and Semantic Web
 
Semantic Web In Practice
Semantic Web In PracticeSemantic Web In Practice
Semantic Web In Practice
 
Faceted Online Presence
Faceted Online PresenceFaceted Online Presence
Faceted Online Presence
 
Looking for Experts? What can Linked Data do for You?
Looking for Experts? What can Linked Data do for You?Looking for Experts? What can Linked Data do for You?
Looking for Experts? What can Linked Data do for You?
 
Translate Subtitles
Translate SubtitlesTranslate Subtitles
Translate Subtitles
 
Online Presence
Online PresenceOnline Presence
Online Presence
 
How does your media project represent particualr social groups?
How does your media project represent particualr social groups?How does your media project represent particualr social groups?
How does your media project represent particualr social groups?
 
Accessibility U 1237927961698 S U
Accessibility U 1237927961698 S UAccessibility U 1237927961698 S U
Accessibility U 1237927961698 S U
 
Gtdessentialsles320090324.Key U 1237929710943 B U
Gtdessentialsles320090324.Key U 1237929710943 B UGtdessentialsles320090324.Key U 1237929710943 B U
Gtdessentialsles320090324.Key U 1237929710943 B U
 

Similar to Mapping Tweets to Conference Talks: A Goldmine for Semantics

Modelling the Web: Examples of Modelling Text, Knowledge Networks and Physica...
Modelling the Web: Examples of Modelling Text, Knowledge Networks and Physica...Modelling the Web: Examples of Modelling Text, Knowledge Networks and Physica...
Modelling the Web: Examples of Modelling Text, Knowledge Networks and Physica...
Steffen Staab
 

Similar to Mapping Tweets to Conference Talks: A Goldmine for Semantics (20)

apidays LIVE Australia 2021 - Tracing across your distributed process boundar...
apidays LIVE Australia 2021 - Tracing across your distributed process boundar...apidays LIVE Australia 2021 - Tracing across your distributed process boundar...
apidays LIVE Australia 2021 - Tracing across your distributed process boundar...
 
Facilitating Data Curation: a Solution Developed in the Toxicology Domain
Facilitating Data Curation: a Solution Developed in the Toxicology DomainFacilitating Data Curation: a Solution Developed in the Toxicology Domain
Facilitating Data Curation: a Solution Developed in the Toxicology Domain
 
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSABetter Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
Better Hackathon 2020 - Fraunhofer IAIS - Semantic geo-clustering with SANSA
 
Patterns of Streaming Applications
Patterns of Streaming ApplicationsPatterns of Streaming Applications
Patterns of Streaming Applications
 
Semantic Support for Complex Ecosystem Research Environments
Semantic Support for Complex Ecosystem Research EnvironmentsSemantic Support for Complex Ecosystem Research Environments
Semantic Support for Complex Ecosystem Research Environments
 
Unleashing Twitter Data for Fun and Insight
Unleashing Twitter Data for Fun and InsightUnleashing Twitter Data for Fun and Insight
Unleashing Twitter Data for Fun and Insight
 
Unleashing twitter data for fun and insight
Unleashing twitter data for fun and insightUnleashing twitter data for fun and insight
Unleashing twitter data for fun and insight
 
Apache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - VerisignApache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - Verisign
 
Integrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applicationsIntegrate Solr with real-time stream processing applications
Integrate Solr with real-time stream processing applications
 
Information Extraction from Text, presented @ Deloitte
Information Extraction from Text, presented @ DeloitteInformation Extraction from Text, presented @ Deloitte
Information Extraction from Text, presented @ Deloitte
 
The nature.com ontologies portal: nature.com/ontologies
The nature.com ontologies portal: nature.com/ontologiesThe nature.com ontologies portal: nature.com/ontologies
The nature.com ontologies portal: nature.com/ontologies
 
Filtering From the Firehose: Real Time Social Media Streaming
Filtering From the Firehose: Real Time Social Media StreamingFiltering From the Firehose: Real Time Social Media Streaming
Filtering From the Firehose: Real Time Social Media Streaming
 
Cassandra Day Denver 2014: Using Cassandra to Support Crisis Informatics Rese...
Cassandra Day Denver 2014: Using Cassandra to Support Crisis Informatics Rese...Cassandra Day Denver 2014: Using Cassandra to Support Crisis Informatics Rese...
Cassandra Day Denver 2014: Using Cassandra to Support Crisis Informatics Rese...
 
30° Nexa Lunch Seminar - Linked Data Platform vs real world
30° Nexa Lunch Seminar - Linked Data Platform vs real world30° Nexa Lunch Seminar - Linked Data Platform vs real world
30° Nexa Lunch Seminar - Linked Data Platform vs real world
 
AAT LOD Microthesauri
AAT LOD MicrothesauriAAT LOD Microthesauri
AAT LOD Microthesauri
 
Recommending Semantic Nearest Neighbors Using Storm and Dato
Recommending Semantic Nearest Neighbors Using Storm and DatoRecommending Semantic Nearest Neighbors Using Storm and Dato
Recommending Semantic Nearest Neighbors Using Storm and Dato
 
Modelling the Web: Examples of Modelling Text, Knowledge Networks and Physica...
Modelling the Web: Examples of Modelling Text, Knowledge Networks and Physica...Modelling the Web: Examples of Modelling Text, Knowledge Networks and Physica...
Modelling the Web: Examples of Modelling Text, Knowledge Networks and Physica...
 
Rob Davidson at the G3 Workshop: Open Source - Tools for Reproducibility
Rob Davidson at the G3 Workshop: Open Source - Tools for ReproducibilityRob Davidson at the G3 Workshop: Open Source - Tools for Reproducibility
Rob Davidson at the G3 Workshop: Open Source - Tools for Reproducibility
 
Data to Insight in a Flash: Introduction to Real-Time Analytics with WSO2 Com...
Data to Insight in a Flash: Introduction to Real-Time Analytics with WSO2 Com...Data to Insight in a Flash: Introduction to Real-Time Analytics with WSO2 Com...
Data to Insight in a Flash: Introduction to Real-Time Analytics with WSO2 Com...
 
myExperiment @ Nettab
myExperiment @ NettabmyExperiment @ Nettab
myExperiment @ Nettab
 

Recently uploaded

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Recently uploaded (20)

Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 

Mapping Tweets to Conference Talks: A Goldmine for Semantics

  • 1. Mapping Tweets to Conference Talks: A Goldmine for Semantics Milan Stankovic, Hypios, Paris-Sorbonne, FR & Matthew Rowe, KMI, Open University, UK
  • 3. Is there a Correspondance? ?
  • 5. Why? tweettweet talktalk is about Topic 3 Topic 2 Topic 1 has topic has topic has topic useruser made
  • 6. Why? tweettweet talktalk is about Topic 3 Topic 2 Topic 1 has topic has topic has topic useruser made interest ?
  • 7. Why? tweettweet talktalk is about useruser made were at the same talk ? tweettweet is about useruser made
  • 8. Potential Benefits • Digital memory • Conference feedback – number of tweets for a talk – conversational aspects – sentiment analysis • User profiling and expert finding • Trending topics
  • 9. Rich Activity Twitter Event Data • We take Twitter archives from TwapperKeeper • We enrich Tweets with relevant DBPedia concepts using Zemanta • We rely on existing Linked Data about talks to perform the mappings.
  • 10. ESWC Dataset • Collected during the Extended Semantic Web Conference 2010 – Any tweets tagged with “eswc” • 1082 tweets • 213 tweets enriched with concepts
  • 11. Aligning Tweets with Talks • Goal: Label tweets with talks • Method: – Induce a labelling function to perform alignment – Labelled data = events from Web of Data – Unlabelled data = tweets ( ){ }L iii yx 1 , = ( ){ }U iix 1= YXf →:
  • 12. Aligning Tweets with Talks 1. Feature Extraction: @prefix swrc: <http://swrc.ontoware.org/ontology#> @prefix swc: <http://data.semanticweb.org/ns/swc/ontology#> @prefix dog: <http://data.semanticweb.org> @prefix dc: <http://purl.org/dc/elements/1.1/> <http://data.semanticweb.org/conference/eswc/2010/paper/phd_symposium/23> rdf:type swrc:InProceedings ; dc:subject "Knowledge Acquisition" ; dc:subject "Semantic Analysis" ; dc:subject "Social Web" ; dc:subject "Microblogs" ; dc:title "Exploring the Wisdom of the Tweets: Knowledge Acquisition from Social Awareness Streams" ; swrc:abstract "Although one might argue that little wisdom can be conveyed in messages of 140 ..." ; swrc:author <http://data.semanticweb.org/person/claudia-wagner> . <http://data.semanticweb.org/person/claudia-wagner> rdf:type foaf:Person ; foaf:name "Claudia Wagner" ; swrc:affiliation <http://data.semanticweb.org/organization/joanneum-research> ; foaf:based_near <http://dbpedia.org/resource/Austria>
  • 13. Aligning Tweets with Talks 1. Feature Extraction: F1 - Immediate Resource Leaves @prefix swrc: <http://swrc.ontoware.org/ontology#> @prefix swc: <http://data.semanticweb.org/ns/swc/ontology#> @prefix dog: <http://data.semanticweb.org> @prefix dc: <http://purl.org/dc/elements/1.1/> <http://data.semanticweb.org/conference/eswc/2010/paper/phd_symposium/23> rdf:type swrc:InProceedings ; dc:subject "Knowledge Acquisition" ; dc:subject "Semantic Analysis" ; dc:subject "Social Web" ; dc:subject "Microblogs" ; dc:title "Exploring the Wisdom of the Tweets: Knowledge Acquisition from Social Awareness Streams" ; swrc:abstract "Although one might argue that little wisdom can be conveyed in messages of 140 ..." ; swrc:author <http://data.semanticweb.org/person/claudia-wagner> . <http://data.semanticweb.org/person/claudia-wagner> rdf:type foaf:Person ; foaf:name "Claudia Wagner" ; swrc:affiliation <http://data.semanticweb.org/organization/joanneum-research> ; foaf:based_near <http://dbpedia.org/resource/Austria> Knowledge Acquisition Semantic Analysis Social Web Microblogs Exploring the Wisdom of the Tweets: Knowledge Acquisition from Social Awareness Streams Although one might argue that little wisdom can be conveyed in messages of 140 http://data.semanticweb.org/person/cla udia-wagner Knowledge Acquisition Semantic Analysis Social Web Microblogs Exploring the Wisdom of the Tweets: Knowledge Acquisition from Social Awareness Streams Although one might argue that little wisdom can be conveyed in messages of 140 http://data.semanticweb.org/person/cla udia-wagner
  • 14. Aligning Tweets with Talks 1. Feature Extraction: F2 – 1-step Resource Leaves @prefix swrc: <http://swrc.ontoware.org/ontology#> @prefix swc: <http://data.semanticweb.org/ns/swc/ontology#> @prefix dog: <http://data.semanticweb.org> @prefix dc: <http://purl.org/dc/elements/1.1/> <http://data.semanticweb.org/conference/eswc/2010/paper/phd_symposium/23> rdf:type swrc:InProceedings ; dc:subject "Knowledge Acquisition" ; dc:subject "Semantic Analysis" ; dc:subject "Social Web" ; dc:subject "Microblogs" ; dc:title "Exploring the Wisdom of the Tweets: Knowledge Acquisition from Social Awareness Streams" ; swrc:abstract "Although one might argue that little wisdom can be conveyed in messages of 140 ..." ; swrc:author <http://data.semanticweb.org/person/claudia-wagner> . <http://data.semanticweb.org/person/claudia-wagner> rdf:type foaf:Person ; foaf:name "Claudia Wagner" ; swrc:affiliation <http://data.semanticweb.org/organization/joanneum-research> ; foaf:based_near <http://dbpedia.org/resource/Austria> http://data.semanticweb.org/person/cla udia-wagner Claudia Wagner http://data.semanticweb.org/organizati on/joanneum-research http://dbpedia.org/resource/Austria http://data.semanticweb.org/person/cla udia-wagner Claudia Wagner http://data.semanticweb.org/organizati on/joanneum-research http://dbpedia.org/resource/Austria
  • 15. Aligning Tweets with Talks 1. Feature Extraction: F3 – DBPedia Concepts @prefix swrc: <http://swrc.ontoware.org/ontology#> @prefix swc: <http://data.semanticweb.org/ns/swc/ontology#> @prefix dog: <http://data.semanticweb.org> @prefix dc: <http://purl.org/dc/elements/1.1/> <http://data.semanticweb.org/conference/eswc/2010/paper/phd_symposium/23> rdf:type swrc:InProceedings ; dc:subject "Knowledge Acquisition" ; dc:subject "Semantic Analysis" ; dc:subject "Social Web" ; dc:subject "Microblogs" ; dc:title "Exploring the Wisdom of the Tweets: Knowledge Acquisition from Social Awareness Streams" ; swrc:abstract "Although one might argue that little wisdom can be conveyed in messages of 140 ..." ; swrc:author <http://data.semanticweb.org/person/claudia-wagner> . <http://data.semanticweb.org/person/claudia-wagner> rdf:type foaf:Person ; foaf:name "Claudia Wagner" ; swrc:affiliation <http://data.semanticweb.org/organization/joanneum-research> ; foaf:based_near <http://dbpedia.org/resource/Austria> http://dbpedia.org/resource/Twitter http://dbpedia.org/resource/Social_Web http://dbpedia.org/resource/Microblogs
  • 16. Aligning Tweets with Talks 2. Feature Vector Composition Knowledge Acquisition Semantic Analysis Social Web Microblogs Exploring the Wisdom of the Tweets: Knowledge Acquisition from Social Awareness Streams Although one might argue that little wisdom can be conveyed in messages of 140 http://data.semanticweb.org/person/claudia-wagner → knowledge acquisition semantic analysis social web microblogs exploring wisdom tweets knowledge acquisition social awareness streams wisdom messages → Indexer → knowledge 2 acquisition 2 semantic 1 analysis 1 social 2 web 1 microblogs 1 exploring 1 wisdom 2 tweets 1 awareness 1 streams 1 messages 1
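The composition step on this slide (leaf strings, then filtered tokens, then term counts) can be sketched roughly as follows. This is a minimal illustration, not the authors' indexer: the tokenizer, the `STOPWORDS` list and the function name `compose_feature_vector` are all assumptions, since the slides do not say which indexer or stop-word list was used.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the slides do not name the one actually used.
STOPWORDS = frozenset(
    {"the", "of", "from", "in", "that", "can", "be", "one", "although", "might"}
)


def compose_feature_vector(leaves, stopwords=STOPWORDS):
    """Turn a list of leaf strings into a term-frequency vector."""
    counts = Counter()
    for leaf in leaves:
        # Lower-case and keep alphabetic runs only, so "140" and ":" are dropped.
        for token in re.findall(r"[a-z]+", leaf.lower()):
            if token not in stopwords:
                counts[token] += 1
    return counts


# Leaves taken from the F1 example on the slides.
leaves = [
    "Knowledge Acquisition",
    "Semantic Analysis",
    "Social Web",
    "Microblogs",
    "Exploring the Wisdom of the Tweets: Knowledge Acquisition "
    "from Social Awareness Streams",
    "Although one might argue that little wisdom can be conveyed "
    "in messages of 140",
]
vector = compose_feature_vector(leaves)
```

With this simplified stop-word list a few extra terms (such as "argue" or "little") survive that the slide's example drops, so the counts match the slide only for the terms the two have in common.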
  • 17. Aligning Tweets with Talks 3. Inducing the Labelling Function – Both tweets and events are provided as feature vectors – Induce a labelling function: choose the most likely event (y) given the tweet (x): $f : X \rightarrow Y$
  • 18. Aligning Tweets with Talks 3. Inducing the Labelling Function: Proximity-based Clustering – Build a centroid vector for each event • From event feature vectors – Compare each tweet vector with each centroid • Choose event (y) which is closest: $y = \arg\min_{y \in Y} d(x, \mu_y)$, where $manhat(x, \mu) = \sum_{i=1}^{n} |x_i - \mu_i|$ and $eucl(x, \mu) = \sqrt{\sum_{i=1}^{n} (x_i - \mu_i)^2}$
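A minimal sketch of the proximity-based labelling on this slide, assuming feature vectors are stored as sparse term-to-weight dictionaries. The function names and the two example talks are hypothetical and only illustrate the arg-min over centroids.

```python
import math


def manhattan(x, mu):
    """Manhattan distance between two sparse vectors (missing terms count as 0)."""
    keys = set(x) | set(mu)
    return sum(abs(x.get(k, 0.0) - mu.get(k, 0.0)) for k in keys)


def euclidean(x, mu):
    """Euclidean distance between two sparse vectors (missing terms count as 0)."""
    keys = set(x) | set(mu)
    return math.sqrt(sum((x.get(k, 0.0) - mu.get(k, 0.0)) ** 2 for k in keys))


def centroid(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = {}
    for vec in vectors:
        for term, weight in vec.items():
            total[term] = total.get(term, 0.0) + weight
    return {term: weight / len(vectors) for term, weight in total.items()}


def closest_event(tweet, centroids, distance=euclidean):
    """Return the event label whose centroid is nearest to the tweet vector."""
    return min(centroids, key=lambda event: distance(tweet, centroids[event]))


# Hypothetical events and tweet, for illustration only.
centroids = {
    "talk_a": centroid([{"knowledge": 2, "wisdom": 2, "microblogs": 1}]),
    "talk_b": centroid([{"ontology": 2, "reasoning": 1}]),
}
tweet = {"wisdom": 1, "microblogs": 1}
label = closest_event(tweet, centroids, distance=manhattan)
```

Passing `distance=euclidean` or `distance=manhattan` switches between the two measures named on the slide without changing the rest of the procedure.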
  • 19. Aligning Tweets with Talks 3. Inducing the Labelling Function: Naive Bayes Classification – Assigns the most probable event label given the tweet features: $y = \arg\max_{y \in Y} P(y \mid x_1, x_2, \ldots, x_n)$ – Using Bayes' Theorem, we write this as: $y = \arg\max_{y \in Y} \frac{P(x_1, x_2, \ldots, x_n \mid y)\, P(y)}{P(x_1, x_2, \ldots, x_n)} = \arg\max_{y \in Y} P(x_1, x_2, \ldots, x_n \mid y)\, P(y) = \arg\max_{y \in Y} P(y) \prod_{i} P(x_i \mid y)$
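The final expression on this slide can be sketched as a small multinomial Naive Bayes labeller. Several choices here are assumptions that the slides do not state: a uniform prior P(y) (one labelled vector per event), add-one smoothing for P(x_i | y), log-space scoring, and ignoring tweet terms outside the event vocabulary. The function names and example talks are hypothetical.

```python
import math


def train(event_vectors):
    """Estimate log P(term | event) from one term-count vector per event."""
    vocabulary = {term for vec in event_vectors.values() for term in vec}
    log_prior = math.log(1.0 / len(event_vectors))  # assumption: uniform P(y)
    model = {}
    for label, vec in event_vectors.items():
        # Add-one (Laplace) smoothing, an assumption made for this sketch.
        denominator = sum(vec.values()) + len(vocabulary)
        model[label] = {
            term: math.log((vec.get(term, 0) + 1) / denominator)
            for term in vocabulary
        }
    return model, log_prior, vocabulary


def classify(tweet_terms, model, log_prior, vocabulary):
    """Return arg max over events of log P(y) + sum of log P(x_i | y)."""
    def score(label):
        known = (term for term in tweet_terms if term in vocabulary)
        return log_prior + sum(model[label][term] for term in known)
    return max(model, key=score)


# Hypothetical events, for illustration only.
events = {
    "talk_a": {"knowledge": 2, "wisdom": 2, "microblogs": 1},
    "talk_b": {"ontology": 2, "reasoning": 1},
}
model, log_prior, vocabulary = train(events)
predicted = classify(["wisdom", "microblogs", "twitter"], model, log_prior, vocabulary)
```

Working in log space turns the product over features into a sum, which avoids numeric underflow when a tweet has many terms.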
  • 20. Experiments • Dataset – Corpus of tweets collected during ESWC 2010 • Gold Standard Construction – Used 3 raters to label a portion of the tweet corpus • 200 tweets labelled – Measured inter-rater agreement • Using the Kappa statistic – Initial agreement was too low: 0.328 – Utilised the Delphi method to improve agreement – Second round of labelling produced: 0.820
  • 21. Experiments • Evaluation Measures – Precision: proportion of event tweets correctly labelled – Recall: proportion of tweets successfully returned for an event – F-measure: harmonic mean of precision and recall • Placed emphasis on precision over recall: $F_\beta = \frac{(1 + \beta^2) \times P \times R}{\beta^2 \times P + R}$, $\beta \in \{0.2, 0.5, 1\}$
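As a quick check of how β shifts the emphasis, the F-measure on this slide can be computed with a few lines of Python. The function name `f_measure` and the precision and recall values below are made up for illustration and are not results from the experiments.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall (F-beta)."""
    beta_sq = beta ** 2
    denominator = beta_sq * precision + recall
    if denominator == 0.0:
        # No correct labels at all: define the score as 0 instead of dividing by 0.
        return 0.0
    return (1 + beta_sq) * precision * recall / denominator


# Illustrative values only: high precision, lower recall.
precision, recall = 0.8, 0.4
scores = {beta: f_measure(precision, recall, beta) for beta in (0.2, 0.5, 1.0)}
```

Values of β below 1 weight precision more heavily, so with precision above recall the score rises as β shrinks from 1 to 0.2.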
  • 24. Imagine user profiling ESWC dataset, user Matthew Rowe
  • 25. Imagine conference feedback ESWC dataset directly from Tweets from mappings (Talks)
  • 27. We Challenge You! • Beat us in mappings! • We provide the human-generated gold standard mappings • Can you find a more precise way to do tweet-talk mappings? • Can you find other uses? Let us know!
  • 28. We Challenge You! • You can find the gold standard data here: http://research.hypios.com/?page_id=131 • You can find all the data (and automated mappings) here: http://data.hypios.com/tweets/sparql
  • 29. We Challenge You! http://data.hypios.com/tweets/sparql SELECT ?tweet ?talk WHERE { ?tweet <http://linkedevents.org/ontology/illustrate> ?talk. }
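One way to send the query from this slide to the endpoint is a plain HTTP GET with the query text in a `query` parameter, as the SPARQL Protocol allows. The sketch below only builds the request URL and does not send it; whether the endpoint still responds, and which result formats it supports, is not known from the slides. The helper name `build_query_url` is hypothetical.

```python
from urllib.parse import urlencode

ENDPOINT = "http://data.hypios.com/tweets/sparql"

# The query from the slide: every tweet together with the talk it illustrates.
QUERY = """SELECT ?tweet ?talk WHERE {
  ?tweet <http://linkedevents.org/ontology/illustrate> ?talk .
}"""


def build_query_url(endpoint, query):
    """Build a SPARQL Protocol GET URL carrying the query as a URL parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = build_query_url(ENDPOINT, QUERY)
```

The resulting URL can be passed to any HTTP client, for example `urllib.request.urlopen(url)`.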
  • 30. brought to you by milan.stankovic@hypios.com & M.C.Rowe@open.ac.uk November 2010, Shanghaï, China