Knowledge-base Enabled Information
Filtering on Social Web
Pavan Kapanipathi
Kno.e.sis Center, Wright State University
Advisor: Amit Sheth
1
Kno.e.sis
2
Social Web in 60 secs
3
Social Web in 60 secs
500M users generate 500M tweets per day
4
Disaster Management Organizations
utilize Social Web
35% of 20M tweets during
Hurricane Sandy shared information
and news about the disaster
5
Healthcare Issues
6
Healthcare Issues
7
Personalized Filtering on Social Web
Following Dynamically
Evolving Topics as
interests
8
Personalization on Social Web
• Following Dynamically
Evolving Topics
• Indian Elections
• US Elections
• Healthcare Debate
9
Dynamic Topics
11
Dynamic Topics
Continuously
Evolving on
Twitter
Entity – Event
relevance
changes
Many entities
are involved
12
Dynamic Topics
Manually crawl using
keywords
“indianelection” “jan25” “sandy”
“swineflu” “ebola”
13
Dynamic Topics
Manually updating
keywords to get topic
relevant tweets is not
feasible
“indianelection”
“modi”
“bjp”
“congress”
“jan25”
“egypt”
“tunisia”
“arabspring”
“sandy”
“newyork”
“redcross”
“fema”
“swineflu” “ebola”
14
Problem
How can we automatically update
the filters to track a dynamically
evolving topic on Twitter
15
Hashtags as Filters
• Identify a topic on Twitter
• Tweets with hashtags are
more informative
• Users have a lot of freedom
to create them
• Some get popular, most die
16
Exploring Hashtags as Evolving
Filters for Dynamic Topics
Colorado Shooting
Occupy Wall Street
(CS = Colorado Shooting, OWS = Occupy Wall Street)

                  CS          OWS
Tweets            122,062     6,077,378
Tags              192,512     15,963,209
Distinct          12,350      191,602
100% Retrieval    7,763       21,314
19
HASHTAG FILTERS
20
Colorado Shooting Occupy Wall Street
Event Related
Hashtags co-occur
with each other
Hashtag Filters Co-occurrence
Graph
22
Summarizing Hashtag Analysis
Starting with one of the event
relevant hashtags, by co-
occurrence we can reach other
relevant hashtags
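
The expansion step summarized above can be sketched as follows. This is a minimal illustration, not the pipeline used in the talk: the sample tweets, the function names `cooccurrence_counts` and `expand_filters`, and the `min_count` cut-off are all assumptions made for the example.

```python
import re
from collections import Counter
from itertools import combinations

HASHTAG = re.compile(r"#(\w+)")


def cooccurrence_counts(tweets):
    """Count how often each unordered pair of hashtags appears in the same tweet."""
    pair_counts = Counter()
    for text in tweets:
        # Deduplicate and sort so every pair has one canonical form.
        tags = sorted({t.lower() for t in HASHTAG.findall(text)})
        pair_counts.update(combinations(tags, 2))
    return pair_counts


def expand_filters(seed, tweets, min_count=2):
    """Hashtags co-occurring with `seed` at least `min_count` times, most frequent first."""
    neighbours = Counter()
    for (a, b), n in cooccurrence_counts(tweets).items():
        if a == seed:
            neighbours[b] += n
        elif b == seed:
            neighbours[a] += n
    return [(tag, n) for tag, n in neighbours.most_common() if n >= min_count]


# Hypothetical sample data, for illustration only.
sample_tweets = [
    "Results day #indianelection #modi #bjp",
    "Rally in Delhi #indianelection #modi",
    "Exit polls are out #indianelection #congress",
    "Lunch time #food #modi",
]
candidates = expand_filters("indianelection", sample_tweets)
```

Starting from the seed `indianelection`, only `modi` clears the illustrative cut-off here. Co-occurrence frequency by itself says nothing about whether a candidate is actually relevant to the event.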
23
Determining Relevancy of Co-
occurring Hashtags
#indianelection2015
#modikisarkar
Too many
co-occurring hashtags
24
Hashtag Filters distributions
25
Not surprising:
it is a power-law
distribution
Hashtag distributions
26
Top 1% retrieves
around 85% of the
tweets
Hashtag distributions
27
Clustering Coefficient of Hashtag
Co-occurrence Network (top 1%)
Clustering coefficient
The top hashtags co-occur most
strongly with each other
28
Determining Relevancy of Co-
occurring Hashtags
#indianelection2015
#modikisarkar
Co-occurring:
Threshold δ
Preferably a prominent hashtag
29
Does Hashtag Co-occurrence
Work?
o No. Co-occurrence alone does not work
o Many noisy or unrelated hashtags co-occur
o Determine the “dynamic” relevance of
the top co-occurring hashtags to the
dynamic topic
30
Determining Relevancy of Co-
occurring Hashtags
#indianelection2015
#modikisarkar
Co-occurring:
Threshold
Latest K (200,500)
Narendra Modi: 0.9
BJP: 0.7
NDA: 0.6
India: 0.4
Elections: 0.2
Rahul Gandhi: 0.2
Congress: 0.2
Entity Extraction
and Scoring
δ
Normalized
Frequency
Scoring
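
A minimal sketch of normalized frequency scoring over the latest K tweets of a hashtag. The slide does not say how the frequencies are normalized, so dividing by the count of the most frequent entity is an assumption, as is the function name `score_entities`. Entity extraction itself is assumed to have been done by a separate tool and is passed in as lists of entity names.

```python
from collections import Counter


def score_entities(tweet_entities, k=200):
    """Score entities by normalized frequency over the latest k tweets.

    tweet_entities: list of per-tweet entity lists, oldest first.
    Normalization (an assumption): divide by the count of the most
    frequent entity, so the top entity scores 1.0.
    """
    latest = tweet_entities[-k:]
    # Count each entity at most once per tweet.
    counts = Counter(e for entities in latest for e in set(entities))
    if not counts:
        return {}
    top = max(counts.values())
    return {entity: count / top for entity, count in counts.items()}


# Hypothetical input, for illustration only.
scores = score_entities([["Narendra Modi", "BJP"], ["Narendra Modi"], ["India"]])
```

Re-running this over a sliding window of recent tweets is what lets the hashtag's entity profile follow the topic as it evolves.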
31
Determining Relevancy of Co-
occurring Hashtags (Vector
Space Model)
#indianelection2015
#modikisarkar
Co-occurring:
Threshold
Latest K (200,500)
Narendra Modi: 0.9
BJP: 0.7
NDA: 0.6
India: 0.4
Elections: 0.2
Rahul Gandhi: 0.2
Congress: 0.2
Entity Extraction
and Scoring
Indian General
Election,_2014
Dynamically Updated
Background Knowledge
δ
32
Event Relevant Background
Knowledge
o Wikipedia Event Pages
33
o Entities mentioned on the Event page of
Wikipedia are relevant to the Event
Event Relevant Background
Knowledge
35
o Wikipedia’s Hyperlink structure is very
rich
o Page-Page (Wikipedia) links
Indian General
Election, 2014
Narendra Modi
Rahul Gandhi
NDA (India)
UPA (India)
BJP
Indian National
Congress
Event Relevant Background
Knowledge – Graph Structure
36
Determining Relevancy of Co-
occurring Hashtags (Vector
Space Model)
#indianelection2015
#modikisarkar
Co-occurring:
Threshold
Latest K (200,500)
Narendra Modi: 0.9
BJP: 0.7
NDA: 0.6
India: 0.4
Elections: 0.2
Rahul Gandhi: 0.2
Congress: 0.2
Entity Extraction
and Scoring
Indian General
Election,_2014
Extract, Periodically
Update Hyperlink structure
One hop from Event
Page
δ
37
o Hyperlink structure is dynamically
updated
Indian General
Election, 2014
Narendra Modi
Rahul Gandhi
NDA (India)
UPA (India)
BJP
Indian National
Congress
10 May 2010
29 March 2013
29 March 2013 29 March 2013
29 March 2013
20 May 2013
20 May 2013
Event Relevant Background
Knowledge
40
Determining Relevancy of Co-
occurring Hashtags (Vector
Space Model)
#indianelection2015
#modikisarkar
Co-occurring:
Threshold
Latest K (200,500)
Narendra Modi: 0.9
BJP: 0.7
NDA: 0.6
India: 0.4
Elections: 0.2
Rahul Gandhi: 0.2
Congress: 0.2
Entity Extraction
and Scoring
Indian General
Election,_2014
Extract, Periodically
Update Hyperlink structure
Entity scoring based
on relevance to the Event
One hop from Event
Page
δ
41
o Edge Based Measure
o Link Overlap Measure: Jaccard similarity
o Out(c) are the links in Wikipedia page “c”
o Final Score: r(c,E) = ed(c,E) + oco(c,E)
Hyperlink Entity Scoring
Indian General
Election, 2014
Narendra Modi
Indian General
Election, 2014
Indian General
Election, 2009
1
Mutually
Important
ed (c,E) = 1
ed (c,E) = 2
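
The scoring r(c,E) = ed(c,E) + oco(c,E) can be sketched as below. The slide does not spell out how ed(c,E) is assigned, so this sketch assumes one plausible reading: 2 when the entity page and the event page link to each other ("mutually important"), 1 when only one of them links to the other. The page names and the `out_links` dictionary are hypothetical.

```python
def edge_score(c, event, out_links):
    """ed(c, E) under the assumed reading: 2 for mutual links, 1 for a one-way link, 0 for none."""
    forward = c in out_links.get(event, set())
    backward = event in out_links.get(c, set())
    return int(forward) + int(backward)


def link_overlap(c, event, out_links):
    """oco(c, E): Jaccard similarity of Out(c) and Out(E)."""
    a = out_links.get(c, set())
    b = out_links.get(event, set())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def relevance(c, event, out_links):
    """Final score r(c, E) = ed(c, E) + oco(c, E)."""
    return edge_score(c, event, out_links) + link_overlap(c, event, out_links)


# Hypothetical one-hop link structure: page -> set of pages it links to.
out_links = {
    "E": {"Modi", "BJP", "India"},
    "Modi": {"E", "BJP"},
    "India": {"Delhi"},
}
modi_score = relevance("Modi", "E", out_links)    # mutual link plus shared out-link
india_score = relevance("India", "E", out_links)  # one-way link, no shared out-links
```

Because the hyperlink structure is extracted periodically, these scores would be recomputed whenever the event page or its one-hop neighbours change.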
42
Determining Relevancy of Co-
occurring Hashtags (Vector
Space Model)
#indianelection2015
#modikisarkar
Co-occurring:
Threshold
Latest K (200,500)
Narendra Modi: 0.9
BJP: 0.7
NDA: 0.6
India: 0.4
Elections: 0.2
Rahul Gandhi: 0.2
Congress: 0.2
Entity Extraction
and Scoring
Indian General
Election,_2014
Extract, Periodically
Update Hyperlink structure
Entity scoring based
on relevance to the Event
One hop from Event
Page
Indian General Elec: 1.0
India: 0.9
Elections: 0.7
UPA: 0.6
BJP: 0.3
NDA: 0.3
Narendra Modi: 0.3
δ
43
Determining Relevancy of Co-
occurring Hashtags (Vector
Space Model)
#indianelection2015
#modikisarkar
Co-occurring:
Threshold
Latest K (200,500)
Narendra Modi: 0.9
BJP: 0.7
NDA: 0.6
India: 0.4
Elections: 0.2
Rahul Gandhi: 0.2
Congress: 0.2
Entity Extraction
and Scoring
Indian General
Election,_2014
Extract, Periodically
Update Hyperlink structure
Entity scoring based
on relevance to the Event
One hop from Event
Page
Indian General Elec: 1.0
India: 0.9
Elections: 0.7
UPA: 0.6
BJP: 0.3
NDA: 0.3
Narendra Modi: 0.3
Similarity
Check
Relevance Score: 0.6
δ
44
o Set Based
o Jaccard Similarity
o Considers the entities without the scores
o Vector Based
o Symmetric
o Cosine Similarity
o Asymmetric
o Subsumption Similarity
Similarity Check
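
The three families of similarity check can be sketched over the two entity-score vectors shown earlier in the deck. Jaccard and cosine are the standard definitions. The slide does not give the subsumption formula, so `asymmetric` below is only a stand-in that captures the stated intuition (event entities missing from the hashtag profile are ignored rather than penalized); it should not be read as the measure used in the talk.

```python
import math


def jaccard(a, b):
    """Set-based: compares the entities only, ignoring their scores."""
    keys_a, keys_b = set(a), set(b)
    union = keys_a | keys_b
    return len(keys_a & keys_b) / len(union) if union else 0.0


def cosine(a, b):
    """Vector-based, symmetric."""
    dot = sum(w * b.get(e, 0.0) for e, w in a.items())
    norm = math.sqrt(sum(w * w for w in a.values())) * math.sqrt(sum(w * w for w in b.values()))
    return dot / norm if norm else 0.0


def asymmetric(hashtag, event):
    """Stand-in asymmetric measure (an assumption): cosine against the event
    vector restricted to entities that the hashtag profile mentions."""
    restricted = {e: w for e, w in event.items() if e in hashtag}
    return cosine(hashtag, restricted)


# Entity scores as shown on the slides.
hashtag_vec = {"Narendra Modi": 0.9, "BJP": 0.7, "NDA": 0.6, "India": 0.4,
               "Elections": 0.2, "Rahul Gandhi": 0.2, "Congress": 0.2}
event_vec = {"Indian General Elec": 1.0, "India": 0.9, "Elections": 0.7, "UPA": 0.6,
             "BJP": 0.3, "NDA": 0.3, "Narendra Modi": 0.3}

set_sim = jaccard(hashtag_vec, event_vec)
sym_sim = cosine(hashtag_vec, event_vec)
asym_sim = asymmetric(hashtag_vec, event_vec)
```

With non-negative scores the stand-in asymmetric value is never lower than the cosine value, since dropping unmatched event entities can only shrink the denominator.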
45
Indian General
Election 2014
Narendra
Modi
Intuition behind
Asymmetric
Indian General
Election 2014
Narendra
Modi
Penalized
Ignored
Similarity
Symmetric
Asymmetric
46
o 2 events
o US Presidential Elections (#election2012)
o Hurricane Sandy (#sandy)
o Top 25 co-occurring hashtags
Evaluation – Dataset
48
o Ranking Problem
o Rank the Top 25 hashtags based on the
relevancy of tweets to the event
o Experiment with all the similarity metrics
o Manually annotated the tweets of these
hashtags as relevant/irrelevant (Gold
Standard)
o Ranking Evaluation Metrics
o Mean Average Precision
o NDCG
Evaluation –
Strategy
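
For reference, the two ranking metrics named above can be computed as in this sketch. It assumes binary gold-standard labels (1 = relevant, 0 = irrelevant) listed in ranked order, and it normalizes average precision by the number of relevant items present in the ranked list; the function names are illustrative.

```python
import math


def average_precision(ranked_relevance):
    """Average precision for one ranked list of binary relevance labels."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mean_average_precision(rankings):
    """Mean of the per-ranking average precision values."""
    return sum(average_precision(r) for r in rankings) / len(rankings) if rankings else 0.0


def ndcg_at_k(ranked_relevance, k):
    """Normalized discounted cumulative gain over the top k positions."""
    def dcg(labels):
        return sum(rel / math.log2(i + 1) for i, rel in enumerate(labels, start=1))

    ideal = dcg(sorted(ranked_relevance, reverse=True)[:k])
    return dcg(ranked_relevance[:k]) / ideal if ideal else 0.0
```

For example, `average_precision([1, 0, 1])` averages the precision at ranks 1 and 3, and `ndcg_at_k([1, 1, 0], 3)` equals 1.0 because the list is already ideally ordered.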
49
Evaluation
50
Evaluation
Evaluated tweets containing the top
relevant hashtags detected for
dynamic topics
• NDCG - 92% at top-5 Mean Average
Precision
51
A little
pause for
Questions?
52
Personalized Filtering
53
User Interest
Identification/User
Modeling
Filtering Module
Twitter Streaming API
Tweets
Network
Filtered
Tweets
Personalized Filtering
54
User Interest
Identification/User
Modeling
Filtering Module
Twitter Streaming API
Tweets
Network
Filtered
Tweets
Dynamic Topics
as Interests
Interest: Indian Elections
Personalized Filtering
55
User Interest
Identification/User
Modeling
Filtering Module
Twitter Streaming API
Tweets
Network
Filtered
Tweets
A Significant
Module
o User Interest Identification on Twitter
o Content-based (Only Tweets)
o Term-based (semantic, web, #semanticweb)
o Entity-based (semantic web <same as> #semanticweb)
o Interest Graphs derived from knowledge-base
(Hierarchical Interest Graphs)
o Collaborative (Users’ Friends)
o Hybrid
User Modeling
56
A simple solution to most problems I
am trying to solve
Hierarchical
Interest Graphs
58
What is in your mind? (Next
concept/term)
59
What is in your mind? (Next
concept/term)
Fruit
60
What is in your mind? (Next
concept/term)
Fruit
Other Fruit
Names
61
Cognitive Science
o Human memory has been argued to be
structured as a hierarchy of concepts
(Semantic Network)
o Spreading activation theory has been
utilized to simulate search on semantic
network
o This theory has not been well explored
for user interest modeling
62
Hierarchical Interest Graphs
o Extending user profiles from Twitter to
comprise a hierarchy of concepts
o The hierarchy of concepts is derived from the
Wikipedia category structure
o Each concept in the hierarchy is scored
based on the user's extent of interest
63
64
Semantic
Search
Linked Data Metadata
0.8 0.2 0.6
Scores for
Interests
65
User Interests
Internet
Semantic
Search
Linked Data Metadata
Technology
World Wide Web
Semantic
Web
Structured
Information
0.8 0.2 0.6
Scores for
Interests
66
User Interests
0.7
0.5
0.4
0.3
68
Tweets
Approach
69
Tweets
Approach
70
Wikipedia Category Graph
Contains
Cycles
More abstract:
World Wide Web or
Semantic Web?
71
Wikipedia Hierarchy
Hierarchical Levels
No Cycles
1
2
3
4
5
6
72
Tweets
Approach
73
http://en.wikipedia.org/wiki/Semantic_search
http://en.wikipedia.org/wiki/Ontology
o Extracting Wikipedia entities
o Interest Scoring
o Frequency based
User Profile Generation
Internet
Semantic
Search
Linked Data Metadata
Technology
World Wide Web
Semantic
Web
User Interests
Structured
Information
0.8 0.2 0.6
Scores for
Interests
74
75
Tweets
Approach
76
Cricket
M S Dhoni Virat Kohli
Sachin
Tendulkar
Sports
Indian
Cricket
Indian
Cricketers
0.8 0.2 0.6
0.5
0.4
0.25
0.1
Activation Function
Determines the extent of spreading
Example
o Simple Activation Function
$$A_j = \sum_{i=0}^{n} A_i \times W_{ij} \times D$$
where
i is a child or subcategory of j that has been activated,
j is the category to be activated,
W_{ij} is the edge weight between j and i,
D is the decay factor.
77
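
A minimal sketch of the simple activation function on the cricket example. It assumes the hierarchy is acyclic (as in the cycle-free Wikipedia hierarchy), that all edge weights default to 1.0, and that the decay factor is 0.5; none of these values come from the slides, and the function name `spread_activation` is illustrative.

```python
def spread_activation(entity_scores, children, decay=0.5, weights=None):
    """Propagate interest scores from entities up an acyclic category hierarchy.

    entity_scores: activation of the entities found in the user's tweets.
    children: category -> list of its children (subcategories or entities).
    weights: optional (child, category) -> edge weight W_ij, default 1.0.
    Each category j receives A_j = sum over children i of A_i * W_ij * D.
    """
    weights = weights or {}
    cache = {}

    def activation(node):
        if node in cache:
            return cache[node]
        if node in entity_scores:
            value = entity_scores[node]
        else:
            value = sum(
                activation(child) * weights.get((child, node), 1.0) * decay
                for child in children.get(node, ())
            )
        cache[node] = value
        return value

    return {node: activation(node) for node in set(children) | set(entity_scores)}


# Entity scores as on the slide; the hierarchy shape follows the diagram.
entity_scores = {"M S Dhoni": 0.8, "Virat Kohli": 0.2, "Sachin Tendulkar": 0.6}
children = {
    "Indian Cricketers": ["M S Dhoni", "Virat Kohli", "Sachin Tendulkar"],
    "Indian Cricket": ["Indian Cricketers"],
    "Cricket": ["Indian Cricket"],
    "Sports": ["Cricket"],
}
activations = spread_activation(entity_scores, children)
```

With these assumed parameters the activation halves at every level above "Indian Cricketers", so more abstract categories receive progressively less interest.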
Activation Function
o Uneven distribution of nodes in the
hierarchy
o Many-many for category-subcategory
relationships
78
78
Challenges – Wikipedia
Category Graph
81
[Figure: number of nodes at each hierarchical level. x-axis: Hierarchical Level (1 to 16); y-axis: Number of Nodes (0 to 300,000)]
81
Addressing Uneven Node
Distribution
83
83
Preferential Path Constraint –
Many to Many Links
Boosting Common Ancestors
o Nodes that intersect domains/subcategories
activated by diverse entities
86
86
Boosting Common Ancestors
87
Cricket
M S Dhoni Virat Kohli
Sachin
Tendulkar
Sports
Indian
Cricket
Indian
Cricketers
3
3
5
5
Michael
Clarke
Shane
Watson
Australian
Cricket
Australian
Cricketers
2
2
87
88
88
Boosting Common Ancestors
o Bell
$$A_j = \sum_{i=0}^{n} A_i \times F_j$$
o Bell Log
$$A_j = \sum_{i=0}^{n} A_i \times FL_j$$
o Priority Intersect
$$A_j = \sum_{i=0}^{n} A_i \times FL_j \times P_{ji} \times B_j$$
89
Activation Functions
Evaluation
User Study
• 37 Users
• 30K Tweets
Evaluated the top-10 categories of
interests derived from the hierarchy
• 76% Mean Average Precision
• 98% Mean Reciprocal Recall
• 70% of these categories are not mentioned in the tweets
90
o Working on a Tweet recommendation
system that utilizes Hierarchical
Interest Graph
o Preliminary results are “interesting”
91
Tweet Recommendation using
Hierarchical Interest Graph
Conclusion
o Focus on “Information” overload instead of
“Data” overload.
o Personalized Information Filtering
o Knowledge-base enabled solutions for
challenges in Tweets filtering
o Wikipedia hyperlink structure and category
graph leveraged for Twitter data filtering
o More Research on User Specific Attribute
Extraction (Personalization) from Twitter
Data
o Activity Estimation
o Location Prediction
93
More at Kno.e.sis
kHealth
Knowledge-enabled Healthcare
Applied to ADHF, Asthma, GI, and Dementia
94
Through physical monitoring and
analysis, our cellphones could act as
an early warning system to detect
serious health conditions, and
provide actionable information
canary in a coal mine
Empowering Individuals (who are not Larry Smarr!) for their own health
kHealth: knowledge-enabled healthcare
95
Social Health
Signals
96
Motivational Scenario
Manually going through
news articles, diabetes
forums, blogs, etc.
- Time consuming
- Relevant?
Interesting?
Informative? Useful?
97
How about all the relevant and important health
information aggregated at one platform?
A diabetic patient is interested in keeping himself up to date with
new information about diabetes
98
Search and Explore
X Controls
Cancer
X = diet, treatment, exercise
(Pattern-based Approach
leveraging domain
semantics)
Top Health News
Informative news about selected
disease
Faceted search (by health topics)
Learn about disease
Source: Wikipedia
Search &
Explore
Top Health
News
Tweet
Traffic
Learn about
Disease
Home
Thanks
Contact:
Email: pavan@knoesis.org
Twitter: @pavankaps
Webpage:
http://knoesis.org/researchers/pavan
99

looking for escort 9953056974 Low Rate Call Girls In Vinod Nagar
 
Protecting Your Little Explorer at Home!
Protecting Your Little Explorer at Home!Protecting Your Little Explorer at Home!
Protecting Your Little Explorer at Home!
 

Knowledge base enabled Information Filtering on Social Web -- EMC

  • 1. Knowledge-base Enabled Information Filtering on Social Web Pavan Kapanipathi Kno.e.sis Center, Wright State University Advisor: Amit Sheth 1
  • 3. Social Web in 60 secs 3
  • 4. Social Web in 60 secs 500M users generate 500M tweets per day 4
  • 5. Disaster Management Organizations utilize Social Web: 35% of 20M tweets during Hurricane Sandy shared information and news about the disaster
  • 8. Personalized Filtering on Social Web Following Dynamically Evolving Topics as interests 8
  • 9. Personalization on Social Web • Following Dynamically Evolving Topics • Indian Elections • US Elections • Healthcare Debate
  • 12. Dynamic Topics Continuously Evolving on Twitter Entity – Event relevance changes Many entities are involved 12
  • 13. Dynamic Topics: Manually crawl using keywords “indianelection” “jan25” “sandy” “swineflu” “ebola”
  • 14. Dynamic Topics Manually updating keywords to get topic relevant tweets is not feasible “indianelection” “modi” “bjp” “congress” “jan25” “egypt” “tunisia” “arabspring” “sandy” “newyork” “redcross” “fema” “swineflu” “ebola” 14
  • 15. Problem: How can we automatically update the filters to track a dynamically evolving topic on Twitter?
  • 16. Hashtags as Filters • Identify a topic on Twitter • Tweets with hashtags are more informative • Users have a lot of freedom to create them • Some get popular, most die 16
  • 17. Exploring Hashtags as Evolving Filters for Dynamic Topics Colorado Shooting 17
  • 18. Exploring Hashtags as Evolving Filters for Dynamic Topics Colorado Shooting Occupy Wall Street 18
  • 19. Exploring Hashtags as Evolving Filters for Dynamic Topics. Colorado Shooting (CS): Tweets: 122,062; Tags: 192,512; Distinct: 12,350; 100% Retrieval: 7,763. Occupy Wall Street (OWS): Tweets: 6,077,378; Tags: 15,963,209; Distinct: 191,602; 100% Retrieval: 21,314
  • 20. Exploring Hashtags as Evolving Filters for Dynamic Topics. Colorado Shooting (CS): Tweets: 122,062; Tags: 192,512; Distinct: 12,350; 100% Retrieval: 7,763. Occupy Wall Street (OWS): Tweets: 6,077,378; Tags: 15,963,209; Distinct: 191,602; 100% Retrieval: 21,314. HASHTAG FILTERS
  • 21. Colorado Shooting Occupy Wall Street Hashtag Filters Co-occurrence Graph 21
  • 22. Colorado Shooting Occupy Wall Street Event Related Hashtags co-occur with each other Hashtag Filters Co-occurrence Graph 22
  • 23. Summarizing Hashtag Analysis: Starting with one of the event-relevant hashtags, we can reach other relevant hashtags by co-occurrence
  • 24. Determining Relevancy of Co-occurring Hashtags #indianelection2015 #modikisarkar Too many co-occurring hashtags
  • 26. Hashtag distributions: Not surprising. It’s a power-law distribution
  • 27. Hashtag distributions: The top 1% of hashtags retrieve around 85% of the tweets
  • 28. Clustering coefficient of the hashtag co-occurrence network (top 1%): the top hashtags co-occur with each other the most
  • 29. Determining Relevancy of Co-occurring Hashtags #indianelection2015 #modikisarkar Co-occurring: Threshold δ. Preferably a prominent hashtag
  • 30. Hashtag Co-occurrence works? o No. Co-occurrence alone does not work o Many noisy or unrelated hashtags co-occur o Determine the “dynamic” relevance of the top co-occurring hashtag to the dynamic topic
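The candidate-generation step described in slides 23 to 30 can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes tweets have already been reduced to sets of lowercase hashtags, and it assumes the threshold δ is applied to the fraction of seed-hashtag tweets that also carry the candidate tag. The function name `cooccurring_hashtags` and the sample data are hypothetical.

```python
from collections import Counter


def cooccurring_hashtags(tweets, seed, delta):
    """Return hashtags co-occurring with `seed`, mapped to their co-occurrence fraction.

    tweets: iterable of hashtag collections, one per tweet (lowercase, no '#').
    delta:  assumed threshold on the fraction of seed tweets that also carry the tag.
    """
    seed_tweets = [set(tags) for tags in tweets if seed in tags]
    if not seed_tweets:
        return {}
    # Count, per candidate tag, how many seed tweets contain it.
    counts = Counter(tag for tags in seed_tweets for tag in tags if tag != seed)
    total = len(seed_tweets)
    return {tag: n / total for tag, n in counts.items() if n / total >= delta}


# Hypothetical sample: four tweets carry the seed hashtag, one does not.
tweets = [
    {"indianelection2015", "modikisarkar"},
    {"indianelection2015", "modikisarkar", "bjp"},
    {"indianelection2015", "cricket"},
    {"indianelection2015", "modikisarkar"},
    {"sandy", "fema"},
]
candidates = cooccurring_hashtags(tweets, "indianelection2015", delta=0.5)
```

As slide 30 points out, co-occurrence alone only produces candidates; noisy tags can still pass the threshold, which is why a separate relevance check follows.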
  • 31. Determining Relevancy of Co-occurring Hashtags (Vector Space Model) #indianelection2015 #modikisarkar Co-occurring: Threshold δ Latest K (200, 500) Entity Extraction and Scoring (Normalized Frequency Scoring): Narendra Modi: 0.9, BJP: 0.7, NDA: 0.6, India: 0.4, Elections: 0.2, Rahul Gandhi: 0.2, Congress: 0.2
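A rough sketch of the normalized frequency scoring on slide 31 is shown below. The slides do not state the normalization, so this assumes "Latest K" refers to the most recent K tweets of the candidate hashtag and that an entity's score is the fraction of those tweets mentioning it. Entity extraction itself is out of scope here: the input is assumed to be a list of already-extracted entity names per tweet, and `score_entities` is a hypothetical helper name.

```python
from collections import Counter


def score_entities(tweet_entities, k):
    """Score entities by the fraction of the latest k tweets that mention them.

    tweet_entities: list of entity-name lists, oldest tweet first (assumed input).
    k:              number of most recent tweets to consider.
    """
    latest = tweet_entities[-k:] if k > 0 else []
    if not latest:
        return {}
    # Count each entity at most once per tweet.
    counts = Counter(entity for entities in latest for entity in set(entities))
    return {entity: n / len(latest) for entity, n in counts.items()}


# Hypothetical extracted entities for five tweets of one hashtag.
tweet_entities = [
    ["Congress"],
    ["Narendra Modi", "BJP"],
    ["Narendra Modi"],
    ["Narendra Modi", "BJP", "NDA"],
    ["India", "Narendra Modi"],
]
hashtag_vector = score_entities(tweet_entities, k=4)
```

The resulting dictionary plays the role of the hashtag's entity vector in the vector space model; the oldest tweet falls outside the window and does not contribute.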
  • 32. Determining Relevancy of Co-occurring Hashtags (Vector Space Model) #indianelection2015 #modikisarkar Co-occurring: Threshold δ Latest K (200, 500) Entity Extraction and Scoring: Narendra Modi: 0.9, BJP: 0.7, NDA: 0.6, India: 0.4, Elections: 0.2, Rahul Gandhi: 0.2, Congress: 0.2 Dynamically Updated Background Knowledge: Indian General Election,_2014
  • 33. Event Relevant Background Knowledge o Wikipedia Event Pages
  • 35. Event Relevant Background Knowledge o Entities mentioned on the Event page of Wikipedia are relevant to the Event
  • 36. Event Relevant Background Knowledge – Graph Structure o Wikipedia’s hyperlink structure is very rich o Page-Page (Wikipedia) links: Indian General Election, 2014; Narendra Modi; Rahul Gandhi; NDA (India); UPA (India); BJP; Indian National Congress
  • 37. Determining Relevancy of Co-occurring Hashtags (Vector Space Model) #indianelection2015 #modikisarkar Co-occurring: Threshold δ Latest K (200, 500) Entity Extraction and Scoring: Narendra Modi: 0.9, BJP: 0.7, NDA: 0.6, India: 0.4, Elections: 0.2, Rahul Gandhi: 0.2, Congress: 0.2 Indian General Election,_2014: Extract, Periodically Update Hyperlink structure, One hop from Event Page
  • 38. Event Relevant Background Knowledge o Hyperlink structure is dynamically updated: Indian General Election, 2014; Narendra Modi; Rahul Gandhi; NDA (India); UPA (India); BJP; Indian National Congress. Dates shown: 10 May 2010
  • 39. Event Relevant Background Knowledge o Hyperlink structure is dynamically updated: Indian General Election, 2014; Narendra Modi; Rahul Gandhi; NDA (India); UPA (India); BJP; Indian National Congress. Dates shown: 10 May 2010; 29 March 2013; 29 March 2013; 29 March 2013; 29 March 2013
  • 40. Event Relevant Background Knowledge o Hyperlink structure is dynamically updated: Indian General Election, 2014; Narendra Modi; Rahul Gandhi; NDA (India); UPA (India); BJP; Indian National Congress. Dates shown: 10 May 2010; 29 March 2013; 29 March 2013; 29 March 2013; 29 March 2013; 20 May 2013; 20 May 2013
  • 41. Determining Relevancy of Co-occurring Hashtags (Vector Space Model) #indianelection2015 #modikisarkar Co-occurring: Threshold δ Latest K (200, 500) Entity Extraction and Scoring: Narendra Modi: 0.9, BJP: 0.7, NDA: 0.6, India: 0.4, Elections: 0.2, Rahul Gandhi: 0.2, Congress: 0.2 Indian General Election,_2014: Extract, Periodically Update Hyperlink structure, One hop from Event Page, Entity scoring based on relevance to the Event
  • 42. o Edge Based Measure o Link Overlap Measure: Jaccard similarity o Out(c) are the links in Wikipedia page “c” o Final Score: r(c,E) = ed(c,E) + oco(c,E) Hyperlink Entity Scoring India General Election, 2014 Narendra Modi India General Election, 2014 India General Election, 2009 1 Mutually Important ed (c,E) = 1 ed (c,E) = 2 42
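The final score r(c,E) = ed(c,E) + oco(c,E) from slide 42 can be illustrated with a small sketch. Two points are assumptions rather than facts from the slides: the edge-based measure ed is taken to be 1 for a one-directional link and 2 for mutual links between c and the event page E, and the link overlap measure oco is taken to be the Jaccard similarity of Out(c) and Out(E). The toy link graph and function names are hypothetical.

```python
def edge_score(c, event, out_links):
    """Assumed edge-based measure: 1 per link direction, so 2 when links are mutual."""
    forward = c in out_links.get(event, set())   # event page links to c
    backward = event in out_links.get(c, set())  # c's page links back to the event
    return int(forward) + int(backward)


def link_overlap(c, event, out_links):
    """Jaccard similarity between Out(c) and Out(E)."""
    a = out_links.get(c, set())
    b = out_links.get(event, set())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def relevance(c, event, out_links):
    """r(c, E) = ed(c, E) + oco(c, E)."""
    return edge_score(c, event, out_links) + link_overlap(c, event, out_links)


# Hypothetical outgoing-link sets, one hop from the event page.
EVENT = "Indian General Election, 2014"
out_links = {
    EVENT: {"Narendra Modi", "Rahul Gandhi", "BJP", "Indian National Congress"},
    "Narendra Modi": {EVENT, "BJP", "Gujarat"},
    "Rahul Gandhi": {"Indian National Congress"},
}
modi_score = relevance("Narendra Modi", EVENT, out_links)
gandhi_score = relevance("Rahul Gandhi", EVENT, out_links)
```

In this toy graph the mutually linked page scores higher than the page that is only linked from the event page, which matches the "mutually important" intuition on the slide.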
  • 43. Determining Relevancy of Co-occurring Hashtags (Vector Space Model) #indianelection2015 #modikisarkar Co-occurring: Threshold δ Latest K (200, 500) Entity Extraction and Scoring: Narendra Modi: 0.9, BJP: 0.7, NDA: 0.6, India: 0.4, Elections: 0.2, Rahul Gandhi: 0.2, Congress: 0.2 Indian General Election,_2014: Extract, Periodically Update Hyperlink structure, One hop from Event Page, Entity scoring based on relevance to the Event: Indian General Elec: 1.0, India: 0.9, Elections: 0.7, UPA: 0.6, BJP: 0.3, NDA: 0.3, Narendra Modi: 0.3
  • 44. Determining Relevancy of Co-occurring Hashtags (Vector Space Model) #indianelection2015 #modikisarkar Co-occurring: Threshold δ Latest K (200, 500) Entity Extraction and Scoring: Narendra Modi: 0.9, BJP: 0.7, NDA: 0.6, India: 0.4, Elections: 0.2, Rahul Gandhi: 0.2, Congress: 0.2 Indian General Election,_2014: Extract, Periodically Update Hyperlink structure, One hop from Event Page, Entity scoring based on relevance to the Event: Indian General Elec: 1.0, India: 0.9, Elections: 0.7, UPA: 0.6, BJP: 0.3, NDA: 0.3, Narendra Modi: 0.3 Similarity Check: Relevance Score: 0.6
  • 45. Similarity Check o Set Based: Jaccard Similarity (considers the entities without the scores) o Vector Based, Symmetric: Cosine Similarity o Vector Based, Asymmetric: Subsumption Similarity
  • 46. Intuition behind Asymmetric Similarity (India General Election 2014 vs. Narendra Modi): Symmetric: Penalized; Asymmetric: Ignored
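The three kinds of similarity check listed on slide 45 can be compared on the example vectors from the earlier slides. Jaccard and cosine are the standard definitions. The slides do not define the subsumption similarity, so the `asymmetric` function below is only an assumed stand-in that captures the stated intuition: event entities missing from the hashtag vector are ignored instead of penalized (here, by restricting the event vector to the hashtag's entities before taking the cosine).

```python
import math


def jaccard(a, b):
    """Set-based similarity: uses only which entities occur, not their scores."""
    keys_a, keys_b = set(a), set(b)
    union = keys_a | keys_b
    return len(keys_a & keys_b) / len(union) if union else 0.0


def cosine(a, b):
    """Symmetric vector similarity over entity scores."""
    dot = sum(w * b[e] for e, w in a.items() if e in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def asymmetric(hashtag, event):
    """Assumed asymmetric variant: event entities absent from the hashtag are ignored."""
    restricted = {e: w for e, w in event.items() if e in hashtag}
    return cosine(hashtag, restricted)


# Entity vectors copied from the slides.
hashtag = {"Narendra Modi": 0.9, "BJP": 0.7, "NDA": 0.6, "India": 0.4,
           "Elections": 0.2, "Rahul Gandhi": 0.2, "Congress": 0.2}
event = {"Indian General Elec": 1.0, "India": 0.9, "Elections": 0.7, "UPA": 0.6,
         "BJP": 0.3, "NDA": 0.3, "Narendra Modi": 0.3}

scores = {
    "jaccard": jaccard(hashtag, event),
    "cosine": cosine(hashtag, event),
    "asymmetric": asymmetric(hashtag, event),
}
```

Because the asymmetric variant drops event-only entities from the normalization, it never scores lower than the cosine on the same pair; it is not claimed to reproduce the relevance score shown on the slides.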
  • 48. Evaluation – Dataset o 2 events o US Presidential Elections (#election2012) o Hurricane Sandy (#sandy) o Top 25 co-occurring hashtags
  • 49. Evaluation – Strategy o Ranking Problem o Rank the Top 25 hashtags based on the relevancy of tweets to the event o Experiment with all the similarity metrics o Manually annotated the tweets of these hashtags as relevant/irrelevant (Gold Standard) o Ranking Evaluation Metrics o Mean Average Precision o NDCG
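For reference, the two ranking metrics named on slide 49 can be computed as in the sketch below. It uses the common textbook definitions with binary relevance labels; the slides do not say which exact variants were used, so details such as the log base-2 discount and normalizing average precision by the number of relevant items in the ranked list are assumptions.

```python
import math


def average_precision(relevance):
    """Average precision for one ranked list of 0/1 relevance labels.

    Normalizes by the number of relevant items found in the list itself.
    """
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / hits if hits else 0.0


def mean_average_precision(rankings):
    """Mean of the average precision over several ranked lists."""
    return sum(average_precision(r) for r in rankings) / len(rankings)


def ndcg_at_k(relevance, k):
    """Normalized discounted cumulative gain over the top k positions."""
    def dcg(rels):
        return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels))

    ideal = dcg(sorted(relevance, reverse=True)[:k])
    return dcg(relevance[:k]) / ideal if ideal else 0.0


# Hypothetical relevance judgments for two ranked hashtag lists.
rankings = [[1, 0, 1], [1, 1, 0]]
map_score = mean_average_precision(rankings)
ndcg_top2 = ndcg_at_k([0, 1], k=2)
```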
  • 51. Evaluation: Evaluated tweets comprising the top relevant hashtags detected for dynamic topics • NDCG - 92% at top-5 Mean Average Precision
  • 53. Personalized Filtering: User Interest Identification/User Modeling, Filtering Module, Twitter Streaming API, Tweets, Network, Filtered Tweets
  • 54. Personalized Filtering: User Interest Identification/User Modeling, Filtering Module, Twitter Streaming API, Tweets, Network, Filtered Tweets. Dynamic Topics as Interests. Interest: Indian Elections
  • 55. Personalized Filtering: User Interest Identification/User Modeling, Filtering Module, Twitter Streaming API, Tweets, Network, Filtered Tweets. A Significant Module
  • 56. User Modeling o User Interest Identification on Twitter o Content-based (Only Tweets) o Term-based (semantic, web, #semanticweb) o Entity-based (semantic web <same as> #semanticweb) o Interest Graphs derived from knowledge-base (Hierarchical Interest Graphs) o Collaborative (Users’ Friends) o Hybrid
  • 57. A simple solution to most problems I am trying to solve
  • 59. What is in your mind? (Next concept/term) 59
  • 60. What is in your mind? (Next concept/term) Fruit 60
  • 61. What is in your mind? (Next concept/term) Fruit Other Fruit Names 61
  • 62. Cognitive Science o Human memory has been argued to be structured as a hierarchy of concepts (Semantic Network) o Spreading activation theory has been utilized to simulate search on semantic network o This theory has not been well explored for user interest modeling 62
  • 63. Hierarchical Interest Graphs o Extending user profiles from Twitter to comprise a hierarchy of concepts o The hierarchy of concepts is derived from the Wikipedia Category Structure o Each concept in the hierarchy is scored based on the user’s extent of interest
  • 65. Semantic Search Linked Data Metadata 0.8 0.2 0.6 Scores for Interests 65 User Interests
  • 66. Internet Semantic Search Linked Data Metadata Technology World Wide Web Semantic Web Structured Information 0.8 0.2 0.6 Scores for Interests 66 User Interests
  • 67. Internet Semantic Search Linked Data Metadata Technology World Wide Web Semantic Web Structured Information 0.8 0.2 0.6 Scores for Interests 67 User Interests 0.7 0.5 0.4 0.3
  • 70. Wikipedia Category Graph Contains Cycles. More abstract: World Wide Web or Semantic Web?
  • 73. User Profile Generation o Extracting Wikipedia entities (http://en.wikipedia.org/wiki/Semantic_search, http://en.wikipedia.org/wiki/Ontology) o Interest Scoring o Frequency based
  • 74. Internet Semantic Search Linked Data Metadata Technology World Wide Web Semantic Web User Interests Structured Information 0.8 0.2 0.6 Scores for Interests 74
  • 76. Example: Cricket, M S Dhoni, Virat Kohli, Sachin Tendulkar, Sports, Indian Cricket, Indian Cricketers (scores shown: 0.8, 0.2, 0.6, 0.5, 0.4, 0.25, 0.1). Activation Function: Determines the extent of spreading
  • 77. Activation Function o Simple Activation Function: $A_j = \sum_{i=0}^{n} A_i \times W_{ij} \times D$, where i is the activated child or subcategory of j, j is the category to be activated, $W_{ij}$ is the edge weight between j and i, and D is the decay factor
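The simple activation function on slide 77 can be sketched as a bottom-up pass over a category hierarchy. This is an illustrative sketch under stated assumptions, not the author's implementation: the hierarchy is assumed to be acyclic (slide 70 notes that the raw Wikipedia category graph contains cycles), edge weights default to 1.0, the decay value 0.5 is arbitrary, and the function name and example hierarchy are hypothetical.

```python
def spread_activation(initial, children, decay=0.5, weight=None):
    """Propagate activation from entities up to categories.

    Each category j receives the sum over its children i of A_i * W_ij * D.
    initial:  activation of the nodes mentioned by the user (e.g. entity scores).
    children: maps each category to its children; assumed acyclic.
    weight:   optional {(i, j): W_ij}; missing edges default to 1.0.
    """
    weight = weight or {}
    memo = {}

    def activation(j):
        # A node's activation is finalized only after all of its children are.
        if j not in memo:
            memo[j] = initial.get(j, 0.0) + sum(
                activation(i) * weight.get((i, j), 1.0) * decay
                for i in children.get(j, ())
            )
        return memo[j]

    nodes = set(initial) | set(children)
    nodes |= {i for kids in children.values() for i in kids}
    return {node: activation(node) for node in nodes}


# Hypothetical hierarchy loosely following the cricket example on slide 76.
children = {
    "Indian Cricketers": ["M S Dhoni", "Virat Kohli", "Sachin Tendulkar"],
    "Indian Cricket": ["Indian Cricketers"],
    "Cricket": ["Indian Cricket"],
    "Sports": ["Cricket"],
}
initial = {"M S Dhoni": 0.8, "Virat Kohli": 0.2, "Sachin Tendulkar": 0.6}
activations = spread_activation(initial, children, decay=0.5)
```

With a decay below 1, activation shrinks at every level, so more abstract categories such as Sports end up with lower scores than categories close to the mentioned entities.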
  • 78. Challenges – Wikipedia Category Graph o Uneven distribution of nodes in the hierarchy o Many-many for category-subcategory relationships
  • 81. Addressing Uneven Node Distribution [Figure: number of nodes (0 to 300,000) per hierarchical level (1 to 16)]
  • 83. Preferential Path Constraint – Many to Many Links
  • 86. Boosting Common Ancestors o Nodes that intersect domains/subcategories activated by diverse entities
  • 87. Boosting Common Ancestors [Figure: hierarchy with Sports, Cricket, Indian Cricket, Indian Cricketers (M S Dhoni, Virat Kohli, Sachin Tendulkar), Australian Cricket, Australian Cricketers (Michael Clarke, Shane Watson); values shown: 3, 3, 5, 5, 2, 2]
  • 89. Activation Functions o Bell: $A_j = \sum_{i=0}^{n} A_i \times F_j$ o Bell Log: $A_j = \sum_{i=0}^{n} A_i \times FL_j$ o Priority Intersect: $A_j = \sum_{i=0}^{n} A_i \times FL_j \times P_{ji} \times B_j$
  • 90. Evaluation User Study • 37 Users • 30K Tweets Evaluated the top-10 categories of interests derived from the hierarchy • 76% Mean Average Precision • 98% Mean Reciprocal Recall • 70% are not mentioned in tweets 90
  • 91. Tweet Recommendation using Hierarchical Interest Graph o Working on a Tweet recommendation system that utilizes Hierarchical Interest Graph o Preliminary results are “interesting”
  • 92. Conclusion o Focus on “Information” overload instead of “Data” overload. o Personalized Information Filtering o Knowledge-base enabled solutions for challenges in Tweets filtering o Wikipedia hyperlink structure and category graph leveraged for Twitter data filtering o More Research on User Specific Attribute Extraction (Personalization) from Twitter Data o Activity Estimation o Location Prediction
  • 94. kHealth Knowledge-enabled Healthcare Applied to ADHF, Asthma, GI, and Dementia 94
  • 95. kHealth: knowledge-enabled healthcare. Through physical monitoring and analysis, our cellphones could act as an early warning system to detect serious health conditions, and provide actionable information (canary in a coal mine). Empowering Individuals (who are not Larry Smarr!) for their own health
  • 97. Motivational Scenario: A diabetic patient is interested in keeping himself up to date with new information about diabetes. Manually going through news articles, diabetes forums, blogs, etc. - Time consuming - Relevant? Interesting? Informative? Useful? How about all the relevant and important health information aggregated at one platform?
  • 98. Search and Explore: X Controls Cancer, X = diet, treatment, exercise (Pattern-based Approach leveraging domain semantics). Top Health News: Informative news about selected disease. Faceted search (by health topics). Learn about disease (Source: Wikipedia)