1) Amit Sheth presented on how knowledge can help machines better understand big data.
2) He discussed challenges like understanding implicit entities, analyzing drug abuse forums, and understanding city traffic using sensors and text.
3) Sheth argued that knowledge graphs and ontologies can help interpret diverse data types and provide contextual understanding to help solve real-world problems.
1. Amit Sheth
Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing:
Wright State University, Dayton, Ohio
Knowledge will Propel Machine Understanding of Big Data
Keynote at the China Conference on Knowledge Graph and Semantic Computing, Chengdu, China, 26-29 August 2017. Invited talk at the Summer School on Learning in Data Science: Models, Algorithms and Tools, Ahmedabad, 17 July 2017. Colloquium at Fraunhofer Berlin, 23 Aug 2017.
2. Machine Intelligence - we will interpret it much more broadly than Google: “all aspects of machine learning”... We will define it as machines (any system) performing similarly to (nearly emulating) human intelligence.
For this talk, our focus will be limited to (big) data/content, especially how machines will “understand” the data/signals/observations so that they can (help) make timely and good (evidence-based) decisions and take actions.
4. The Brain: Inspiration for Intelligent Processing
• The astounding bandwidth of your senses is 11 million bits of information every second.
• In conscious activities like reading, the human brain distills approximately 40 bits of information per second.
What if we could automate the interpretation of data?
…and do it efficiently and at scale
http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/mobile-white-paper-c11-520862.html
5. Credit: Looi Consulting (http://www.looiconsulting.com/home/enterprise-big-data/)
• In 2008, the data generated exceeded the storage available. Less than 0.5% of data gets analyzed.
• Vast variety of data: text > images > A/V > genome sequencing > IoT.
• Of all the data generated, which data is relevant, and why? Which data to analyze? Which data can offer insight? Who cares about what data? How to get the attention of a human decision maker? What we need is intelligent processing to get actionable, smart data.
A Big Challenge and Opportunity in Recent Times
Figure labels: Scale of Data; Analysis of Data; Different forms of Data; Uncertainty of Data
6. First used in 2004; redefined in 2013: http://wiki.knoesis.org/index.php/Smart_Data.
Smart data makes sense out of big data.
How do we solve problems with real-world complexity,
gather vast amounts of data, diverse knowledge, and come
up with intelligent decisions and timely actions?
Smart Data provides value from harnessing the challenges posed by volume, velocity, variety, and veracity of big data, in turn providing actionable information and improving decision making.
7. Levels of Abstraction
Hyperthyroidism: interpreted data (abductive) [in OWL], e.g., diagnosis
Elevated Blood Pressure: interpreted data (deductive) [in OWL], e.g., threshold
Systolic blood pressure of 150 mmHg: interpreted data (deductive) [in RDF], e.g., label
“150”: raw data [in TEXT], e.g., number
Figure labels: Intellego; SSN Ontology
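The levels of abstraction on this slide can be illustrated with a minimal Python sketch that lifts the raw string “150” to a labelled observation, then to a threshold-based abstraction, then to candidate explanations. The 140 mmHg cut-off, the EXPLAINS table, and all function names are illustrative assumptions; they are not taken from the slide, Intellego, or the SSN Ontology.

```python
# Hypothetical background knowledge: which disorder can explain which abstraction.
# "Disorder X" is a placeholder, not a real medical entity.
EXPLAINS = {
    "Hyperthyroidism": {"Elevated Blood Pressure"},
    "Disorder X": {"Low Blood Pressure"},
}

SYSTOLIC_THRESHOLD_MMHG = 140  # illustrative cut-off, an assumption


def label(raw):
    """Raw text -> labelled observation (the 'label' level)."""
    return {"property": "systolic blood pressure", "value": int(raw), "unit": "mmHg"}


def abstract(observation):
    """Deductive step: apply a threshold rule to a labelled observation."""
    if observation["value"] >= SYSTOLIC_THRESHOLD_MMHG:
        return "Elevated Blood Pressure"
    return "Not Elevated"


def explain(abstraction):
    """Abductive step: list disorders that could explain the abstraction."""
    return sorted(d for d, effects in EXPLAINS.items() if abstraction in effects)


observation = label("150")
abstraction = abstract(observation)
candidates = explain(abstraction)
```

The abductive step returns candidates rather than a conclusion, which mirrors the distinction the slide draws between deductive and abductive interpretation.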
11. Today’s focus is on how computers can better “understand” diverse, multimodal data
The focus is on the role knowledge plays, often complementing/enhancing ML and NLP techniques, in contextual “understanding” of data to help solve the problem for which the data is potentially relevant.
This encompasses topics of information extraction and semantic annotation.
13. Short detour: it is becoming easier to find or create relevant knowledge for a given application
• Existence of large knowledge bases
• Ability to search for and find relevant knowledge bases [WI’13]
• Ability to extract a relevant subset [IEEE Big Data’16]
• Ability to enrich by deriving new concepts and new facts [BIBM’12]
Knowledge graphs are already playing influential roles in many applications involving big data, starting with search [15 years of search & knowledge graphs].
14. Knowledge Graphs become prominent
Linked Open Data: > 9960 datasets, > 149 B triples
38.3 M entities and 8.8 B facts
Google Knowledge Graph: 570 M entities and 18 B facts
Schema.org annotations
LinkedIn knowledge graph
15. Domain-specific knowledge extraction from LOD
Linked Open Data: book-related information?
Filter relevant datasets: Project Gutenberg (books); DBpedia (books, countries, drugs); DBTropes (books, movies, games)
Extract relevant portion of a data set: book-specific DBpedia; book-specific DBTropes
http://knoesis.org/node/2272
http://knoesis.org/node/2793
16. Ability to enrich knowledge graphs
Initial knowledge graph on disorders and symptoms:
• Disorders: Atrial fibrillation, Hypertension, Diabetes
• Symptoms: Fatigue, Syncope, Weight loss, Chest pain, Discomfort in chest, Dizzy, Shortness of Breath, Nausea, Vomiting, Headache, Cough, Weight gain
Patient notes mention: Atrial fibrillation, Hypertension, Diabetes, Chest pain, Weight gain, Discomfort in chest, Cough, Headache, Edema, Shortness of Breath
The initial knowledge base does not know about edema. According to the patient notes, can edema be a symptom of any of the disorders mentioned?
http://knoesis.org/node/2642
17. Knowledge plays an indispensable role in deeper understanding of content
Especially interesting situations:
I. Large amounts of training data are unavailable,
II. The objects to be recognized are complex, such as implicit entities and highly subjective content, and
III. Applications need to use complementary or related data in multiple modalities/media.
18. Challenging Examples/Applications
I. Implicit entity recognition and linking
II. Understanding and analyzing drug abuse related discussions on web forums
III. Understanding city traffic dynamics using sensor and textual observations
IV. Emoji similarity and sense disambiguation
19. Implicit Entity Recognition and Linking
Sujan Perera, Pablo N. Mendes, Adarsh Alex, Amit Sheth, Krishnaprasad Thirunarayan. Implicit Entity Linking in Tweets. Extended Semantic Web Conference. Heraklion, Crete, Greece: Springer; 2016. p. 118-132. http://knoesis.org/node/2644
Sujan Perera, Pablo Mendes, Amit Sheth, Krishnaprasad Thirunarayan, Adarsh Alex, Christopher Heid, Greg Mott. Implicit Entity Recognition in Clinical Documents. 4th Joint Conference on Lexical and Computational Semantics (*SEM) 2015. Denver, CO: Association for Computational Linguistics; 2015. p. 228-238. http://knoesis.org/node/2171
20. Implicit Entity Recognition and Linking
Figure labels: Named Entity Recognition; Relationship Extraction; Entity Linking; Implicit information extraction
24. Understanding and Analyzing Drug Abuse Related Discussions on Web Forums
Cameron, Delroy, Gary A. Smith, Raminta Daniulaityte, Amit P. Sheth, Drashti Dave, Lu Chen, Gaurish Anand, Robert Carlson, Kera Z. Watkins, and Russel Falck. "PREDOSE: a semantic web platform for drug abuse epidemiology using social media." Journal of Biomedical Informatics 46, no. 6 (2013): 985-997. http://knoesis.org/node/2469
25. Codes and Triples (subject-predicate-object)
Code: Suboxone used by injection, negative experience. Triple: Suboxone injection-causes-Cephalalgia
Code: Suboxone used by injection, amount. Triple: Suboxone injection-dosage amount-2mg
Code: Suboxone used by injection, positive experience. Triple: Suboxone injection-has_side_effect-Euphoria
Sentiment Extraction:
-ve: experience sucked, didn’t do shit, bad headache
+ve: feel pretty damn good, feel great
Figure labels: DIVERSE DATA TYPES; ENTITIES; TRIPLES; RELATIONSHIPS; SENTIMENTS; DOSAGE; INTERVAL; PRONOUN; Route of Admin.
I was sent home with 5 x 2 mg Suboxones. I also got a bunch of phenobarbital (I took all 180 mg and it didn't do shit except make me a
walking zombie for 2 days). I waited 24 hours after my last 2 mg dose of Suboxone and tried injecting 4 mg of the bupe. It gave me a
bad headache, for hours, and I almost vomited. I could feel the bupe working but overall the experience sucked.
Of course, junkie that I am, I decided to repeat the experiment. Today, after waiting 48 hours after my last bunk 4 mg injection, I injected
2 mg. There wasn't really any rush to speak of, but after 5 minutes I started to feel pretty damn good. So I injected another 1 mg. That
was about half an hour ago. I feel great now.
Entity Identification:
Suboxone subClassOf Buprenorphine
Subutex subClassOf Buprenorphine
has_slang_term: bupe, bupey
Drug Abuse Ontology (DAO): 83 Classes, 37 Properties
33:1 Buprenorphine
24:1 Loperamide
26. PREDOSE: Smarter Data through Shared Context and Data Integration
Techniques: Ontology; Lexicon; Lexico-ontology; Rule-based Grammar
ENTITIES: Suboxone, Kratom, Heroin
TRIPLES: Suboxone-CAUSE-Cephalalgia
EMOTION: disgusted, amazed, irritated
INTENSITY: more than, a, few of
PRONOUN: I, me, mine, my
SENTIMENT: Im glad, turn out bad, weird
DRUG-FORM: ointment, tablet, pill, film
ROUTE OF ADM: smoke, inject, snort, sniff
SIDEEFFECT: Itching, blisters, flushing, shaking hands, difficulty breathing
DOSAGE: <AMT><UNIT> (e.g. 5mg, 2-3 tabs)
FREQUENCY: <AMT><FREQ_IND><PERIOD> (e.g. 5 times a week)
INTERVAL: <PERIOD_IND><PERIOD> (e.g. several years)
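The DOSAGE, FREQUENCY, and INTERVAL rules on this slide can be approximated with regular expressions. The following is a minimal Python sketch under stated assumptions: the token classes (AMT, UNIT, FREQ_IND, PERIOD_IND, PERIOD) are guessed from the slide's three examples, and the actual PREDOSE grammar is not shown in the source.

```python
import re

# Hypothetical token classes, inferred only from the examples on the slide.
AMT = r"\d+(?:\s*-\s*\d+)?"              # "5", "2-3"
UNIT = r"(?:mg|tabs?|pills?)"            # "mg", "tabs"
FREQ_IND = r"times?\s+(?:a|an|per)"      # "times a"
PERIOD_IND = r"(?:several|few|many)"     # "several"
PERIOD = r"(?:day|week|month|year)s?"    # "week", "years"

PATTERNS = {
    "DOSAGE": re.compile(rf"\b{AMT}\s*{UNIT}\b", re.IGNORECASE),
    "FREQUENCY": re.compile(rf"\b{AMT}\s+{FREQ_IND}\s+{PERIOD}\b", re.IGNORECASE),
    "INTERVAL": re.compile(rf"\b{PERIOD_IND}\s+{PERIOD}\b", re.IGNORECASE),
}


def extract(text):
    """Return every match for each rule, keyed by rule name."""
    return {
        name: [match.group(0) for match in pattern.finditer(text)]
        for name, pattern in PATTERNS.items()
    }


found = extract("I took 5mg, then 2-3 tabs, about 5 times a week for several years.")
```

Here `found` maps each rule name to the matched spans. A lexicon-driven system would presumably load the unit and period vocabularies from the ontology rather than hard-coding them as done here.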
27. Understanding city traffic using sensor and textual observations
Pramod Anantharam, Krishnaprasad Thirunarayan, Surendra Marupudi, Amit Sheth, Tanvi Banerjee. Understanding City Traffic
Dynamics Utilizing Sensor and Textual Observations. In 30th AAAI Conference on Artificial Intelligence (AAAI-16). Phoenix, Arizona;
2016. http://knoesis.org/node/2145
Pramod Anantharam, Krishnaprasad Thirunarayan, Amit Sheth. Traffic Analytics using Probabilistic Graphical Models Enhanced with
Knowledge Bases. In 2nd International Workshop on Analytics for Cyber-Physical Systems (ACS-2013) at SIAM International
Conference on Data Mining (SDM 2013). Austin, Texas; 2013. http://knoesis.org/node/2476
28. Severity of the Traffic Problem
By 2001 over 285 million Indians lived in cities, more than in all North American cities combined (Office of the Registrar General of India 2001) [1].
Figure: Modes of Transportation in Indian Cities
Figure: The Texas Transportation Institute (TTI) Congestion report for the United States
Figure labels: [2011]; 2030
[1] The Crisis of Public Transport in India.
[2] IBM Smarter Traffic.
29. Questions Asked Daily
• What time to start?
• What route to take?
• What is the reason for traffic?
• Wait for some time or re-route?
32. Learning Context-specific LDS Models
Input: speed/travel-time time series data from a link.
Time series data for each hour of day (1-24) for each day of week (Monday – Sunday).
Mean time series computed for each day of week and hour of day along with the medoid.
Step 1: Index data for each link by day of week and hour of day, utilizing the traffic domain knowledge for piece-wise linear approximation
Step 2: Find the “typical” dynamics by computing the mean and choosing the medoid for each hour of day and day of week
Step 3: Learn LDS parameters for the medoid for each hour of day (24 hours) and each day of week (7 days), resulting in 24 × 7 = 168 models for each link
Result: a 7 × 24 matrix of models indexed by day of week di and hour of day hj:
LDS(1,1), LDS(1,2), …, LDS(1,24)
…
LDS(7,1), LDS(7,2), …, LDS(7,24)
168 LDS models for each link; total models learned = 425,712, i.e., 2,534 links × 168 models per link.
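Steps 1 and 2 above can be sketched in a few lines of Python. This is a minimal illustration on made-up speed readings, not the authors' implementation: the record layout and function names are assumptions, the medoid is taken as the observed series with the smallest total Euclidean distance to the others in its group, and Step 3 (learning LDS parameters) is not shown.

```python
from collections import defaultdict
from math import dist


def index_by_context(records):
    """Step 1: group hourly series by (day_of_week, hour_of_day)."""
    buckets = defaultdict(list)
    for day, hour, series in records:
        buckets[(day, hour)].append(series)
    return buckets


def mean_series(group):
    """Point-wise mean of equal-length series."""
    return [sum(values) / len(values) for values in zip(*group)]


def medoid(group):
    """Step 2: the observed series with the smallest total distance to the rest."""
    return min(group, key=lambda s: sum(dist(s, other) for other in group))


# Toy speed readings (km/h): (day_of_week 1-7, hour_of_day 1-24, series for that hour).
records = [
    (1, 8, [30.0, 32.0, 31.0]),
    (1, 8, [28.0, 35.0, 30.0]),
    (1, 8, [60.0, 62.0, 61.0]),
    (2, 8, [40.0, 41.0, 42.0]),
]
typical = {context: medoid(group) for context, group in index_by_context(records).items()}

# One model per link, day of week and hour of day, as counted on the slide.
links, days, hours = 2534, 7, 24
total_models = links * days * hours
```

Unlike the mean, the medoid is always an actually observed series, so a model fitted to it would describe real dynamics rather than an average that may never have occurred.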
33. Tagging Anomalies with LDS Models
Input: speed and travel-time observations from a link.
Models: LDS(1,1), LDS(1,2), …, LDS(1,24); …; LDS(7,1), LDS(7,2), …, LDS(7,24), indexed by day of week di and hour of day hj.
Compute the log likelihood for each hour of observed data (di,hj) under the matching model LDS(di,hj).
Train?
• Yes (training phase): collect the log likelihoods into L = Lik(1,1), Lik(1,2), …, Lik(1,24); …; Lik(7,1), Lik(7,2), …, Lik(7,24). Log likelihood min. and max. values are obtained from the five-number summary.
• No: tag anomalous hours (di,hj) using the log likelihood range (min. likelihood).
Output: anomalies.
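The tagging step can be sketched as follows, assuming the log likelihood of each hour has already been computed by the matching LDS model. The numbers are made up, the function names are hypothetical, and flagging only scores below the training minimum is an assumption based on the “(min. likelihood)” label in the flowchart.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) of a list of numbers."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return (min(values), q1, q2, q3, max(values))


def likelihood_ranges(training):
    """Training phase: keep (min, max) log likelihood per (day, hour) slot."""
    ranges = {}
    for slot, scores in training.items():
        summary = five_number_summary(scores)
        ranges[slot] = (summary[0], summary[4])
    return ranges


def tag_anomalies(ranges, observed):
    """Tagging phase: flag slots whose log likelihood falls below the training minimum."""
    return sorted(slot for slot, score in observed.items() if score < ranges[slot][0])


# Toy log likelihoods keyed by (day_of_week, hour_of_day); values are illustrative.
training = {
    (1, 8): [-12.0, -10.5, -11.0, -9.5, -10.0],
    (1, 9): [-11.0, -10.0, -12.5, -9.0, -10.5],
}
observed = {(1, 8): -10.2, (1, 9): -25.0}
ranges = likelihood_ranges(training)
anomalies = tag_anomalies(ranges, observed)
```

In this toy run only the (1, 9) slot is tagged, because its observed score lies well below anything seen for that slot during training.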
35. Traffic Data: Possible Explanation
Most drivers tend to go 5 km/h over the posted speed limit.
Relatively few drivers go more than 10 km/h over the posted speed limit.
There are situations in a day where drivers are forced to go below the speed limit, e.g., rush hour traffic.
Do these histograms resemble any probability distribution?
37. Extracting City Events from Textual Data
Pramod Anantharam, Payam Barnaghi, Krishnaprasad Thirunarayan, and Amit Sheth. 2015. Extracting City Traffic Events from Social Streams. ACM Trans. Intell. Syst. Technol. 6, 4, Article 43 (July 2015), 27 pages. DOI: 10.1145/2717317. http://doi.acm.org/10.1145/2717317/
Last O night O in O CA... O (@ O Half B-LOCATION Moon I-LOCATION Bay B-LOCATION Brewing I-LOCATION Company O w/ O 8 O others) O http://t.co/w0eGEJjApY O
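The tagged tweet on this slide uses B-/I-/O sequence labels. A minimal Python sketch of turning such labels back into entity spans follows; the function name is hypothetical and the decoder is a generic one, not the tagger or post-processing used in the cited paper.

```python
def decode_bio(tagged_tokens):
    """Group (token, tag) pairs in B-/I-/O notation into (text, type) spans."""
    spans, current, current_type = [], [], None
    for token, tag in tagged_tokens:
        if tag.startswith("I-") and current and tag[2:] == current_type:
            current.append(token)          # continue the open span
            continue
        if current:                        # any other tag closes the open span
            spans.append((" ".join(current), current_type))
            current, current_type = [], None
        if tag.startswith("B-"):           # start a new span
            current, current_type = [token], tag[2:]
        # a stray I- tag with no open span of that type is ignored
    if current:
        spans.append((" ".join(current), current_type))
    return spans


# The tagged tweet from the slide, as alternating token/tag items.
tagged_text = (
    "Last O night O in O CA... O (@ O Half B-LOCATION Moon I-LOCATION "
    "Bay B-LOCATION Brewing I-LOCATION Company O w/ O 8 O others) O "
    "http://t.co/w0eGEJjApY O"
)
items = tagged_text.split()
tagged = list(zip(items[0::2], items[1::2]))
locations = decode_bio(tagged)
```

With the labels exactly as shown on the slide, the decoder yields two LOCATION spans, “Half Moon” and “Bay Brewing”, because “Bay” carries a B- tag that starts a new span.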
41. Understanding: Semantic Annotation of Sensor + Textual Data Utilizing Background Knowledge
Image Credit: http://traffic.511.org/index
Figure labels: Overturned Truck; Explained-by; Domain knowledge in the form of traffic vocabulary; Domain knowledge of traffic flow synthesized from sensor data
Horizontal operator: relating/mapping data from different modalities to a concept (theme) within a spatio-temporal context. Spatial context even includes what it means to have slow traffic for the type of road (http://wiki.knoesis.org/index.php/PCS)
42. How does traffic analysis capture the complexity of the real world?
This example demonstrates use of:
• Multimodal data streams (types of events from text, signatures from sensor data).
• Multiple sources of declarative knowledge/ontologies.
• Semantic annotations and enrichments.
• Rich representation (PGM): learned probabilistic models improved using declarative knowledge.
• A statistical approach to create normalcy models and understand anomalies using historical data, and to explain anomalies using extracted events; declarative knowledge is used to approximate nonlinear models using a collection of linear dynamical systems.
• Actionable information.
43. Emoji Similarity and Sense Disambiguation
Sanjaya Wijeratne, Lakshika Balasuriya, Amit Sheth, Derek Doran. EmojiNet: Building a Machine Readable Sense Inventory for Emoji. In
8th International Conference on Social Informatics (SocInfo 2016). Bellevue, WA, USA; 2016. http://knoesis.org/node/2781
Sanjaya Wijeratne, Lakshika Balasuriya, Amit Sheth, Derek Doran. EmojiNet: An Open Service and API for Emoji Sense Discovery. In
11th International AAAI Conference on Web and Social Media (ICWSM 2017). Montreal, Canada; 2017. http://knoesis.org/node/2819
Sanjaya Wijeratne, Lakshika Balasuriya, Amit Sheth, Derek Doran. A Semantics-Based Measure of Emoji Similarity. In 2017 IEEE/WIC/ACM
International Conference on Web Intelligence (Web Intelligence 2017). Leipzig, Germany; 2017. http://knoesis.org/node/2834
44. 6B messages with emoji are exchanged every day!
https://www.appboy.com/blog/emojis-used-in-777-more-campaigns/
45. Understanding Emoji Meanings
• The ability to automatically process, derive meaning, and interpret text fused with emoji will be essential to understand emoji
• Having access to knowledge bases that capture emoji meaning can play a vital role in representing, contextually disambiguating, and converting emoji into text
• They can help to leverage already existing NLP techniques for processing and better understanding emoji
http://knoesis.org/node/2781
46. EmojiNet: A machine-readable emoji sense inventory
http://knoesis.org/node/2819
Figure: Creation of EmojiNet, with the nonuple of an emoji
Emoji Sense Disambiguation
“The ability to identify the meaning of an emoji in the
context of a message in a computational manner”
Emoji usage in social media with multiple senses
http://knoesis.org/node/2819
Currently there’s no labeled dataset that can be used to solve emoji sense
disambiguation in a supervised learning setting.
48. Tackling Emoji Sense Disambiguation
• Use the Simplified Lesk algorithm to disambiguate emoji sense
http://knoesis.org/node/2819
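Simplified Lesk picks the sense whose definition shares the most words with the surrounding context. The sketch below is a minimal Python illustration under stated assumptions: the two-sense inventory, the sense labels, and the stopword list are made up for the example and are not actual EmojiNet content.

```python
import re

# Small illustrative stopword list; an assumption, not a standard resource.
STOPWORDS = {"a", "an", "the", "of", "to", "in", "on", "and", "or", "is",
             "i", "my", "for", "with", "that", "at", "was", "so"}


def content_words(text):
    """Lower-cased word set with stopwords removed."""
    return set(re.findall(r"[a-z']+", text.lower())) - STOPWORDS


def simplified_lesk(context, sense_definitions):
    """Pick the sense whose definition overlaps most with the context words."""
    context_words = content_words(context)
    return max(
        sense_definitions,
        key=lambda sense: len(context_words & content_words(sense_definitions[sense])),
    )


# Hypothetical sense inventory for a single emoji.
SENSES = {
    "laugh(noun)": "the sound of laughing at a funny joke",
    "cry(verb)": "shed tears because of sadness or pain",
}
best = simplified_lesk("that joke was so funny I could not stop laughing", SENSES)
```

When two senses tie on overlap, `max` returns the first one in the dictionary, so a real system would need an explicit tie-breaking rule such as a most-frequent-sense fallback.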
49. Emoji Similarity
“Given two or more emoji, how to calculate the semantic similarity between them in a computational manner?”
Figure: Top-5 emoji pairs with highest inter-annotator agreement for each ordinal value from 0 to 4 for two questions. Q1 was on the equivalence of the two emoji and Q2 was on the relatedness between them. Ordinal values 0 and 4 represent the least and the highest relatedness/equivalence, respectively.
http://knoesis.org/node/2834
50. Using EmojiNet to measure Emoji Similarity
• Different types of emoji meanings extracted from EmojiNet are used to model the meaning of an emoji (more details on http://knoesis.org/node/2834)
51. Using EmojiNet to measure Emoji Similarity
• We combine distributional semantics of words (learned via word embeddings) and emoji definitions in EmojiNet (external knowledge) to model emoji embeddings
• Our emoji embedding models outperform the previous emoji embedding models (based on purely distributional semantics) by ~10% in a benchmark sentiment analysis task
http://knoesis.org/node/2834
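One simple way to combine word embeddings with emoji definitions is to average the vectors of the words in each definition and compare emoji by cosine similarity. The Python sketch below assumes that approach for illustration only; the 3-dimensional vectors and the definitions are toy values, not learned embeddings or EmojiNet entries, and the authors' models may be built differently.

```python
from math import sqrt

# Toy 3-dimensional word vectors; real embeddings would be learned from text.
WORD_VECTORS = {
    "happy": [0.9, 0.1, 0.0],
    "smile": [0.8, 0.2, 0.1],
    "joy": [0.9, 0.0, 0.1],
    "sad": [0.0, 0.9, 0.1],
    "tears": [0.1, 0.8, 0.2],
}

# Hypothetical emoji definitions standing in for knowledge-base entries.
EMOJI_DEFINITIONS = {
    "grinning_face": "happy smile",
    "face_with_tears_of_joy": "joy happy tears",
    "crying_face": "sad tears",
}


def emoji_embedding(definition):
    """Average the vectors of the definition words that have an embedding."""
    vectors = [WORD_VECTORS[w] for w in definition.split() if w in WORD_VECTORS]
    if not vectors:
        raise ValueError("no known words in definition")
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))


embeddings = {name: emoji_embedding(text) for name, text in EMOJI_DEFINITIONS.items()}
similar = cosine(embeddings["grinning_face"], embeddings["face_with_tears_of_joy"])
dissimilar = cosine(embeddings["grinning_face"], embeddings["crying_face"])
```

With these toy vectors, the two positive emoji end up closer to each other than the grinning and crying faces, which is the kind of ordering a similarity measure should reproduce.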
52. Knowledge-based Approaches and the Resulting Improvements
Problem Domain | Use of Knowledge/Knowledge bases | Problems we could solve that could not be solved (well) w/o knowledge
Implicit Entity Linking | Adapted UMLS definitions for identifying medical entities, and Wikipedia and Twitter data for identifying Twitter entities | Was not solved before
Understanding Drug Abuse-related Discussions | Application of Drug Abuse Ontology along with slang term dictionaries and grammar | Not solved well at all
Traffic Data Analysis | Statistical knowledge extraction and using ontologies for Twitter event extraction | Multi-modal data stream correlation and explanation virtually impossible
Emoji Similarity and Sense Disambiguation | Generation and application of EmojiNet | Emoji interpretation solved much better
53. Take away
“Data alone is not enough”: https://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf
Consider combining data-centric/bottom-up/statistical learning with knowledge-based/top-down techniques
• To improve understanding of simpler content
• To understand complex content and concepts
• To understand heterogeneous/multimodal content