Semantics-based Summarization of
Entities in Knowledge Graphs
Kalpa Gunaratna
Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis)
Wright State University
04.19.2017
Advisors: Prof. Amit Sheth and Prof. Krishnaprasad Thirunarayan
Ph.D Committee: Prof. Keke Chen (Kno.e.sis), Prof. Gong Cheng (Nanjing University, China),
Dr. Edward Curry (NUIG, Ireland), Dr. Hamid R. Motahari-Nezhad (IBM Research, USA)
PhD Dissertation Defense
1. Knowledge on the Web and concise presentation
2. Diversity-aware entity summarization
- Using hierarchical conceptual grouping.
3. Enriching knowledge graphs and entity summarization
- Add type semantics to literals and adapt them in summarization.
4. Relatedness-based multi-entity summarization
- Using quadratic multidimensional optimization techniques.
2
Talk overview
3
What are triples?
dbr:Marie_Curie dbo:Person
rdf:type
RDF/Turtle syntax of the triple
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix dbr: <http://dbpedia.org/resource/> .
@prefix dbo: <http://dbpedia.org/ontology/> .
dbr:Marie_Curie rdf:type dbo:Person .
dbr:Marie_Curie dbo:spouse dbr:Pierre_Curie .
Subject Predicate Object
dbr:Marie_Curie dbr:Pierre_Curie
dbo:spouse
4
Open Datasets – Linked Open Data (LOD) cloud
LOD cloud snapshots: 2007, 2010, 2014, 2017
Image credits: http://lod-cloud.net/
5
Commercial knowledge graphs
Image credit: Google
6
Entity summarization
spouse: Pierre_Curie
birthPlace: Warsaw
almaMater: ESPCI_ParisTech
workInstitutions: University_of_Paris
knownFor: Radioactivity
……
Entity Description
Knowledge Graph
property
value
Summary for
• Quick understanding
• Performing specific tasks
Millions of entities and billions of facts
Triple  dbr:Marie_Curie dbp:spouse dbr:Pierre_Curie
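A minimal sketch of how an entity description can be read off a set of triples: every triple whose subject is the entity contributes one property-value pair (a feature). The triples beyond those shown on the slides (for example the dbo:birthPlace prefix and the reverse spouse triple) are assumptions added for illustration.

```python
# Triples as (subject, predicate, object) tuples, using prefixed names.
# Only the first two triples appear on the slides; the rest are illustrative.
triples = [
    ("dbr:Marie_Curie", "rdf:type", "dbo:Person"),
    ("dbr:Marie_Curie", "dbo:spouse", "dbr:Pierre_Curie"),
    ("dbr:Marie_Curie", "dbo:birthPlace", "dbr:Warsaw"),
    ("dbr:Pierre_Curie", "dbo:spouse", "dbr:Marie_Curie"),
]


def entity_description(entity, triples):
    """Collect the (property, value) pairs whose subject is the given entity."""
    return [(p, o) for s, p, o in triples if s == entity]


features = entity_description("dbr:Marie_Curie", triples)
for prop, value in features:
    print(prop, value)
```

A summary is then a small subset of this feature list, which is what the rest of the talk selects.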
7
Entities and summaries in early 2000s
Rich Media Reference Page
Baltimore 31, Pit 24
http://www.nfl.com
Qadry Ismail and Tony Banks hook up for their third long
touchdown, this time on a 76-yarder to extend the Ravens'
lead to 31-24 in the third quarter.
League: Professional
Teams: Ravens, Steelers
Score: Bal 31, Pit 24
Players: Qadry Ismail, Tony Banks
Event: Touchdown
Produced by: NFL.com
Posted date: 2/02/2000
Patent: Sheth, Amit, David Avant, and Clemens Bertram. "System and method for creating a semantic web and its applications in browsing, searching, profiling, personalization and advertising." U.S. Patent 6,311,194, issued October 30, 2001.
Publication: Sheth, Amit, Clemens Bertram, David Avant, Brian Hammond, Krys Kochut, and Yashodhan Warke. "Managing semantic content for the Web." IEEE Internet Computing 6, no. 4 (2002): 80-87.
Taalee’s Rich Media Reference.
Talk: Semantic Web and Information Brokering: Opportunities, Early Commercialization, and Challenges. Keynote at Workshop on Semantic Web: Models, Architectures and Management. Lisbon, Portugal, Sept. 21, 2000.
8
Entities and summaries – Today’s Web Search
Google Knowledge Graph
(GKG) facilitates Google
Search.
Summarization is one of
their top priorities*.
* Singhal, A. 2012. Introducing the knowledge graph: things, not strings. Official Google Blog, May.
9
[Figure: overview of the approach]
Single Entity Description: FACES Approach (Semantic Expansion, Conceptual Grouping, Ranking and Summary Creation), Semantic Enrichment, Ranking
Multiple Entity Descriptions: Semantic Relatedness, Combinatorial Optimization & Multi-entity Summary Creation
Resources: Schema Knowledge, Lexical Database
Output: User applications and engagements
Entity related structured data on the Web can be concisely and
comprehensively summarized for efficient and convenient information
presentation. This can be achieved through synergistic use of:
(i) Unsupervised knowledge-based methods to conceptually group,
(ii) Information Retrieval-based techniques to intuitively rank,
(iii) Natural Language Processing techniques to semantically enrich
structured data, and
(iv) Combinatorial optimization techniques to handle relatedness of
multiple entities.
10
Thesis statement
12
FACeted Entity Summaries – FACES
Existing approaches focus
on ranking, causing
redundancy in fixed-length
summary
FACES* approach
Ranking
Grouping
FACES produces diversified summaries
from: ranking + grouping
*[AAAI15] Kalpa Gunaratna, Krishnaprasad Thirunarayan, and Amit Sheth. "FACES: Diversity-Aware Entity
Summarization Using Incremental Hierarchical Conceptual Clustering." Twenty-Ninth AAAI Conference on
Artificial Intelligence. 2015.
adds comprehensiveness (through diversity)
knownFor : Radioactivity
field : Chemistry
workInstitutions : University of Paris
spouse : Pierre Curie
rank
13
Concise and comprehensive summary
could be: {f1, f2, f6}
Non-faceted summary: {f4, f7, f5}
Entity - Marie Curie
Feature Set   Facet   Feature   Property           Value
FS            F1      f1        spouse             Pierre_Curie
              F2      f2        birthPlace         Warsaw
                      f3        deathPlace         Passy,_Haute-Savoie
              F3      f4        almaMater          ESPCI_ParisTech
                      f5        workInstitutions   University_of_Paris
                      f6        knownFor           Radioactivity
                      f7        field              Chemistry
o Number of groups in a feature set is unknown a priori.
– Hence, supervised techniques do not work.
o We want to identify conceptually similar groups.
– E.g., field: chemistry and almaMater: ESPCI_ParisTech to be in
the same group.
o We adapt Cobweb* to get facets, which is:
– Conceptual
– Incremental
– Hierarchical
14
Grouping (clustering) in FACES
* Fisher, D. H. 1987. Knowledge acquisition via incremental conceptual clustering. Machine
learning 2, 2, 139-172.
o Group similar themed features (i.e., conceptually similar).
o Each feature has only two attribute-value pairs.
– Example, birthPlace-Honolulu. Pairs are (property, birthPlace)
and (value, Honolulu).
o Expand property and value of each feature.
o Use the expanded feature set for clustering.
15
How to use Cobweb in FACES
o Get the property label.
o Pre-process
– Remove stop words
– CamelCase, spaces, punctuation processing
– Tokenize
o Get hypernyms for the tokens using a lexical database (e.g.,
WordNet).
o Add hypernyms to the original set of tokens + label and create
the WordSet WS.
16
Property expansion
Expansion example: birthPlace -> property WordSet
{birthPlace, birth, place, beginning, point, area, locality}
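The property expansion steps can be sketched as follows. This is only an illustration: TOY_HYPERNYMS is a hypothetical hand-written table standing in for a lexical database such as WordNet, and the stop-word list is likewise an assumption.

```python
import re

# Hypothetical stand-in for a lexical database: token -> hypernyms.
TOY_HYPERNYMS = {
    "birth": ["beginning"],
    "place": ["point", "area", "locality"],
}
STOP_WORDS = {"of", "the", "a", "an", "in"}  # assumed stop-word list


def tokenize_label(label):
    """Split a camelCase, spaced, or punctuated label into lowercase tokens."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", label)
    tokens = re.split(r"[^A-Za-z0-9]+", spaced)
    return [t.lower() for t in tokens if t and t.lower() not in STOP_WORDS]


def expand_property(label, hypernyms=TOY_HYPERNYMS):
    """Build the property WordSet: label + tokens + hypernyms of the tokens."""
    word_set = {label}
    for token in tokenize_label(label):
        word_set.add(token)
        word_set.update(hypernyms.get(token, []))
    return word_set


property_ws = expand_property("birthPlace")
print(sorted(property_ws))
```

With the toy table above, the result contains the same seven terms as the birthPlace example on the slide.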
o Get the object URI.
o Get the types (ontology classes) for the URI.
o Pre-process types
– Remove stop words
– CamelCase, spaces, punctuation processing
– Tokenize
o Get hypernyms for types.
o Add hypernyms to the original set of type labels + tokens to
create WordSet WS.
17
Value expansion
Expansion example: Honolulu -> value WordSet
{place, PopulatedPlace, populated, point, area, locality}
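Value expansion differs from property expansion in that it starts from the ontology types of the object URI. A rough sketch, assuming hypothetical lookup tables (TOY_TYPES, TOY_HYPERNYMS) in place of the knowledge graph and the lexical database; keeping only compound type labels verbatim is also an assumption, chosen so the output mirrors the Honolulu example.

```python
import re

# Hypothetical lookups standing in for the knowledge graph and lexical database.
TOY_TYPES = {"dbr:Honolulu": ["Place", "PopulatedPlace"]}
TOY_HYPERNYMS = {"place": ["point", "area", "locality"]}


def split_camel_case(label):
    """Split a CamelCase type label into lowercase tokens."""
    return [t.lower() for t in re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", label)]


def expand_value(uri, types=TOY_TYPES, hypernyms=TOY_HYPERNYMS):
    """Build the value WordSet from the types of the object URI."""
    word_set = set()
    for type_label in types.get(uri, []):
        tokens = split_camel_case(type_label)
        if len(tokens) > 1:
            word_set.add(type_label)  # keep compound labels such as PopulatedPlace
        for token in tokens:
            word_set.add(token)
            word_set.update(hypernyms.get(token, []))
    return word_set


value_ws = expand_value("dbr:Honolulu")
print(sorted(value_ws))
```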
18
WordSet examples
Feature (f): region:Illinois
  Property expansion: {region, location, domain}
  Value expansion: {place, PopulatedPlace, populated, point, area, locality}
  WordSet (WS): {region, location, domain, PopulatedPlace, populated, place, point, area, locality}
Feature (f): birthPlace:Honolulu
  Property expansion: {birthPlace, birth, place, beginning, point, area, locality}
  Value expansion: {place, PopulatedPlace, populated, point, area, locality}
  WordSet (WS): {birthPlace, birth, place, beginning, point, area, locality, PopulatedPlace, populated}
Feature (f): vicePresident:Joe_Biden
  Property expansion: {vicePresident, vice, president, corporate executive, head of state}
  Value expansion: {person, OfficeHolder, office, holder, organism, flesh, human body, occupation, job, staff, possessor, owner}
  WordSet (WS): {vicePresident, vice, president, corporate executive, head of state, person, OfficeHolder, office, holder, organism, flesh, human body, occupation, job, staff, possessor, owner}
Feature (f): predecessor:George_W._Bush
  Property expansion: {predecessor, forerunner, precursor}
  Value expansion: {person, officeholder, office, holder, organism, flesh, human body, occupation, job, staff, possessor, owner}
  WordSet (WS): {predecessor, forerunner, precursor, person, officeholder, office, holder, organism, flesh, human body, occupation, job, staff, possessor, owner}
Original sets are in orange color.
19
WordSet – How it helps for better grouping
[Figure: birthPlace:Honolulu is grouped with region:Illinois rather than with vicePresident:Joe Biden]
region:Illinois: {region, location, domain, PopulatedPlace, place, point, area, locality}
birthPlace:Honolulu: {birthPlace, birth, place, PopulatedPlace, point, area, locality, beginning}
vicePresident:Joe Biden: {vicePresident, vice, president, corporate executive, head of state, person, OfficeHolder, human body, occupation, job, staff}
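To make the grouping intuition concrete, the overlap between the three WordSets above can be measured directly. Note that this uses plain Jaccard overlap purely as an illustration; FACES itself groups features with Cobweb, not with this measure.

```python
def jaccard(a, b):
    """Share of terms two WordSets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


region = {"region", "location", "domain", "PopulatedPlace",
          "place", "point", "area", "locality"}
birth_place = {"birthPlace", "birth", "place", "PopulatedPlace",
               "point", "area", "locality", "beginning"}
vice_president = {"vicePresident", "vice", "president", "corporate executive",
                  "head of state", "person", "OfficeHolder", "human body",
                  "occupation", "job", "staff"}

print(jaccard(region, birth_place))      # shared place-related terms
print(jaccard(region, vice_president))   # no shared terms
```

The expanded terms (place, point, area, locality, PopulatedPlace) are what make the two location features look alike, even though the raw labels "region" and "birthPlace" share nothing.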
20
Ranking intuition
workPlace: Washington D.C. residence: White House
New York City Beavercreek
Popular values
Informative features
o Influenced by tf-idf.
o Inf(f): Informativeness of the feature (Uniqueness).
– Example: residence-WhiteHouse
o Po(v): Popularity of the value of the feature (frequent).
– Example: WhiteHouse
o Rank(f): Higher informativeness and popularity.
21
Ranking features
N is the total number of entities
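The ranking equations were shown as images and are not part of this transcript. The sketch below is one tf-idf style reading of the bullets, assuming Inf(f) = log(N / number of entities having f), Po(v) = log(number of triples with value v), and Rank(f) = Inf(f) × Po(v). The toy entity descriptions are hypothetical.

```python
import math

# Hypothetical toy dataset: entity -> set of (property, value) features.
descriptions = {
    "Barack_Obama": {("residence", "White_House"), ("type", "Person")},
    "Joe_Biden": {("workPlace", "White_House"), ("type", "Person")},
    "Marie_Curie": {("birthPlace", "Warsaw"), ("type", "Person")},
    "Pierre_Curie": {("birthPlace", "Paris"), ("type", "Person")},
}


def informativeness(feature, descriptions):
    """Inf(f): rarer features (held by fewer entities) score higher."""
    n = len(descriptions)  # N, the total number of entities
    holders = sum(1 for feats in descriptions.values() if feature in feats)
    return math.log(n / holders)


def popularity(value, descriptions):
    """Po(v): values that occur in more triples score higher."""
    count = sum(1 for feats in descriptions.values() for _, v in feats if v == value)
    return math.log(count)


def rank(feature, descriptions):
    """Rank(f): assumed here to be the product of the two scores."""
    return informativeness(feature, descriptions) * popularity(feature[1], descriptions)


for f in [("residence", "White_House"), ("type", "Person"), ("birthPlace", "Warsaw")]:
    print(f, round(rank(f, descriptions), 3))
```

In this toy data, residence:White_House is both rare as a feature and frequent as a value, so it ranks above a feature shared by every entity and above a feature whose value occurs only once.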
22
Faceted entity summary creation process
(1) (2) (3) (4) (5)
Semantic expansion Cobweb tf-idf based
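One way to picture the last step of the process is a round-robin over facets: take the best-ranked feature of each facet in turn until the summary is full. This is a sketch under that assumption; the scores below are hypothetical and the facet assignment follows the Marie Curie example.

```python
def faceted_summary(facets, scores, k):
    """Pick up to k features, taking the best remaining one from each facet in turn.

    Assumes every facet is non-empty.
    """
    ranked = [sorted(f, key=lambda x: scores[x], reverse=True) for f in facets]
    ranked.sort(key=lambda f: scores[f[0]], reverse=True)  # best facet first
    summary = []
    depth = 0
    while len(summary) < k and any(depth < len(f) for f in ranked):
        for facet in ranked:
            if depth < len(facet) and len(summary) < k:
                summary.append(facet[depth])
        depth += 1
    return summary


# Facets F1, F2, F3 from the Marie Curie example; scores are made up.
facets = [["f1"], ["f2", "f3"], ["f4", "f5", "f6", "f7"]]
scores = {"f1": 0.5, "f2": 0.6, "f3": 0.2, "f4": 0.7,
          "f5": 0.8, "f6": 0.9, "f7": 0.85}

top_by_rank_only = sorted(scores, key=scores.get, reverse=True)[:3]
faceted = faceted_summary(facets, scores, 3)
print(top_by_rank_only)  # all three come from facet F3
print(faceted)           # one feature per facet
```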
o The gold standard contains ideal summaries generated by 15
judges.
o An ideal summary for entity e is denoted by SummI. Agreement
is the overlap between the ideal summaries.
o Summary quality is the overlap between the computer-generated
summary (Summ(e)) and the ideal summaries for
the entity.
23
Evaluation
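The agreement and quality equations are not in this transcript. A plausible sketch, assuming agreement is the average pairwise overlap between ideal summaries and quality is the average overlap between the system summary and each ideal summary (both averaging choices are assumptions):

```python
from itertools import combinations


def agreement(ideal_summaries):
    """Average pairwise overlap between the judges' ideal summaries for one entity."""
    pairs = list(combinations(ideal_summaries, 2))
    return sum(len(a & b) for a, b in pairs) / len(pairs)


def quality(system_summary, ideal_summaries):
    """Average overlap between the system summary and each ideal summary."""
    return sum(len(system_summary & s) for s in ideal_summaries) / len(ideal_summaries)


# Hypothetical ideal summaries from three judges, as sets of feature ids.
ideals = [{"f1", "f2", "f6"}, {"f1", "f2", "f7"}, {"f1", "f3", "f6"}]
system = {"f1", "f2", "f6"}

print(agreement(ideals))        # (2 + 2 + 1) / 3
print(quality(system, ideals))  # (3 + 2 + 2) / 3
```

Under this reading, agreement acts as a reference point for the quality scores: it shows how much the judges overlap with one another.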
24
Evaluation cont.
• 50 entities in the gold standard. 69 users participated in
the user preference evaluation.
• On average, 44 features per entity.
Evaluation 1 – Gold Standard
System      Avg. Quality (length 5)   FACES % Gain   Time/Entity   Avg. Quality (length 10)   FACES % Gain
FACES       1.4314                    NA             0.76 sec      4.3350                     NA
RELIN       0.4981                    187 %          10.96 sec     2.5188                     72 %
RELINM      0.6008                    138 %          11.08 sec     3.0906                     40 %
SUMMARUM    1.2249                    17 %           NA            3.4207                     27 %
Avg. Agreement: 1.9168 (length 5), 4.6415 (length 10)
Evaluation 2 – User Preference
System      Study 1   Study 2
FACES       84 %      54 %
RELIN       NA        NA
RELINM      16 %      16 %
SUMMARUM    NA        30 %
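The slide does not say how "FACES % Gain" is computed. Assuming it is the relative improvement of FACES over each baseline, the reported percentages can be reproduced from the average quality values:

```python
def percent_gain(faces_quality, baseline_quality):
    """Relative improvement of FACES over a baseline, in percent (assumed definition)."""
    return 100.0 * (faces_quality - baseline_quality) / baseline_quality


# Average quality values from the table (summary length 5, then length 10).
length_5 = {"RELIN": 0.4981, "RELINM": 0.6008, "SUMMARUM": 1.2249}
length_10 = {"RELIN": 2.5188, "RELINM": 3.0906, "SUMMARUM": 3.4207}

gains_5 = {name: round(percent_gain(1.4314, q)) for name, q in length_5.items()}
gains_10 = {name: round(percent_gain(4.3350, q)) for name, q in length_10.items()}
print(gains_5)
print(gains_10)
```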
25
[Figure: example summaries for Marie Curie with user preference]
FACES (54 %)
birthPlace: Warsaw
workInstitutions: University of Paris
field: Physics
spouse: Pierre Curie
deathPlace: Passy, Haute-Savoie
RELINM (16 %)
isPrimaryTopicOf: Marie_Curie
wasDerivedFrom: oldid=547107936
knownFor: Polonium
almaMater: ESPCI
deathPlace: Passy, Haute-Savoie
SUMMARUM (30 %)
birthPlace: Poland
birthPlace: Warsaw
birthPlace: Russian_Empire
field: Physics
field: Chemistry
o A lot of information is encoded in literal format.
– 1608 datatype properties (literal based) vs. 1103 object
properties (entity based) in DBpedia (2016-04).
o Many literals can be easily typed for proper interpretation and
use.
– Example: in DBpedia, http://dbpedia.org/property/location has
~100,000 unique literals that can be directly mapped to entities.
o Added semantics is useful in practical applications such as
summarization, property alignment, data integration, and
dataset profiling.
27
Typing literals (enriching) in knowledge graphs
o FACES can only handle object property based features.
o Our contributions*:
1. Compute types for the values of datatype property based
features (data enrichment) - novel contribution.
2. Adapt and improve ranking algorithms (summarization).
28
Enrichment for entity summarization
*[ESWC16] Kalpa Gunaratna, Krishnaprasad Thirunarayan, Amit Sheth, and Gong Cheng. 'Gleaning Types
for Literals in RDF Triples with Application to Entity Summarization'. In Proc. 13th Extended Semantic
Web Conference (ESWC 2016), 2016, pages 85-100.
Barack Obama
Person
type
“Michelle Obama”
type
String
FACES
Partitioning
o Focus of the literal is not clear unlike URIs.
o May contain several entities or labels matching ontology
classes.
29
Challenges
44th President of the United States
option 1
option 2 option 3
30
Enrichment outcomes
dbr:Barack_Obama rdf:type dbo:Politician
dbr:Barack_Obama dbp:vicePresident dbr:Joe_Biden
dbr:Barack_Obama dbp:shortDescription “44th President of the United States”^^xsd:string (computed type: dbo:President)
dbr:Calvin_Coolidge rdf:type dbo:Politician
dbr:Calvin_Coolidge dbo:orderInOffice “48th Governor of Massachusetts”^^xsd:string (computed type: dbo:Governor)
31
Process flow
Input literal
Pre-processing: N-grams Extractor, Phrase Identifier, Head-word Detector, Entity Spotter
Type processing: Head-word to Class Label Matcher, N-grams + Head-word to Class Label Matcher, Head-word Semantic Matcher, Primary Type Filter
Output types for the literal
32
Type computation algorithm flow
For non-numeric literals:
1. Extract n-grams & find focus term.
2. Match focus term & ontology class. Yes: TYPE. No: continue.
3. Match n-grams, focus term & ontology class. Yes: TYPE. No: continue.
4. Match n-grams, focus term & entity label. Yes: get entity type, TYPE. No: continue.
5. Process focus term & ontology classes: get similarity of focus term and all ontology classes, get max similarity ontology class, TYPE.
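The cascade in the flow above can be sketched as a chain of fallbacks. Everything concrete here is an assumption made for illustration: the toy class list and entity index, the crude head-word heuristic, and the use of string similarity (difflib) in place of the semantic similarity step.

```python
import difflib

# Hypothetical toy data; a real system would query the ontology and an entity index.
ONTOLOGY_CLASSES = ["President", "Governor", "Politician", "Place", "Person"]
ENTITY_TYPES = {"united states": "Country"}  # entity label -> type


def ngrams(tokens, max_n=3):
    """All word n-grams up to length max_n."""
    return [" ".join(tokens[i:i + n]) for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]


def focus_term(tokens):
    """Crude head-word heuristic: token before the first 'of', else the last token."""
    if "of" in tokens[1:]:
        return tokens[tokens.index("of", 1) - 1]
    return tokens[-1]


def compute_type(literal):
    tokens = literal.lower().split()
    if not tokens or literal.strip().replace(".", "", 1).isdigit():
        return None  # numeric literals are not typed by this flow
    head = focus_term(tokens)
    classes = {c.lower(): c for c in ONTOLOGY_CLASSES}
    # Focus term matches an ontology class label.
    if head in classes:
        return classes[head]
    # An n-gram containing the focus term matches a class label.
    grams = ngrams(tokens)
    for gram in grams:
        joined = gram.replace(" ", "")
        if head in gram.split() and joined in classes:
            return classes[joined]
    # An n-gram matches an entity label: use that entity's type.
    for gram in grams:
        if gram in ENTITY_TYPES:
            return ENTITY_TYPES[gram]
    # Fall back to the class most similar to the focus term.
    return max(ONTOLOGY_CLASSES,
               key=lambda c: difflib.SequenceMatcher(None, head, c.lower()).ratio())


print(compute_type("44th President of the United States"))
print(compute_type("48th Governor of Massachusetts"))
```

The ordering matters: the cheap exact matches run first and the similarity search is used only when nothing else produces a type.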
o Type Set TS(v) is the generated set of types for the value v.
33
Evaluation – type generation metrics
n is the total number of features.
o DBpedia Spotlight is taken as the baseline and there were 1117
unique property-value pairs (features).
o 118 pairs (consisting of labelling properties and noisy features)
were removed.
34
Evaluation
                Mean Precision (MP)   Any Mean Precision (AMP)   Coverage
Our approach    0.8290                0.8829                     0.8529
Baseline        0.4867                0.5825                     0.5533
35
Ranking Literals
o Ranking equations in the FACES approach do not work.
– Two literals can be unique even if their types and the main
entities are the same.
• Example: “United States President” vs. “President of the United
States”. It is not desirable to search using the whole phrase
(syntactically different but semantically the same).
– A literal can contain several entities. Which one to choose?
36
Ranking datatype property features
o Humans recognize popular entities.
o Entities can be mentioned in literals with variations.
o Proposal: Use the popular entities in literals and not the
literals themselves for ranking.
o Functions
– Function ES(v) returns all entities present in the value v.
– Function max(ES(v)) returns the most popular entity in ES(v).
37
Intuitions for ranking
v = “44th President of the United States”
ES(v) = {db:President, db:United States}
max(ES(v)) = db:United States
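A small sketch of the two functions, assuming entity spotting by simple label lookup. The label index and the popularity counts are made-up placeholders; `most_popular` plays the role of max(ES(v)).

```python
# Hypothetical label index and popularity counts (made-up numbers).
ENTITY_LABELS = {"president": "db:President", "united states": "db:United States"}
POPULARITY = {"db:President": 1200, "db:United States": 450000}


def ES(v):
    """Return all entities whose label occurs in the literal value v."""
    text = v.lower()
    return {uri for label, uri in ENTITY_LABELS.items() if label in text}


def most_popular(entities):
    """max(ES(v)): the most popular entity among those spotted, or None."""
    return max(entities, key=lambda e: POPULARITY.get(e, 0)) if entities else None


v = "44th President of the United States"
print(ES(v))
print(most_popular(ES(v)))
```

Because ranking uses the spotted entity rather than the raw string, "United States President" and "President of the United States" end up being treated alike.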
38
Modified ranking equations
Take the frequent entity for informativeness of feature
and popularity of value.
o Aggregate feature ranking scores for each facet.
o Rank facets based on the aggregated scores.
39
Facet ranking
Rank(f) is the original function and Rank(f)’ is the modified one for datatype property based features.
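Facet ranking can be sketched as below. The slide only says the feature scores are aggregated per facet; using the sum as the aggregate is an assumption, so it is left as a parameter. The scores are hypothetical.

```python
def rank_facets(facets, feature_rank, aggregate=sum):
    """Order facets by the aggregated ranking scores of their features."""
    scored = [(aggregate(feature_rank[f] for f in facet), facet) for facet in facets]
    return [facet for score, facet in sorted(scored, key=lambda x: x[0], reverse=True)]


# Hypothetical feature scores (Rank(f) or Rank(f)' depending on the feature kind).
feature_rank = {"f1": 0.5, "f2": 0.6, "f3": 0.2, "f4": 0.7, "f5": 0.4}
facets = [["f1"], ["f2", "f3"], ["f4", "f5"]]

ordered = rank_facets(facets, feature_rank)
print(ordered)
```

Passing `aggregate=max` instead would rank facets by their single best feature; which choice the original system makes is not stated in the transcript.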
40
FACES-E entity summary generation
(1) (2) (3) (4) (5)
Semantic expansion
+
Type computation
Cobweb tf-idf based
o The gold standard consists of 20 random entities used in FACES
taken from DBpedia 3.9 and 60 random entities taken from
DBpedia 2015-04.
o 17 human users created ideal summaries (total of 900).
41
Evaluation – FACES-E summary generation
System      Avg. Quality (length 5)   % Gain   Avg. Quality (length 10)   % Gain
FACES-E     1.5308                    --       4.5320                     --
RELIN       0.9611                    59 %     3.0988                     46 %
RELINM      1.0251                    49 %     3.6514                     24 %
Avg. Agreement: 2.1168 (length 5), 5.4363 (length 10)
43
Single vs. Multiple entity summarization
Single entity summarization (Apple Computer, Steve Jobs): importance and diversity within each summary
Multi-entity summarization (Apple Computer, Steve Jobs): importance and diversity within each summary, plus improved relatedness between the summaries
44
Motivating example
Within one month of the iPod nano and iTunes phone special event, Apple Computer
announced today another special event to be held on October 12. It is to be held at
the California Theater in downtown San Jose, California. The invitation reads, “One
more thing …”, the teasing tagline of Steve Jobs.
founders Steve_Jobs
product IPod
locationCity California
industry Consumer_electronics
after Tim_Cook
knownFor Microcomputer_revolution
title Apple_Inc.
birthPlace California
45
Relatedness-based multi-entity summarization
Entity 1 Entity 2 Entity 3 Entity 4
Problem: Given a collection of entities, we select features belonging to these entities
maximizing inter-entity relatedness, intra-entity importance, and
intra-entity diversity of features.
46
We have 3 optimization objectives
Image credit: https://www.flickr.com/photos/68751915@N05/6551525739
Maximize intra-entity feature importance
Maximize intra-entity feature diversity
Maximize inter-entity feature relatedness
47
Formalizing Quadratic Multidimensional Knapsack Problem (QMKP)
Variable x denotes whether the feature is selected or not
We want to maximize the profit considering each knapsack size
o We measure the importance of features using the
informativeness and popularity measure used in FACES.
o Within each entity summary, features should have higher
importance.
– Hence, we use a positive weight
48
1. Importance of features
o Features consist of properties and values.
o For properties, we use the expansion method used in FACES
and calculate the Jaccard similarity for properties.
o For values, we measure their relatedness using graph-based
co-appearance for values. We use RDF2Vec model.
o We combine the two measures and get the relatedness
between two features.
49
How to measure relatedness of features
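A sketch of the combined measure, assuming the property part is the Jaccard similarity of the expanded property WordSets and the value part is the cosine similarity of value embeddings. The three-dimensional vectors are toy placeholders for RDF2Vec embeddings, and the weighted average (alpha) is an assumed way of combining the two scores.

```python
import math


def jaccard(a, b):
    """Jaccard similarity of two property WordSets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(u, v):
    """Cosine similarity of two value vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def feature_relatedness(prop_ws_a, prop_ws_b, vec_a, vec_b, alpha=0.5):
    """Weighted combination of property similarity and value similarity."""
    return alpha * jaccard(prop_ws_a, prop_ws_b) + (1 - alpha) * cosine(vec_a, vec_b)


# Toy inputs: property WordSets and made-up value vectors.
location_city = {"locationCity", "location", "city", "place"}
birth_place = {"birthPlace", "birth", "place"}
california_vec = [0.9, 0.1, 0.0]

score = feature_relatedness(location_city, birth_place, california_vec, california_vec)
print(round(score, 3))
```

Here the two features share the value California (cosine 1.0) and one property term ("place"), so they come out as related even though they belong to different entities.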
o Each entity summary should have diverse features.
– (i) Penalize relatedness score with a negative weight (i.e.,
maximize diversity).
– (ii) Modify candidate feature selection to improve diversity.
50
2. Diversity of features within summaries
o Maximize profit for related features between summaries.
o Use a positive weight.
51
3. Relatedness of features between summaries
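Putting the three objectives together, the profit of a candidate selection can be sketched as below: importance enters with a positive weight, relatedness between features of the same entity is penalized, and relatedness between features of different entities is rewarded. The weights, the toy relatedness function, and the data are assumptions; the exact QMKP formulation is not part of this transcript.

```python
from itertools import combinations


def objective(selected, importance, relatedness, w_imp=1.0, w_div=1.0, w_rel=1.0):
    """Profit of a selection; `selected` is a list of (entity, feature) pairs."""
    total = w_imp * sum(importance[item] for item in selected)
    for a, b in combinations(selected, 2):
        r = relatedness(a, b)
        if a[0] == b[0]:
            total -= w_div * r  # same entity: related features reduce diversity
        else:
            total += w_rel * r  # different entities: related features are rewarded
    return total


def feasible(selected, capacity):
    """Each entity is one knapsack holding at most `capacity` features."""
    counts = {}
    for entity, _ in selected:
        counts[entity] = counts.get(entity, 0) + 1
    return all(c <= capacity for c in counts.values())


def same_value(a, b):
    """Toy relatedness: 1.0 when two features share a value, else 0.0."""
    return 1.0 if a[1].split(":")[1] == b[1].split(":")[1] else 0.0


apple_city = ("Apple_Computer", "locationCity:California")
apple_industry = ("Apple_Computer", "industry:Consumer_electronics")
jobs_birth = ("Steve_Jobs", "birthPlace:California")
importance = {apple_city: 1.0, apple_industry: 1.0, jobs_birth: 1.0}

related_pick = objective([apple_city, jobs_birth], importance, same_value)
unrelated_pick = objective([apple_industry, jobs_birth], importance, same_value)
print(related_pick, unrelated_pick)
```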
o GRASP – Greedy Randomized Adaptive Search Procedure.
o GRASP provides an approximate solution to QKP.
– We simply use it for multiple constraints to suit QMKP.
o We use a memory-based GRASP implementation version*.
– Construction phase
• Random selection of features (also using a greedy ranking function)
– Local search phase
• Tries to improve answer by replacing selected features
– Update the best solution
o To improve intra-entity diversity of features, we modified Restricted
Candidate List (RCL) of GRASP.
– We use a threshold to filter related features of the same entity.
52
GRASP – for combinatorial optimization
* Yang, Zhen, Guoqing Wang, and Feng Chu. "An effective GRASP and tabu search for the 0–1 quadratic knapsack problem."
Computers & Operations Research 40, no. 5 (2013): 1176-1185.
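A compact sketch of the GRASP loop described above: greedy randomized construction from a restricted candidate list (RCL), a swap-based local search, and a best-solution update. This is a plain GRASP, not the memory-based variant cited on the slide, and the threshold, RCL size, scoring weights, and toy data are all assumptions.

```python
import random
from itertools import combinations


def score(selected, importance, related):
    """Importance plus cross-entity relatedness minus same-entity relatedness."""
    total = sum(importance[x] for x in selected)
    for a, b in combinations(selected, 2):
        total += -related(a, b) if a[0] == b[0] else related(a, b)
    return total


def grasp(candidates, importance, related, capacity, iterations=50,
          rcl_size=3, diversity_threshold=0.8, seed=0):
    rng = random.Random(seed)
    entities = sorted({e for e, _ in candidates})
    best, best_score = [], float("-inf")
    for _ in range(iterations):
        # Construction phase: greedy randomized choice from the RCL.
        solution = []
        for entity in entities:
            pool = [c for c in candidates if c[0] == entity]
            while pool and sum(1 for s in solution if s[0] == entity) < capacity:
                # Drop candidates too related to chosen features of the same entity.
                pool = [c for c in pool
                        if all(related(c, s) < diversity_threshold
                               for s in solution if s[0] == entity)]
                if not pool:
                    break
                pool.sort(key=lambda c: score(solution + [c], importance, related),
                          reverse=True)
                choice = rng.choice(pool[:rcl_size])
                solution.append(choice)
                pool.remove(choice)
        # Local search phase: swap a selected feature for another of the same entity.
        improved = True
        while improved:
            improved = False
            current = score(solution, importance, related)
            for i, old in enumerate(solution):
                for new in candidates:
                    if new[0] != old[0] or new in solution:
                        continue
                    trial = solution[:i] + [new] + solution[i + 1:]
                    if score(trial, importance, related) > current:
                        solution, improved = trial, True
                        break
                if improved:
                    break
        # Update the best solution found so far.
        s = score(solution, importance, related)
        if s > best_score:
            best, best_score = solution, s
    return best, best_score


def same_value(a, b):
    """Toy relatedness: 1.0 when two features share a value, else 0.0."""
    return 1.0 if a[1].split(":")[1] == b[1].split(":")[1] else 0.0


candidates = [
    ("Apple_Computer", "locationCity:California"),
    ("Apple_Computer", "industry:Consumer_electronics"),
    ("Steve_Jobs", "birthPlace:California"),
    ("Steve_Jobs", "knownFor:Microcomputer_revolution"),
]
importance = {c: 1.0 for c in candidates}
best, best_score = grasp(candidates, importance, same_value, capacity=1)
print(best, best_score)
```

With one feature per entity, the search settles on the two California features, since they are the only pair that adds cross-entity relatedness to the profit.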
o 15 judges, 2 datasets, 30 news items, and 850 question
instances.
o Qualitative evaluation.
o Quantitative evaluation
53
Evaluation
o Faceted entity summarization.
– Conceptual (abstract) grouping of features/triples.
– tf-idf based ranking.
o Type computation for literals to enrich knowledge graphs.
– Improve coverage for faceted entity summarization.
o Relatedness-based multi-entity summarization.
54
Conclusion
Entity related structured data on the Web can be concisely and comprehensively
summarized for efficient and convenient information presentation. This can be achieved
through synergistic use of:
(i) Unsupervised knowledge-based methods to conceptually group,
(ii) Information Retrieval-based techniques to intuitively rank,
(iii) Natural Language Processing techniques to semantically enrich structured data, and
(iv) Combinatorial optimization techniques to handle relatedness of multiple entities.
Thesis Statement
55
Research areas
Semantic Web, Linked Data &
Semantic Computing
Publications
Tool
Patent
o Conference Papers
[WWW 2017] Hamid R. Motahari Nezhad, Kalpa Gunaratna, and Juan Cappi. “eAssistant: Cognitive Assistance for
Identification and Auto-Triage of Actionable Conversations.” Proceedings of the 26th International Conference on World
Wide Web Companion. International World Wide Web Conferences Steering Committee, 2017.
[ESWC 2016] Kalpa Gunaratna, Krishnaprasad Thirunarayan, Amit Sheth, and Gong Cheng. “Gleaning Types for Literals in
RDF Triples with Application to Entity Summarization”. In Proc. 13th Extended Semantic Web Conference (ESWC 2016),
2016, pages 85-100. DOI=10.1007/978-3-319-34129-3_6
[AAAI 2015] Kalpa Gunaratna, Krishnaprasad Thirunarayan, and Amit Sheth. “FACES: Diversity-Aware Entity Summarization
using Incremental Hierarchical Conceptual Clustering”. 29th AAAI Conference on Artificial Intelligence (AAAI 2015), 2015.
[Semantics 2013] Kalpa Gunaratna, Krishnaprasad Thirunarayan, Prateek Jain, Amit Sheth and Sanjaya Wijeratne. “A
Statistical and Schema Independent Approach for Identifying Equivalent Properties on Linked Data.” In Proc. 9th
International Conference on Semantic Systems, ACM, 2013, pages 33-40. DOI=10.1145/2506182.2506187.
o Articles
[W 2014] Kalpa Gunaratna, Sarasi Lalithsena and Amit Sheth. “Alignment and Dataset Identification of Linked Data in
Semantic Web.” Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery (2014).
o Patents
[P 2015] Kalpa Gunaratna, Hamid Motahari. Adaptive Learning of Actionable Statements in Natural Language Conversation.
US patent filed, January 2016 (pending).
o Edited proceedings
[SumPre 2016] Andreas Thalhammer, Gong Cheng, Kalpa Gunaratna. Proceedings of the 2nd International Workshop on
Summarizing and Presenting Entities and Ontologies (SumPre 2016) co-located with the 13th Extended Semantic Web
Conference (ESWC 2016), Greece, May 30, 2016. CEUR Workshop Proceedings 1605, CEUR-WS.org 2016.
[SumPre 2015] Gong Cheng, Kalpa Gunaratna, Andreas Thalhammer, Heiko Paulheim, Martin Voigt, Roberto García. Joint
Proceedings of the 1st International Workshop on Summarizing and Presenting Entities and Ontologies and the 3rd
International Workshop on Human Semantic Web Interfaces (SumPre 2015, HSWI 2015) co-located with the 12th Extended
Semantic Web Conference (ESWC 2015), Portoroz, Slovenia, June 1, 2015. CEUR Workshop Proceedings 1556, CEUR-WS.org
2016.
56
Selected publications
o Top-tier conference publications (AAAI-2015, ESWC-2016 , and
WWW-2017).
o Research internships at well-known places (INSIGHT-Ireland,
NLM-USA, IBM-USA).
o Co-chairing and organizing workshops at international
conferences (SumPre2015 and SumPre2016 at ESWC).
o PC member (e.g., ISWC15 and ESWC16) and W3C working
group member (LDP14).
o Competition winner (IBM Blockchain Hackathon runner-up,
National Best Quality Software award finalist).
o Travel and professional development grants (AAAI, WS-GSA).
o US patent application (filed with IBM).
57
Selected accomplishments
58
Acknowledgements
Prof. Amit Sheth
(Advisor)
Prof. Krishnaprasad Thirunarayan
(Advisor)
Dr. Edward Curry
NUIG, Ireland
Prof. Gong Cheng
Nanjing University, China
Dr. Hamid R. Motahari Nezhad
IBM Research, USA
Prof. Keke Chen
59
Thank You
Dr. Olivier Bodenreider
NLM, USA
Dr. Ajith Ranabahu
Amazon, USA
Dr. Gamini Palihawadana
My Family for always
encouraging me …
My colleagues
at Kno.e.sis …
60
Thank You
http://knoesis.wright.edu/researchers/kalpa
kalpa@knoesis.org
Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing
Wright State University, Dayton, Ohio, USA
Questions ?
All trademarks, logos, and images used in this presentation belong to their respective owners.

Discoverability and Web-Enabled Science - #ScholarAfricaDiscoverability and Web-Enabled Science - #ScholarAfrica
Discoverability and Web-Enabled Science - #ScholarAfricaKaitlin Thaney
 
Collaboratively creating a network of ideas, data and software
Collaboratively creating a network of ideas, data and softwareCollaboratively creating a network of ideas, data and software
Collaboratively creating a network of ideas, data and softwareAnita de Waard
 
Richardson Reg102010
Richardson Reg102010Richardson Reg102010
Richardson Reg102010Barb Jansen
 
The big 6 presentation yr 6
The big 6 presentation yr 6 The big 6 presentation yr 6
The big 6 presentation yr 6 lisluandaprimary
 
Stewardship and long term preservation of earth science data
Stewardship and long term preservation of earth science dataStewardship and long term preservation of earth science data
Stewardship and long term preservation of earth science dataNancy Hoebelheinrich
 
Managing Metadata for Science and Technology Studies: the RISIS case
Managing Metadata for Science and Technology Studies: the RISIS caseManaging Metadata for Science and Technology Studies: the RISIS case
Managing Metadata for Science and Technology Studies: the RISIS caseRinke Hoekstra
 
Linked Open Data (LOD) part 1
Linked Open Data (LOD) part 1Linked Open Data (LOD) part 1
Linked Open Data (LOD) part 1IPLODProject
 
Lecture: Semantic Word Clouds
Lecture: Semantic Word CloudsLecture: Semantic Word Clouds
Lecture: Semantic Word CloudsMarina Santini
 
Project Proposal Topics Modeling (Ir)
Project Proposal    Topics Modeling (Ir)Project Proposal    Topics Modeling (Ir)
Project Proposal Topics Modeling (Ir)Svitlana volkova
 

Similar a Semantics based Summarization of Entities in Knowledge Graphs (20)

Knowledge Graph Maintenance
Knowledge Graph MaintenanceKnowledge Graph Maintenance
Knowledge Graph Maintenance
 
PhD Day: Entity Linking using Generic Linked Data Datasets
PhD Day: Entity Linking using Generic Linked Data DatasetsPhD Day: Entity Linking using Generic Linked Data Datasets
PhD Day: Entity Linking using Generic Linked Data Datasets
 
Gleaning Types for Literals in RDF with Application to Entity Summarization
Gleaning Types for Literals in RDF with Application to Entity SummarizationGleaning Types for Literals in RDF with Application to Entity Summarization
Gleaning Types for Literals in RDF with Application to Entity Summarization
 
#SEOisAEO: How to get Entities and Their Attributes in the Knowledge Graph
#SEOisAEO: How to get Entities and Their Attributes in the Knowledge Graph#SEOisAEO: How to get Entities and Their Attributes in the Knowledge Graph
#SEOisAEO: How to get Entities and Their Attributes in the Knowledge Graph
 
Reading Group 2013 (DERI NUIG)
Reading Group 2013 (DERI NUIG)Reading Group 2013 (DERI NUIG)
Reading Group 2013 (DERI NUIG)
 
[ENCORE webinar] Artificial Intelligence for mapping skills of the future
[ENCORE webinar] Artificial Intelligence for mapping skills of the future[ENCORE webinar] Artificial Intelligence for mapping skills of the future
[ENCORE webinar] Artificial Intelligence for mapping skills of the future
 
Artificial Intelligence and Human Expertise to Foresee Green, Digital and Ent...
Artificial Intelligence and Human Expertise to Foresee Green, Digital and Ent...Artificial Intelligence and Human Expertise to Foresee Green, Digital and Ent...
Artificial Intelligence and Human Expertise to Foresee Green, Digital and Ent...
 
Developing Cognitive Systems to Support Team Cognition
Developing Cognitive Systems to Support Team CognitionDeveloping Cognitive Systems to Support Team Cognition
Developing Cognitive Systems to Support Team Cognition
 
The Architecture of Understanding
The Architecture of UnderstandingThe Architecture of Understanding
The Architecture of Understanding
 
Discoverability and Web-Enabled Science - #ScholarAfrica
Discoverability and Web-Enabled Science - #ScholarAfricaDiscoverability and Web-Enabled Science - #ScholarAfrica
Discoverability and Web-Enabled Science - #ScholarAfrica
 
Collaboratively creating a network of ideas, data and software
Collaboratively creating a network of ideas, data and softwareCollaboratively creating a network of ideas, data and software
Collaboratively creating a network of ideas, data and software
 
Richardson Reg102010
Richardson Reg102010Richardson Reg102010
Richardson Reg102010
 
The big 6 presentation yr 6
The big 6 presentation yr 6 The big 6 presentation yr 6
The big 6 presentation yr 6
 
Stewardship and long term preservation of earth science data
Stewardship and long term preservation of earth science dataStewardship and long term preservation of earth science data
Stewardship and long term preservation of earth science data
 
Managing Metadata for Science and Technology Studies: the RISIS case
Managing Metadata for Science and Technology Studies: the RISIS caseManaging Metadata for Science and Technology Studies: the RISIS case
Managing Metadata for Science and Technology Studies: the RISIS case
 
Linked Open Data (LOD) part 1
Linked Open Data (LOD) part 1Linked Open Data (LOD) part 1
Linked Open Data (LOD) part 1
 
120918 cádiz ecer
120918 cádiz ecer120918 cádiz ecer
120918 cádiz ecer
 
Text Analytics - JCC2014 Kimelfeld
Text Analytics - JCC2014 KimelfeldText Analytics - JCC2014 Kimelfeld
Text Analytics - JCC2014 Kimelfeld
 
Lecture: Semantic Word Clouds
Lecture: Semantic Word CloudsLecture: Semantic Word Clouds
Lecture: Semantic Word Clouds
 
Project Proposal Topics Modeling (Ir)
Project Proposal    Topics Modeling (Ir)Project Proposal    Topics Modeling (Ir)
Project Proposal Topics Modeling (Ir)
 

Último

Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...apidays
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusZilliz
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 

Último (20)

Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 

Semantics based Summarization of Entities in Knowledge Graphs

  • 1. Semantics-based Summarization of Entities in Knowledge Graphs Kalpa Gunaratna Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis) Wright State University 04.19.2017 Advisors: Prof. Amit Sheth and Prof. Krishnaprasad Thirunarayan Ph.D Committee: Prof. Keke Chen (Kno.e.sis), Prof. Gong Cheng (Nanjing University, China), Dr. Edward Curry (NUIG, Ireland), Dr. Hamid R. Motahari-Nezhad (IBM Research, USA) PhD Dissertation Defense
  • 2. 1. Knowledge on the Web and concise presentation 2. Diversity-aware entity summarization - Using hierarchical conceptual grouping. 3. Enriching knowledge graphs and entity summarization - Add type semantics to literals and adapt them in summarization. 4. Relatedness-based multi-entity summarization - Using quadratic multidimensional optimization techniques. 2 Talk overview
  • 3. What are triples? A triple has the form Subject Predicate Object, for example dbr:Marie_Curie rdf:type dbo:Person and dbr:Marie_Curie dbo:spouse dbr:Pierre_Curie. RDF/Turtle syntax of the triples: @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . @prefix dbr: <http://dbpedia.org/resource/> . @prefix dbo: <http://dbpedia.org/ontology/> . dbr:Marie_Curie rdf:type dbo:Person . dbr:Marie_Curie dbo:spouse dbr:Pierre_Curie .
  • 4. Open Datasets – Linked Open Data (LOD) cloud, shown for 2007, 2010, 2014, and 2017. Image credits: http://lod-cloud.net/
  • 6. Entity summarization. Knowledge Graph: millions of entities and billions of facts. Entity Description (property: value): spouse: Pierre_Curie; birthPlace: Warsaw; almaMater: ESPCI_ParisTech; workInstitutions: University_of_Paris; knownFor: Radioactivity; …… Each property-value pair corresponds to a triple, e.g. dbr:Marie_Curie dbp:spouse dbr:Pierre_Curie. Summary for: • Quick understanding • Performing specific tasks
  • 7. Entities and summaries in early 2000s. Rich Media Reference Page: Baltimore 31, Pit 24 (http://www.nfl.com). Quandry Ismail and Tony Banks hook up for their third long touchdown, this time on a 76-yarder to extend the Raven’s lead to 31-24 in the third quarter. League: Professional. Teams: Ravens, Steelers. Score: Bal 31, Pit 24. Players: Quandry Ismail, Tony Banks. Event: Touchdown. Produced by: NFL.com. Posted date: 2/02/2000. Patent: Sheth, Amit, David Avant, and Clemens Bertram. "System and method for creating a semantic web and its applications in browsing, searching, profiling, personalization and advertising." U.S. Patent 6,311,194, issued October 30, 2001. Publication: Sheth, Amit, Clemens Bertram, David Avant, Brian Hammond, Krys Kochut, and Yashodhan Warke. "Managing semantic content for the Web." IEEE Internet Computing 6, no. 4 (2002): 80-87. Taalee’s Rich Media Reference. Talk: Semantic Web and Information Brokering: Opportunities, Early Commercialization, and Challenges. Keynote at Workshop on Semantic Web: Models, Architectures and Management. Lisbon, Portugal, Sept. 21, 2000.
  • 8. 8 Entities and summaries – Today’s Web Search Google Knowledge Graph (GKG) facilitates Google Search. Summarization is one of their top priorities*. * Singhal, A. 2012. Introducing the knowledge graph: things, not strings. Official Google Blog, May.
  • 9. 9 Semantic Expansion Semantic Enrichment Ranking Semantic Relatedness Conceptual Grouping Ranking and Summary Creation Single Entity Description FACES Approach User applications and engagements Schema Knowledge Lexical Database Multiple Entity Descriptions Combinatorial Optimization & Multi-entity Summary Creation
  • 10. Entity related structured data on the Web can be concisely and comprehensively summarized for efficient and convenient information presentation. This can be achieved through synergistic use of: (i) Unsupervised knowledge-based methods to conceptually group, (ii) Information Retrieval-based techniques to intuitively rank, (iii) Natural Language Processing techniques to semantically enrich structured data, and (iv) Combinatorial optimization techniques to handle relatedness of multiple entities. 10 Thesis statement
  • 11. 1. Knowledge on the Web and concise presentation 2. Diversity-aware entity summarization - Using hierarchical conceptual grouping. 3. Enriching knowledge graphs and entity summarization - Add type semantics to literals and adapt them in summarization. 4. Relatedness-based multi-entity summarization - Using quadratic multidimensional optimization techniques. 11 Talk overview
  • 12. FACeted Entity Summaries – FACES. Existing approaches focus on ranking, which causes redundancy in a fixed-length summary. The FACES* approach combines ranking and grouping: FACES produces diversified summaries from ranking + grouping, and grouping adds comprehensiveness (through diversity). Example features in rank order: knownFor: Radioactivity; field: Chemistry; workInstitutions: University of Paris; spouse: Pierre Curie. *[AAAI15] Kalpa Gunaratna, Krishnaprasad Thirunarayan, and Amit Sheth. "FACES: Diversity-Aware Entity Summarization Using Incremental Hierarchical Conceptual Clustering." Twenty-Ninth AAAI Conference on Artificial Intelligence. 2015.
  • 13. Entity: Marie Curie. Feature set FS, grouped into facets (feature, property, value):
      – F1: f1 spouse Pierre_Curie
      – F2: f2 birthPlace Warsaw; f3 deathPlace Passy,_Haute-Savoie
      – F3: f4 almaMater ESPCI_ParisTech; f5 workInstitutions University_of_Paris; f6 knownFor Radioactivity; f7 field Chemistry
    A concise and comprehensive summary could be {f1, f2, f6}. A non-faceted summary: {f4, f7, f5}. [Figure: Marie Curie linked to Pierre Curie, Warsaw, Passy,_Haute-Savoie, ESPCI_ParisTech, University_of_Paris, Radioactivity, and Chemistry.]
  • 14. Grouping (clustering) in FACES o The number of groups in a feature set is unknown a priori. – Hence, supervised techniques do not work. o We want to identify conceptually similar groups. – E.g., field: chemistry and almaMater: ESPCI_ParisTech should be in the same group. o We adapt Cobweb* to get facets, which is: – Conceptual – Incremental – Hierarchical * Fisher, D. H. 1987. Knowledge acquisition via incremental conceptual clustering. Machine learning 2, 2, 139-172.
  • 15. How to use Cobweb in FACES o Group similarly themed features (i.e., conceptually similar). o Each feature has only two attribute-value pairs. – For example, for birthPlace-Honolulu the pairs are (property, birthPlace) and (value, Honolulu). o Expand the property and value of each feature. o Use the expanded feature set for clustering.
  • 16. Property expansion o Get the property label. o Pre-process – Remove stop words – CamelCase, spaces, punctuation processing – Tokenize o Get hypernyms for the tokens using a lexical database (e.g., WordNet). o Add hypernyms to the original set of tokens + label and create the WordSet WS. Example: the property birthPlace expands to the WordSet {birthPlace, birth, place, beginning, point, area, locality}.
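The property expansion steps above can be sketched roughly as follows. This is a minimal illustration, not the FACES implementation: `TOY_HYPERNYMS` is a hypothetical stand-in for a lexical database such as WordNet, filled in by hand to mirror the birthPlace example, and `STOP_WORDS`, `tokenize_label`, and `expand_property` are names made up for this sketch.

```python
import re

# Hypothetical stand-in for a lexical database such as WordNet. The entries
# mirror the birthPlace example from the slide; they are not real WordNet output.
TOY_HYPERNYMS = {
    "birth": ["beginning"],
    "place": ["point", "area", "locality"],
}
# Assumed, deliberately tiny stop-word list.
STOP_WORDS = {"of", "the", "in", "a", "an"}


def tokenize_label(label):
    """Split a camelCase, snake_case, or spaced label into lowercase tokens."""
    # Insert a space at every lower-to-upper boundary: birthPlace -> birth Place.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", label)
    tokens = re.split(r"[\s_\W]+", spaced)
    return [t.lower() for t in tokens if t and t.lower() not in STOP_WORDS]


def expand_property(label, hypernyms=TOY_HYPERNYMS):
    """Build the WordSet: original label + its tokens + hypernyms of each token."""
    word_set = [label]
    for token in tokenize_label(label):
        for word in [token] + hypernyms.get(token, []):
            if word not in word_set:
                word_set.append(word)
    return word_set


birth_place_ws = expand_property("birthPlace")
```

With a real lexical database, the dictionary lookup would be replaced by a hypernym query, and the choice of word sense and hypernym depth would need to be decided; the slide does not specify either.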
  • 17. Value expansion o Get the object URI. o Get the types (ontology classes) for the URI. o Pre-process types – Remove stop words – CamelCase, spaces, punctuation processing – Tokenize o Get hypernyms for types. o Add hypernyms to the original set of type labels + tokens to create WordSet WS. Example: the value Honolulu expands to the WordSet {place, PopulatedPlace, populated, point, area, locality}.
  • 18. WordSet examples (feature f: property expansion; value expansion; WordSet WS). On the slide, the original sets are shown in orange.
      – region:Illinois: property expansion {region, location, domain}; value expansion {place, PopulatedPlace, populated, point, area, locality}; WS {region, location, domain, PopulatedPlace, populated, place, point, area, locality}
      – birthPlace:Honolulu: property expansion {birthPlace, birth, place, beginning, point, area, locality}; value expansion {place, PopulatedPlace, populated, point, area, locality}; WS {birthPlace, birth, place, beginning, point, area, locality, PopulatedPlace, populated}
      – vicePresident:Joe_Biden: property expansion {vicePresident, vice, president, corporate executive, head of state}; value expansion {person, OfficeHolder, office, holder, organism, flesh, human body, occupation, job, staff, possessor, owner}; WS {vicePresident, vice, president, corporate executive, head of state, person, OfficeHolder, office, holder, organism, flesh, human body, occupation, job, staff, possessor, owner}
      – predecessor:George_W._Bush: property expansion {predecessor, forerunner, precursor}; value expansion {person, officeholder, office, holder, organism, flesh, human body, occupation, job, staff, possessor, owner}; WS {predecessor, forerunner, precursor, person, officeholder, office, holder, organism, flesh, human body, occupation, job, staff, possessor, owner}
  • 19. WordSet – How it helps for better grouping. The WordSets of region:Illinois and birthPlace:Honolulu share several terms, while vicePresident:Joe Biden shares none with them:
      – region:Illinois: {region, location, domain, PopulatedPlace, place, point, area, locality}
      – birthPlace:Honolulu: {birthPlace, birth, place, PopulatedPlace, point, area, locality, beginning}
      – vicePresident:Joe Biden: {vicePresident, vice, president, corporate executive, head of state, person, OfficeHolder, human body, occupation, job, staff}
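To make the grouping intuition concrete, the overlap between the three WordSets on this slide can be computed directly. Note the assumption: Cobweb decides groupings with its own category-utility criterion, so the Jaccard overlap used here is only an illustrative proxy for "conceptually similar", not the measure FACES uses.

```python
# WordSets copied from the slide.
region_illinois = {
    "region", "location", "domain", "PopulatedPlace",
    "place", "point", "area", "locality",
}
birth_place_honolulu = {
    "birthPlace", "birth", "place", "PopulatedPlace",
    "point", "area", "locality", "beginning",
}
vice_president_biden = {
    "vicePresident", "vice", "president", "corporate executive",
    "head of state", "person", "OfficeHolder", "human body",
    "occupation", "job", "staff",
}


def jaccard(a, b):
    """Jaccard overlap |a & b| / |a | b| (illustrative proxy, not Cobweb's criterion)."""
    return len(a & b) / len(a | b)


shared_place_terms = region_illinois & birth_place_honolulu
place_overlap = jaccard(region_illinois, birth_place_honolulu)
cross_overlap = jaccard(region_illinois, vice_president_biden)
```

The two place-related features share five terms, whereas the office-holder feature shares none with either, which is the separation the expansion step is meant to produce.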
  • 20. 20 Ranking intuition workPlace: Washington D.C. residence: White House New York City Beavercreek Popular values Informative features
  • 21. Ranking features o Influenced by tf-idf. o Inf(f): informativeness of the feature (uniqueness). – Example: residence-WhiteHouse o Po(v): popularity of the value of the feature (frequency). – Example: WhiteHouse o Rank(f): favors features with higher informativeness and popularity. N is the total number of entities.
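The slide's equations did not survive in this transcript, so the sketch below is one tf-idf-style reading of the description, under explicit assumptions: Inf(f) is taken as log(N / number of entities that have f), Po(v) as log(number of triples whose value is v), and Rank(f) as their product. The toy entity descriptions are invented for illustration.

```python
import math
from collections import Counter


def rank_features(descriptions):
    """Score every (property, value) feature.

    descriptions maps an entity name to its set of (property, value) features.
    Assumed formulas (the slide's equations are not in the transcript):
      Inf(f)  = log(N / number of entities having f)
      Po(v)   = log(number of triples whose value is v)
      Rank(f) = Inf(f) * Po(v)
    """
    n_entities = len(descriptions)
    feature_count = Counter()  # entities per feature
    value_count = Counter()    # triples per value
    for features in descriptions.values():
        for feature in features:
            feature_count[feature] += 1
            value_count[feature[1]] += 1
    scores = {}
    for feature, count in feature_count.items():
        informativeness = math.log(n_entities / count)
        popularity = math.log(value_count[feature[1]])
        scores[feature] = informativeness * popularity
    return scores


# Invented toy data: four entities with a handful of features each.
toy_descriptions = {
    "e1": {("residence", "White_House"), ("birthPlace", "Honolulu")},
    "e2": {("workPlace", "White_House"), ("birthPlace", "Honolulu")},
    "e3": {("location", "White_House"), ("birthPlace", "Chicago")},
    "e4": {("birthPlace", "Chicago")},
}
toy_scores = rank_features(toy_descriptions)
```

In this toy data, residence-White_House is both rare as a feature and attached to a frequently used value, so it scores above birthPlace-Honolulu, matching the intuition on the preceding slide.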
  • 22. 22 Faceted entity summary creation process (1) (2) (3) (4) (5) Semantic expansion Cobweb tf-idf based
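The slide shows the process only as a five-step figure, so the selection step sketched here is an assumption: rank features inside each facet, then take the best remaining feature from each facet in turn until the summary reaches length k. The facets are those of the Marie Curie example; the scores are hypothetical values chosen for illustration.

```python
def faceted_summary(facets, scores, k):
    """Pick up to k features, cycling over facets so each facet is represented.

    facets: list of lists of feature ids; scores: feature id -> rank score.
    Assumed selection rule: round-robin over facets, best-scored feature first.
    """
    ranked = [
        sorted(facet, key=lambda f: scores[f], reverse=True)
        for facet in facets
        if facet
    ]
    # Visit facets in order of their best feature.
    ranked.sort(key=lambda facet: scores[facet[0]], reverse=True)
    summary = []
    depth = 0
    while len(summary) < k and any(depth < len(facet) for facet in ranked):
        for facet in ranked:
            if depth < len(facet) and len(summary) < k:
                summary.append(facet[depth])
        depth += 1
    return summary


# Facets F1, F2, F3 from the Marie Curie slide; scores are hypothetical.
curie_facets = [["f1"], ["f2", "f3"], ["f4", "f5", "f6", "f7"]]
curie_scores = {"f1": 0.9, "f2": 0.8, "f3": 0.3, "f4": 0.5, "f5": 0.4, "f6": 0.7, "f7": 0.6}
summary_3 = faceted_summary(curie_facets, curie_scores, 3)
summary_5 = faceted_summary(curie_facets, curie_scores, 5)
```

With these scores a length-3 summary takes one feature per facet, which is how a faceted summary avoids the redundancy of pure top-k ranking.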
  • 23. Evaluation o The gold standard contains ideal summaries generated by 15 judges. o An ideal summary for entity e is denoted by SummI. Agreement is then the overlap between ideal summaries. o Summary quality is the overlap between the computer-generated summary (Summ(e)) and the ideal summaries for the entity.
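The two measures can be sketched as set overlaps. The exact equations are not in the transcript, so the averaging used here is an assumption: agreement is the mean pairwise overlap among the ideal summaries, and quality is the mean overlap between the system summary and each ideal summary. The three ideal summaries are invented.

```python
from itertools import combinations


def agreement(ideal_summaries):
    """Mean pairwise overlap |S_i & S_j| among ideal summaries (assumed form)."""
    pairs = list(combinations(ideal_summaries, 2))
    return sum(len(a & b) for a, b in pairs) / len(pairs)


def quality(system_summary, ideal_summaries):
    """Mean overlap between the system summary and each ideal summary (assumed form)."""
    return sum(len(system_summary & ideal) for ideal in ideal_summaries) / len(ideal_summaries)


# Invented ideal summaries of length 3 from three hypothetical judges.
ideals = [{"f1", "f2", "f6"}, {"f1", "f2", "f7"}, {"f1", "f3", "f6"}]
toy_agreement = agreement(ideals)
faceted_quality = quality({"f1", "f2", "f6"}, ideals)
non_faceted_quality = quality({"f4", "f7", "f5"}, ideals)
```

Under this reading both values are bounded by the summary length, which is consistent with the reported averages staying below 5 and 10 for the two summary lengths.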
  • 24. Evaluation cont. • 50 entities in the gold standard. 69 users participated in the user preference evaluation. • On average, 44 features per entity. Evaluation 1 is the gold standard evaluation; Evaluation 2 is the user preference evaluation.
      – Columns: System: avg. quality (summary length = 5), FACES % gain, time/entity, avg. quality (summary length = 10), FACES % gain, user preference Study 1, user preference Study 2
      – FACES: 1.4314, NA, 0.76 sec, 4.3350, NA, 84 %, 54 %
      – RELIN: 0.4981, 187 %, 10.96 sec, 2.5188, 72 %, NA, NA
      – RELINM: 0.6008, 138 %, 11.08 sec, 3.0906, 40 %, 16 %, 16 %
      – SUMMARUM: 1.2249, 17 %, NA, 3.4207, 27 %, NA, 30 %
      – Avg. agreement: 1.9168 (summary length = 5), 4.6415 (summary length = 10)
  • 25. Example summaries, with the share of user preference for each system:
      – FACES (54 %): birthPlace: Warsaw; workInstitutions: University of Paris; field: Physics; spouse: Pierre Curie; deathPlace: Passy, Haute-Savoie
      – RELINM (16 %): isPrimaryTopicOf: Marie_Curie; wasDerivedFrom: oldid=547107936; knownFor: Polonium; almaMater: ESPCI; deathPlace: Passy, Haute-Savoie
      – SUMMARUM (30 %): birthPlace: Poland; birthPlace: Warsaw; birthPlace: Russian_Empire; field: Physics; field: Chemistry
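The "FACES % Gain" column is consistent with the relative improvement of FACES over each baseline, (FACES − baseline) / baseline. The slide does not state the formula, so this reading is an assumption; the check below recomputes the column from the reported average quality values.

```python
def percent_gain(faces_quality, baseline_quality):
    """Relative improvement of FACES over a baseline, in percent (assumed formula)."""
    return 100.0 * (faces_quality - baseline_quality) / baseline_quality


# Average quality values as reported on the slide.
FACES_QUALITY = {5: 1.4314, 10: 4.3350}
BASELINE_QUALITY = {
    "RELIN": {5: 0.4981, 10: 2.5188},
    "RELINM": {5: 0.6008, 10: 3.0906},
    "SUMMARUM": {5: 1.2249, 10: 3.4207},
}

gains = {
    system: {k: round(percent_gain(FACES_QUALITY[k], q)) for k, q in by_length.items()}
    for system, by_length in BASELINE_QUALITY.items()
}
```

Rounded to whole percentages, the recomputed gains match the values shown on the slide for both summary lengths.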
  • 26. 1. Knowledge on the Web and concise presentation 2. Diversity-aware entity summarization - Using hierarchical conceptual grouping. 3. Enriching knowledge graphs and entity summarization - Add type semantics to literals and adapt them in summarization. 4. Relatedness-based multi-entity summarization - Using quadratic multidimensional optimization techniques. 26 Talk overview
  • 27. Typing literals (enriching) in knowledge graphs o A lot of information is encoded in literal format. – 1608 datatype properties (literal based) vs. 1103 object properties (entity based) in DBpedia (2016-04). o Many literals can be easily typed for proper interpretation and use. – Example: in DBpedia, http://dbpedia.org/property/location has ~1,00,000 unique literals that can be directly mapped to entities. o The added semantics is useful in practical applications such as summarization, property alignment, data integration, and dataset profiling.
  • 28. o FACES can only handle object property based features. o Our contributions*: 1. Compute types for the values of datatype property based features (data enrichment) - novel contribution. 2. Adapt and improve ranking algorithms (summarization). 28 Enrichment for entity summarization *[ESWC16] Kalpa Gunaratna, Krishnaprasad Thirunarayan, Amit Sheth, and Gong Cheng. 'Gleaning Types for Literals in RDF Triples with Application to Entity Summarization'. In Proc. 13th Extended Semantic Web Conference (ESWC 2016), 2016, pages 85-100. Barack Obama Person type “Michelle Obama” type String FACES Partitioning
  • 29. o Focus of the literal is not clear unlike URIs. o May contain several entities or labels matching ontology classes. 29 Challenges 44th President of the United States option 1 option 2 option 3
  • 30. 30 Enrichment outcomes dbr:Barack_Obama dbo:Politician “44th President of the United States”^^xsd:string dbr:Joe_Biden dbr:Barack_Obama dbp:short Description dbr:Calvin_Coolidge “48th Governor of Massachusetts”^^xsd:string dbo:orderInOffice dbo:Politician dbo:President dbo:Governor dbp:vicePresident rdf:type
  • 31. 31 Process flow N-grams Extractor Head-word Detector Entity Spotter Phrase Identifier Primary Type Filter N-grams + Head-word to Class Label Matcher Head-word Semantic Matcher Output types for the literal Pre-processing Type processing Head-word to Class Label Matcher Input literal
  • 32. 32 Type computation algorithm flow Extract n-grams & Find focus term Match focus term & ontology class Yes TYPE No Match n-grams, focus term & ontology class TYPE Yes No TYPE Yes No Match n-grams, focus term & entity label Get entity type Get similarity of focus term and all ontology classes TYPE For non-numeric literals Get max similarity ontology class Process focus term & ontology classes
  • 33. o Type Set TS(v) is the generated set of types for the value v. 33 Evaluation – type generation metrics n is the total number of features.
  • 34. o DBpedia Spotlight is taken as the baseline and there were 1117 unique property-value pairs (features). o 118 pairs (consisting of labelling properties and noisy features) were removed. 34 Evaluation Mean Precision (MP) Any Mean Precision (AMP) Coverage Our approach 0.8290 0.8829 0.8529 Baseline 0.4867 0.5825 0.5533
  • 36. Ranking datatype property features
      o The ranking equations of the FACES approach do not work for datatype property features.
        – Two literals can be unique even if their types and main entities are the same.
          • For example, “United States President” vs. “President of the United States”: syntactically different but semantically the same, so searching with the whole phrase is not desirable.
        – A literal can contain several entities. Which one should be chosen?
  • 37. Intuitions for ranking
      o Humans recognize popular entities.
      o Entities can be mentioned in literals with variations.
      o Proposal: use the popular entities in literals, not the literals themselves, for ranking.
      o Functions
        – ES(v) returns all entities present in the value v.
        – max(ES(v)) returns the most popular entity in ES(v).
      Example:
        v = “44th President of the United States”
        ES(v) = {db:President, db:United States}
        max(ES(v)) = db:United States
  • 38. Modified ranking equations
      Use the most frequent (popular) entity in the value when computing the informativeness of a feature and the popularity of a value.
  • 39. Facet ranking
      o Aggregate the feature ranking scores for each facet.
      o Rank facets by the aggregated scores.
      Rank(f) is the original function and Rank(f)’ is the modified one for datatype-property-based features.
  • 40. FACES-E entity summary generation
      [Figure: five-step pipeline, (1) to (5), with the stages labelled “Semantic expansion + Type computation”, “Cobweb”, and “tf-idf based”]
  • 41. Evaluation – FACES-E summary generation
      o The gold standard consists of 20 random entities used in FACES, taken from DBpedia 3.9, and 60 random entities taken from DBpedia 2015-04.
      o 17 human users created ideal summaries (900 in total).

                       Summary length = 5        Summary length = 10
      System           Avg. Quality   % Gain     Avg. Quality   % Gain
      FACES-E          1.5308         --         4.5320         --
      RELIN            0.9611         59 %       3.0988         46 %
      RELINM           1.0251         49 %       3.6514         24 %
      Avg. Agreement   2.1168                    5.4363
  • 42. Talk overview
      1. Knowledge on the Web and concise presentation
      2. Diversity-aware entity summarization
         - Using hierarchical conceptual grouping.
      3. Enriching knowledge graphs and entity summarization
         - Add type semantics to literals and adapt them in summarization.
      4. Relatedness-based multi-entity summarization
         - Using quadratic multidimensional optimization techniques.
  • 43. Single vs. multiple entity summarization
      [Figure: for the entities Apple Computer and Steve Jobs, single entity summarization balances importance and diversity within each summary; multi-entity summarization balances importance and diversity within each summary and also improves relatedness between the summaries]
  • 44. Motivating example
      Within one month of the iPod nano and iTunes phone special event, Apple Computer announced today another special event to be held on October 12. It is to be held at the California Theater in downtown San Jose, California. The invitation reads, “One more thing …”, the teasing tagline of Steve Jobs.
      Features shown:
        founders: Steve_Jobs
        product: IPod
        locationCity: California
        industry: Consumer_electronics
        after: Tim_Cook
        knownFor: Microcomputer_revolution
        title: Apple_Inc.
        birthPlace: California
  • 45. Relatedness-based multi-entity summarization
      Problem: Given a collection of entities, select features belonging to these entities so as to maximize inter-entity relatedness, intra-entity importance, and intra-entity diversity of features.
      [Figure: Entity 1 to Entity 4, each with its own summary]
  • 46. We have 3 optimization objectives
      o Maximize intra-entity feature importance.
      o Maximize intra-entity feature diversity.
      o Maximize inter-entity feature relatedness.
      Image credit: https://www.flickr.com/photos/68751915@N05/6551525739
  • 47. Formalizing the Quadratic Multidimensional Knapsack Problem (QMKP)
      o The variable x denotes whether a feature is selected or not.
      o We want to maximize the profit subject to the size of each knapsack.
      [Figure: Entity 1 to Entity 4, each with its own summary (knapsack)]
  • 48. 1. Importance of features
      o We measure the importance of features using the informativeness and popularity measures used in FACES.
      o Within each entity summary, features should have higher importance.
        – Hence, we use a positive weight.
  • 49. How to measure relatedness of features
      o Features consist of properties and values.
      o For properties, we use the expansion method used in FACES and calculate the Jaccard similarity.
      o For values, we measure relatedness using graph-based co-appearance, with the RDF2Vec model.
      o We combine the two measures to get the relatedness between two features.
  • 50. 2. Diversity of features within summaries
      o Each entity summary should have diverse features.
        – (i) Penalize the relatedness score with a negative weight (i.e., maximize diversity).
        – (ii) Modify candidate feature selection to improve diversity.
  • 51. 3. Relatedness of features between summaries
      o Maximize profit for related features between summaries.
      o Use a positive weight.
  • 52. GRASP – for combinatorial optimization
      o GRASP – Greedy Randomized Adaptive Search Procedure.
      o GRASP provides an approximate solution to QKP.
        – We simply use it with multiple constraints to suit QMKP.
      o We use a memory-based GRASP implementation*.
        – Construction phase
          • Random selection of features (also using a greedy ranking function).
        – Local search phase
          • Tries to improve the answer by replacing selected features.
        – Update the best solution.
      o To improve intra-entity diversity of features, we modified the Restricted Candidate List (RCL) of GRASP.
        – We use a threshold to filter related features of the same entity.
      * Yang, Zhen, Guoqing Wang, and Feng Chu. "An effective GRASP and tabu search for the 0–1 quadratic knapsack problem." Computers & Operations Research 40, no. 5 (2013): 1176-1185.
  • 53. Evaluation
      o 15 judges, 2 datasets, 30 news items, and 850 question instances.
      o Qualitative evaluation.
      o Quantitative evaluation.
  • 54. Conclusion
      o Faceted entity summarization.
        – Conceptual (abstract) grouping of features/triples.
        – tf-idf based ranking.
      o Type computation for literals to enrich knowledge graphs.
        – Improve coverage for faceted entity summarization.
      o Relatedness-based multi-entity summarization.
      Thesis statement
      Entity-related structured data on the Web can be concisely and comprehensively summarized for efficient and convenient information presentation. This can be achieved through synergistic use of:
        (i) unsupervised knowledge-based methods to conceptually group,
        (ii) Information Retrieval-based techniques to intuitively rank,
        (iii) Natural Language Processing techniques to semantically enrich structured data, and
        (iv) combinatorial optimization techniques to handle relatedness of multiple entities.
  • 55. Research areas
      Semantic Web, Linked Data & Semantic Computing
      [Figure labels: Publications, Tool, Patent]
  • 56. Selected publications
      o Conference papers
        [WWW 2017] Hamid R. Motahari Nezhad, Kalpa Gunaratna, and Juan Cappi. “eAssistant: Cognitive Assistance for Identification and Auto-Triage of Actionable Conversations.” Proceedings of the 26th International Conference on World Wide Web Companion. International World Wide Web Conferences Steering Committee, 2017.
        [ESWC 2016] Kalpa Gunaratna, Krishnaprasad Thirunarayan, Amit Sheth, and Gong Cheng. “Gleaning Types for Literals in RDF Triples with Application to Entity Summarization”. In Proc. 13th Extended Semantic Web Conference (ESWC 2016), 2016, pages 85-100. DOI=10.1007/978-3-319-34129-3_6
        [AAAI 2015] Kalpa Gunaratna, Krishnaprasad Thirunarayan, and Amit Sheth. “FACES: Diversity-Aware Entity Summarization using Incremental Hierarchical Conceptual Clustering”. 29th AAAI Conference on Artificial Intelligence (AAAI 2015), 2015.
        [Semantics 2013] Kalpa Gunaratna, Krishnaprasad Thirunarayan, Prateek Jain, Amit Sheth, and Sanjaya Wijeratne. “A Statistical and Schema Independent Approach for Identifying Equivalent Properties on Linked Data.” In Proc. 9th International Conference on Semantic Systems, ACM, 2013, pages 33-40. DOI=10.1145/2506182.2506187
      o Articles
        [W 2014] Kalpa Gunaratna, Sarasi Lalithsena, and Amit Sheth. “Alignment and Dataset Identification of Linked Data in Semantic Web.” Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery (2014).
      o Patents
        [P 2015] Kalpa Gunaratna, Hamid Motahari. Adaptive Learning of Actionable Statements in Natural Language Conversation. US patent filed, January 2016 (pending).
      o Edited proceedings
        [SumPre 2016] Andreas Thalhammer, Gong Cheng, Kalpa Gunaratna. Proceedings of the 2nd International Workshop on Summarizing and Presenting Entities and Ontologies (SumPre 2016) co-located with the 13th Extended Semantic Web Conference (ESWC 2016), Greece, May 30, 2016. CEUR Workshop Proceedings 1605, CEUR-WS.org 2016.
        [SumPre 2015] Gong Cheng, Kalpa Gunaratna, Andreas Thalhammer, Heiko Paulheim, Martin Voigt, Roberto García. Joint Proceedings of the 1st International Workshop on Summarizing and Presenting Entities and Ontologies and the 3rd International Workshop on Human Semantic Web Interfaces (SumPre 2015, HSWI 2015) co-located with the 12th Extended Semantic Web Conference (ESWC 2015), Portoroz, Slovenia, June 1, 2015. CEUR Workshop Proceedings 1556, CEUR-WS.org 2016.
  • 57. Selected accomplishments
      o Top-tier conference publications (AAAI-2015, ESWC-2016, and WWW-2017).
      o Research internships at well-known places (INSIGHT-Ireland, NLM-USA, IBM-USA).
      o Co-chairing and organizing workshops at international conferences (SumPre2015 and SumPre2016 at ESWC).
      o PC member (e.g., ISWC15 and ESWC16) and W3C working group member (LDP14).
      o Competition winner (IBM Blockchain Hackathon runner-up, National Best Quality Software award finalist).
      o Travel and professional development grants (AAAI, WS-GSA).
      o US patent application (filed with IBM).
  • 58. Acknowledgements
      Prof. Amit Sheth (Advisor)
      Prof. Krishnaprasad Thirunarayan (Advisor)
      Prof. Keke Chen
      Prof. Gong Cheng, Nanjing University, China
      Dr. Edward Curry, NUIG, Ireland
      Dr. Hamid R. Motahari Nezhad, IBM Research, USA
  • 59. Thank You
      Dr. Olivier Bodenreider, NLM, USA
      Dr. Ajith Ranabahu, Amazon, USA
      Dr. Gamini Palihawadana
      My family, for always encouraging me …
      My colleagues at Kno.e.sis …
  • 60. Thank You
      Questions?
      http://knoesis.wright.edu/researchers/kalpa
      kalpa@knoesis.org
      Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing, Wright State University, Dayton, Ohio, USA
      All trademarks, logos, and images used in this presentation belong to their respective owners.
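
The type computation flow on slides 31 and 32 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation behind the slides: the class labels and entity types are hypothetical toy dictionaries standing in for the DBpedia ontology and entity labels, `focus_term` is a crude stand-in for a real head-word detector, and the final similarity step uses character-level similarity from `difflib` instead of a semantic matching service.

```python
from difflib import SequenceMatcher

# Hypothetical toy resources; a real run would use DBpedia classes and entity labels.
CLASS_LABELS = {"president": "dbo:President", "governor": "dbo:Governor",
                "politician": "dbo:Politician"}
ENTITY_TYPES = {"white house": ["dbo:Building"]}
PREPOSITIONS = {"of", "in", "at", "for", "to", "on"}


def ngrams(tokens, max_n=3):
    """All word n-grams up to max_n words, longest first."""
    return [" ".join(tokens[i:i + n])
            for n in range(min(max_n, len(tokens)), 0, -1)
            for i in range(len(tokens) - n + 1)]


def focus_term(tokens):
    """Assumed heuristic: the word before the first preposition, else the last word."""
    for i, tok in enumerate(tokens):
        if tok in PREPOSITIONS and i > 0:
            return tokens[i - 1]
    return tokens[-1]


def is_numeric(text):
    try:
        float(text)
        return True
    except ValueError:
        return False


def compute_types(literal, max_n=3):
    """Return a list of types for a literal, following the four-step cascade."""
    if not literal.strip() or is_numeric(literal):
        return []  # the flow only covers non-numeric literals
    tokens = literal.lower().split()
    focus = focus_term(tokens)
    # Step 1: the focus term directly matches an ontology class label.
    if focus in CLASS_LABELS:
        return [CLASS_LABELS[focus]]
    grams = [g for g in ngrams(tokens, max_n) if focus in g.split()]
    # Step 2: an n-gram containing the focus term matches a class label.
    for gram in grams:
        if gram in CLASS_LABELS:
            return [CLASS_LABELS[gram]]
    # Step 3: an n-gram matches an entity label; take that entity's types.
    for gram in grams:
        if gram in ENTITY_TYPES:
            return list(ENTITY_TYPES[gram])
    # Step 4: fall back to the class most similar to the focus term
    # (string similarity here; the slides use a semantic matcher).
    best = max(CLASS_LABELS,
               key=lambda label: SequenceMatcher(None, focus, label).ratio())
    return [CLASS_LABELS[best]]
```

For example, `compute_types("44th President of the United States")` stops at step 1 because the focus term "president" is itself a class label, while `compute_types("White House")` only succeeds at step 3 through the toy entity dictionary.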

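The ranking intuition on slides 37 and 38 (rank by the most popular entity in a literal, not by the literal string) can be illustrated with a small sketch. Several things here are assumptions: the tf-idf style forms Inf = log(N / number of matching entities) and Po = log(number of matching triples), their product as the rank, the toy triples, and the popularity counts. Entity spotting is assumed to have been done already, so each triple carries the entities found in its literal.

```python
import math

# Hypothetical toy data: (entity, property, literal, entities spotted in the literal).
TRIPLES = [
    ("Barack_Obama", "shortDescription",
     "44th President of the United States", ["President", "United_States"]),
    ("George_W_Bush", "shortDescription",
     "43rd President of the United States", ["President", "United_States"]),
    ("Calvin_Coolidge", "shortDescription",
     "48th Governor of Massachusetts", ["Governor", "Massachusetts"]),
]
# Hypothetical popularity counts for the spotted entities.
POPULARITY = {"United_States": 500, "President": 120,
              "Massachusetts": 80, "Governor": 60}


def most_popular(entities):
    """max(ES(v)): the most popular entity mentioned in the value."""
    return max(entities, key=lambda e: POPULARITY.get(e, 0))


def rank(prop, spotted, triples=TRIPLES):
    """Modified rank of a datatype-property feature (assumed form Inf' * Po')."""
    top = most_popular(spotted)
    all_entities = {s for s, _, _, _ in triples}
    # Inf(f)': entities with the same property whose value contains the popular entity.
    having = {s for s, p, _, es in triples if p == prop and top in es}
    if not having:
        return 0.0
    informativeness = math.log(len(all_entities) / len(having))
    # Po(v)': triples whose value mentions the popular entity.
    popularity = math.log(sum(1 for _, _, _, es in triples if top in es))
    return informativeness * popularity


obama = rank("shortDescription", ["President", "United_States"])
bush = rank("shortDescription", ["United_States", "President"])
```

The two presidential literals differ as strings, yet they receive the same score because both resolve to the same most popular entity, which is the behaviour the slides argue for.
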
Editor's notes

  1. The first triple is at the schema level and the second is at the data level.
  2. 1146 datasets as of January 26, 2017.
  3. Facebook social graph; IBM Watson knowledge graphs (health); Amazon product graph.
  4. Datasets contain mere data, without much processing or semantic enhancement, whereas knowledge graphs provide more semantics and knowledge. Entity – a real-world thing (e.g., person, book, place) at the data level that encapsulates facts and is represented by a URI. Knowledge graph – a collection of facts and rules that can also provide semantics.
  5. Rich Media Reference, created using “WorldModel” (an ontology) and “Knowledgebase” (data extracted from text).
  6. Grows over time: DBpedia (3.9) has around 200 triples on average per entity. 3.9: 4 million entities; 2014: 4.5 million entities; 2015: 4.6 million entities; 2017: 6 million entities and 1.3 billion facts. The number of facts continues to grow, hence the need for concise presentation.
  7. Grouping is challenging because # of groups for each entity is unknown
  8. Conceptually similar features are colored in the same color.
  9. Conceptual – uses probability-based grouping. Incremental – special operators make it insensitive to the order of items. Hierarchical – groups items in a tree structure. Cobweb uses probability in grouping facts and is hence described as “conceptual”. Fisher mentions that this is similar to what humans do when grouping: pick the most probable group.
  10. We also need groups that agree at the concept level, not at the lexical level.
  11. Not all expansions are shown, for clarity.
  12. Without WordSet and with WordSet
  13. Values are popular. More identifiable to humans. Property-value pairs are unique. Can distinctly identify the entity. Want to get a balance of both.
  14. When k > |F(e)|, choosing which facets to pick the additional features from is random and not principled. Steps: extract the features for the entity e. Enrich each feature and get the WordSet WS(f). The enriched feature set FS(e) is input to the partitioning algorithm, which produces the facet set F(e). Get the feature ranking scores (Rank(f)) for each facet. The top-ranked features from the facets are picked to form the faceted entity summary. The constraints in the definition of the faceted entity summary hold.
  15. Dataset-dependent features such as owl:sameAs, wordnet_type, Wikipedia links, dcterms:subject, rdf:type, … were removed.
  16. All three methods are automatic.
  17. 1600 vs 1079 in DBpedia 2015-04
  18. (i) the creator was unable to find a suitable entity URI for the object value, and hence chose to use a literal instead, (ii) the creator of the triple did not want to attach more details to the value and hence represented it in plain text, (iii) the value contains only basic implementation types like integer, boolean, and date, and hence not meaningful to create an entity, or (iv) the value has a lengthy description spanning several sentences (e.g., dbo:abstract property in DBpedia) that covers a diverse set of entities and facts.
  19. The literal can be long. In this work, we focus on literals that are one sentence long.
  20. Head-word detection – Colin’s Head Word Detection algorithm. First, directly match the head word to a class. Next, match the n-grams and head word to a class label; or else, match entities to the n-grams and head word and then get their types. Finally, use the semantic matcher on the head word, via the UMBC matching service.
  21. n is bound to 3 in our DBpedia experiment. For non-numeric strings, extract the n-grams and get the focus term for the phrase. (1) Check for a match between the focus term and an ontology class; if found, success. (2) Analyze the n-grams that contain the focus term (maximal match): check for a match between an n-gram and an ontology class; if found, success. (3) Otherwise, check for a match between an n-gram and an entity label, then get the types of the entity; success. (4) Finally, compute similarity scores between the focus term and all the ontology classes and take the ontology class with the highest similarity score; success.
  22. Our finding is: “Typing needs to be handled carefully.” Recall is not measured because it is hard to do so (it would require checking very many pairs).
  23. Inf(f)’ – count # of entities having the feature. Property should match but value has to contain the popular entity of the input feature’s value. Po(v)’ – count the number of triples that have the matching feature with most popular entity of the value.
  24. Extract the features for the entity e. Enrich each feature and get the WordSet WS(f). The enriched feature set FS(e) is input to the partitioning algorithm, which produces the facet set F(e). First get the feature ranking scores (R(f)) and then compute the facet ranking score for each facet (FacetRank(F(e))). The top-ranked features from the top-ranked facets, in order, are picked to form the faceted entity summary. The constraints in the definition of the faceted entity summary hold.
  25. We can add inter-entity summary relatedness. This is not easy to achieve as we have to process multiple entities at the same time.
  26. Motivating example to show what we want to achieve in a multi-entity summarization scenario (main focus is for relatedness between summaries).
  27. Image icons from Google search (free to use)
  28. We consider weights of the features to be uniform in this case (= 1)
  29. A negatively weighted score reduces the profit when related features are selected, and hence discourages related features from being selected together.
  30. GRASP uses a pairwise profit matrix. We filter out of the RCL the features that score above the max function, which keeps intra-entity relatedness low.
  31. UCI: word co-occurrence (w refers to words in the equation). UMass: counts the number of documents containing both words (D refers to documents in the equation).
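
The three objectives and the GRASP procedure from slides 46 to 52 (see also notes 28 to 30) can be sketched as below. This is a plain GRASP with a swap-based local search, not the memory-based variant cited on slide 52, and it omits the threshold filter on the RCL. Feature weights are uniform (1 per feature, as in note 28), so each knapsack simply holds k features. The importance and relatedness numbers are made up, and assigning the first four features of slide 44 to Apple Computer and the last four to Steve Jobs is an assumption about the figure.

```python
import random


def objective(selected, importance, related, owner, w_imp=1.0, w_div=1.0, w_rel=1.0):
    """Importance profit plus pairwise profit: related pairs inside one entity are
    penalised (diversity), related pairs across entities are rewarded (relatedness)."""
    items = sorted(selected)
    total = w_imp * sum(importance[f] for f in items)
    for i, f in enumerate(items):
        for g in items[i + 1:]:
            r = related.get(frozenset((f, g)), 0.0)
            total += (-w_div if owner[f] == owner[g] else w_rel) * r
    return total


def grasp(importance, related, owner, k, iterations=30, rcl_size=2, seed=0):
    """Select up to k features per entity (uniform feature weight of 1)."""
    rng = random.Random(seed)

    def score(sel):
        return objective(sel, importance, related, owner)

    best, best_val = set(), float("-inf")
    for _ in range(iterations):
        # Construction phase: greedy ranking, random pick from the top rcl_size.
        selected = set()
        while True:
            used = {}
            for f in selected:
                used[owner[f]] = used.get(owner[f], 0) + 1
            candidates = [f for f in sorted(importance)
                          if f not in selected and used.get(owner[f], 0) < k]
            if not candidates:
                break
            candidates.sort(key=lambda f: score(selected | {f}), reverse=True)
            selected.add(rng.choice(candidates[:rcl_size]))
        # Local search phase: swap a feature for another of the same entity if it helps.
        improved = True
        while improved:
            improved = False
            current = score(selected)
            for out in sorted(selected):
                for new in sorted(importance):
                    if new in selected or owner[new] != owner[out]:
                        continue
                    trial = (selected - {out}) | {new}
                    if score(trial) > current:
                        selected, improved = trial, True
                        break
                if improved:
                    break
        # Update the best solution seen so far.
        value = score(selected)
        if value > best_val:
            best, best_val = set(selected), value
    return best, best_val


# Hypothetical toy instance built from the features on slide 44.
owner = {"founders:Steve_Jobs": "Apple", "product:IPod": "Apple",
         "locationCity:California": "Apple", "industry:Consumer_electronics": "Apple",
         "after:Tim_Cook": "Jobs", "knownFor:Microcomputer_revolution": "Jobs",
         "title:Apple_Inc.": "Jobs", "birthPlace:California": "Jobs"}
importance = {f: 1.0 for f in owner}
related = {frozenset(("founders:Steve_Jobs", "title:Apple_Inc.")): 0.9,
           frozenset(("locationCity:California", "birthPlace:California")): 0.8,
           frozenset(("product:IPod", "industry:Consumer_electronics")): 0.7}
best, value = grasp(importance, related, owner, k=2)
```

The sign of the pairwise term is the whole design: the same relatedness score acts as a penalty when both features belong to one entity and as a reward when they belong to different entities, so one quadratic objective covers both diversity and relatedness.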