Summarizing Entity Descriptions for
Effective and Efficient
Human-centered Entity Linking
Gong Cheng, Danyun Xu, Yuzhong Qu
Websoft Research Group
State Key Laboratory for Novel Software Technology
Nanjing University, China
Entity Linking (EL)
Text:
But with the release of the iPhone 6
and the 6 Plus phablet, Apple has finally
gone into big-screen territory, giving
Samsung a challenge in the category
that the company has been dominating
for some time now.
Knowledge Base:
iPhone 6
- type: Smartphone
- ...
Samsung Electronics
- type: IT Company
- ...
Apple (Inc.)
- type: Company
- type: IT company
- product: iPhone 5
- ...
Apple (Fruit)
- type: Fruit
- genus: Malus
- ...
Candidate entities for "Apple": Apple (Inc.) or Apple (Fruit)?
Human-centered EL is needed
• for defining gold standard,
• for crowdsourced EL.
entity description:
set of property-value pairs (called features)
Entity descriptions are long.
Short, extractive summaries are
adequate for human-centered EL.
Apple (Inc.)
- type: Company
- product: iPhone 5
Apple (Corps)
- type: Company
- product: Let It Be
Apple (Fruit)
- type: Fruit
Mention: "… Apple" → ?
summarizing entity descriptions → combinatorial optimization
summary of k candidate entity descriptions: k subsets of features (subject to a length limit)
Optimization goal (1)
+characterizing power, -information overlap
• Characterizing power of a feature (ch)
ch(type: IT company) < ch(product: iPhone 5)
Apple (Inc.)
Samsung Electronics
$ch(f) = -\log \dfrac{\text{number of entities having } f}{\text{number of all entities}}$
Apple (Inc.)
- type: Company
- type: IT company
- product: iPhone 5
- ...
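The characterizing power above (negative log of the fraction of entities that have a feature) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the toy knowledge base, the tuple encoding of features, and the function name `characterizing_power` are all assumptions.

```python
import math

def characterizing_power(feature, descriptions):
    """ch(f) = -log(number of entities having f / number of all entities)."""
    total = len(descriptions)
    having = sum(1 for feats in descriptions.values() if feature in feats)
    if having == 0:
        raise ValueError("feature does not occur in the knowledge base")
    return -math.log(having / total)

# Toy knowledge base (illustrative only): entity -> set of (property, value) features.
kb = {
    "Apple (Inc.)": {("type", "Company"), ("type", "IT company"), ("product", "iPhone 5")},
    "Samsung Electronics": {("type", "IT company")},
    "Apple (Fruit)": {("type", "Fruit"), ("genus", "Malus")},
    "iPhone 6": {("type", "Smartphone")},
}

ch_it = characterizing_power(("type", "IT company"), kb)      # shared by 2 of 4 entities
ch_iphone = characterizing_power(("product", "iPhone 5"), kb)  # held by 1 of 4 entities
print(ch_it < ch_iphone)
```

A feature shared by fewer entities gets a higher score, which matches ch(type: IT company) < ch(product: iPhone 5) on the slide.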
Optimization goal (1)
+characterizing power, -information overlap
• Information overlap between features (ov)
a) logical inference
entailment → maximized ov
ov(type: IT company, type: Company) = MAX
b) string/numerical similarity
ov = max{similarity between properties, similarity between values}
ov(type: IT company, product: iPhone 5) = SMALL
Apple (Inc.)
- type: Company
- type: IT company
- product: iPhone 5
- ...
Optimization goal (1)
+characterizing power, -information overlap
• Formulated as k Quadratic Knapsack Problems (QKP)
weight of a feature: length
profit of a pair of features: rewards characterizing power and penalizes information overlap
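To make the knapsack formulation concrete, here is a small greedy heuristic for one QKP instance (one candidate entity). It is only a sketch: the slides do not state which solver is used, and the profit matrix below (characterizing power on the diagonal, negative overlap penalties off the diagonal) uses made-up numbers.

```python
def greedy_qkp(weights, profit, capacity):
    """Greedy heuristic for a quadratic knapsack problem.

    weights:  item weights (here: feature lengths in characters)
    profit:   symmetric matrix; profit[i][i] is the profit of item i alone,
              profit[i][j] (i != j) the profit of selecting i and j together
    capacity: length limit of the summary
    """
    selected, used = [], 0
    remaining = set(range(len(weights)))
    while True:
        best, best_ratio = None, 0.0  # only items with a positive marginal gain qualify
        for i in sorted(remaining):
            if used + weights[i] > capacity:
                continue
            gain = profit[i][i] + sum(profit[i][j] for j in selected)
            ratio = gain / weights[i]
            if ratio > best_ratio:
                best, best_ratio = i, ratio
        if best is None:
            return selected
        selected.append(best)
        used += weights[best]
        remaining.remove(best)

features = ["type: Company", "type: IT company", "product: iPhone 5"]
weights = [len(f) for f in features]
# Hypothetical profits: diagonal ~ characterizing power, off-diagonal ~ -overlap.
profit = [
    [1.0, -3.0, -0.3],
    [-3.0, 2.0, -0.6],
    [-0.3, -0.6, 5.0],
]
chosen = greedy_qkp(weights, profit, capacity=35)
print([features[i] for i in chosen])
```

With these numbers the heuristic picks "product: iPhone 5" first and then "type: IT company"; "type: Company" no longer fits within the 35-character limit.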
Optimization goal (2): +differentiating power
• Differentiating power of a pair of features (di)
a) string/numerical dissimilarity
di = dissimilarity between values * property’s value uniqueness
di(type: IT company, type: Fruit) = LARGE*SMALL = MEDIUM
(Single-valued properties are more useful.)
b) logical inference
entailment → minimized di
di(type: IT company, type: Company) = MIN
Apple (Inc.)
- type: Company
- type: IT company
- product: iPhone 5
- ...
Apple (Fruit)
- type: Fruit
- genus: Malus
- ...
Samsung Electronics
- type: IT Company
- ...
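A possible reading of the differentiating power of a feature pair is sketched below. Several details are assumptions made for illustration only: value uniqueness is taken as the inverse of the average number of values per entity for a property, dissimilarity is one minus a `difflib` ratio, pairs with different properties get 0, and the subclass table is hypothetical.

```python
from difflib import SequenceMatcher

SUBCLASS_OF = {"IT company": "Company"}  # hypothetical subclass axiom

def entails_type(v, w):
    """True if class v is w or a (transitive) subclass of w."""
    while v is not None:
        if v == w:
            return True
        v = SUBCLASS_OF.get(v)
    return False

def value_uniqueness(prop, kb):
    """Assumed measure: 1 / (average number of values per entity for prop)."""
    counts = [sum(1 for p, _ in feats if p == prop) for feats in kb.values()]
    counts = [c for c in counts if c > 0]
    return len(counts) / sum(counts) if counts else 0.0

def differentiating_power(f, g, kb):
    (pf, vf), (pg, vg) = f, g
    if pf != pg:
        return 0.0  # assumption: only same-property pairs are compared
    if pf == "type" and (entails_type(vf, vg) or entails_type(vg, vf)):
        return 0.0  # b) entailment gives the minimum di
    # a) dissimilarity between values * property's value uniqueness
    dissim = 1.0 - SequenceMatcher(None, vf.lower(), vg.lower()).ratio()
    return dissim * value_uniqueness(pf, kb)

kb = {
    "Apple (Inc.)": {("type", "Company"), ("type", "IT company"), ("product", "iPhone 5")},
    "Apple (Fruit)": {("type", "Fruit"), ("genus", "Malus")},
    "Samsung Electronics": {("type", "IT company")},
}
di_fruit = differentiating_power(("type", "IT company"), ("type", "Fruit"), kb)
di_entailed = differentiating_power(("type", "IT company"), ("type", "Company"), kb)
print(di_fruit, di_entailed)
```

Because `type` is multi-valued in this toy data, its uniqueness is below 1, which dampens di even for very different values.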
Optimization goal (2): +differentiating power
• Formulated as a Quadratic Multidimensional
Knapsack Problem (QMKP)
weight of a feature: length
profit of a pair of features: differentiating power
Optimization goal (3): +relevance to context
• Relevance of a feature to the context of entity mention
• cosine similarity in the class vector model (cs)
Vector(context) = {Smartphone, IT company}
Vector(type: Fruit) = {Fruit}
Vector(product: iPhone 5) = {Smartphone}
cs(context, product: iPhone 5) = HIGH
• class weighting: class frequency – inverse instance frequency (CF-IIF)
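The class vector comparison can be sketched with sparse vectors and cosine similarity. This is an illustrative sketch: the helper names are made up, and the exact CF-IIF weighting is not given on the slide, so the example uses plain class frequencies with an optional per-class weight where such a scheme could be plugged in.

```python
import math
from collections import Counter

def class_vector(classes, weight=None):
    """Sparse vector: class -> frequency, optionally scaled by a per-class weight."""
    weight = weight or {}
    counts = Counter(classes)
    return {c: n * weight.get(c, 1.0) for c, n in counts.items()}

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(x * v.get(c, 0.0) for c, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

context = class_vector(["Smartphone", "IT company"])
fruit = class_vector(["Fruit"])
iphone5 = class_vector(["Smartphone"])

cs_iphone5 = cosine(context, iphone5)
cs_fruit = cosine(context, fruit)
print(cs_iphone5, cs_fruit)
```

The feature product: iPhone 5 shares the class Smartphone with the context and scores high, while type: Fruit shares no class and scores zero.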
Optimization goal (3): +relevance to context
• Solved by k Maximal Marginal Relevance (MMR) frameworks
• Features are iteratively selected.
• In each iteration, candidate features are re-ranked by
• relevance to context
• dissimilarity to selected features
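The iterative selection above can be sketched as one MMR loop per candidate entity. This is a minimal sketch under assumptions: the trade-off parameter `lam`, the toy relevance scores, and the toy similarity function are illustrative and do not come from the slides.

```python
def mmr_select(candidates, relevance, similarity, length, limit, lam=0.5):
    """Select features by MMR under a length limit.

    In each iteration the remaining features that still fit are re-ranked by
    relevance to the context minus similarity to already selected features.
    """
    selected, used = [], 0
    pool = list(candidates)
    while pool:
        fitting = [f for f in pool if used + length(f) <= limit]
        if not fitting:
            break

        def score(f):
            redundancy = max((similarity(f, s) for s in selected), default=0.0)
            return lam * relevance(f) - (1 - lam) * redundancy

        best = max(fitting, key=score)
        selected.append(best)
        used += length(best)
        pool.remove(best)
    return selected

features = ["type: Company", "type: IT company", "product: iPhone 5"]
toy_relevance = {"type: Company": 0.2, "type: IT company": 0.8, "product: iPhone 5": 0.7}

def toy_similarity(f, g):
    # Toy rule: two features with the same property count as fully similar.
    return 1.0 if f.split(":")[0] == g.split(":")[0] else 0.0

summary = mmr_select(features, toy_relevance.get, toy_similarity, len, limit=35)
print(summary)
```

After "type: IT company" is selected, "type: Company" is pushed down by its similarity to it, so "product: iPhone 5" is chosen next.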
Optimization goal (1+2+3)
• Formulated as a Quadratic Multidimensional
Knapsack Problem (QMKP)
Experiments: data sets
• Text corpora (with entity mentions linked to Wikipedia)
• AQUAINT
• IITB
• Knowledge base
• DBpedia
• Gold-standard links
• entity mentions → Wikipedia articles → DBpedia entities
Experiments: EL tasks
Apple (Inc.)
- type: Company
- product: iPhone 5
Apple (Corps)
- type: Company
- product: Let It Be
Apple (Fruit)
- type: Fruit
?
..., Apple has finally gone
into big-screen territory, …
1 target entity
• gold-standard
2 (very challenging) noise entities
• sharing a common name with the target entity,
obtained from Wikipedia’s disambiguation pages
Experiments: approaches
• Proposed approaches
• CHR: +characterizing power, -information overlap
• DFF: +differentiating power
• CNT: +relevance to context
• COMB: CHR+DFF+CNT
• Baseline approaches
• DESC: returns entire entity descriptions
• RELIN: a state-of-the-art entity summarization approach for
generic purposes
• average length of entity descriptions: 680 characters
• length limit for summaries: 100 characters (14.7%)
Experiments: extrinsic evaluation
• COMB is the only approach that achieved the following
statistically significant results on both data sets:
• accuracy (% of correct answers): COMB = DESC
• time: COMB < DESC (22-23% faster)
Experiments: intrinsic evaluation
• Statistically significant results on both data sets:
• human ratings: COMB > CHR > other approaches
Future work
• More extensive experiments
• to test cases where the target entity is not in the candidate list
• Summaries for automatic EL
Questions?

Más contenido relacionado

Similar a Summarizing Entity Descriptions for Effective and Efficient Human-centered Entity Linking

Presentation #1 – Discussion Questions (each question must be an.docx
Presentation #1 – Discussion Questions (each question must be an.docxPresentation #1 – Discussion Questions (each question must be an.docx
Presentation #1 – Discussion Questions (each question must be an.docx
harrisonhoward80223
 
Writing Sample - Equity Research - AAPL
Writing Sample - Equity Research - AAPLWriting Sample - Equity Research - AAPL
Writing Sample - Equity Research - AAPL
Michael Lin
 
3 P a g e Section 2 = Discussion Questions. Qu.docx
3  P a g e   Section 2 = Discussion Questions. Qu.docx3  P a g e   Section 2 = Discussion Questions. Qu.docx
3 P a g e Section 2 = Discussion Questions. Qu.docx
domenicacullison
 
General Environment, Forces of Competition, Future Improvement, Op.docx
General Environment, Forces of Competition, Future Improvement, Op.docxGeneral Environment, Forces of Competition, Future Improvement, Op.docx
General Environment, Forces of Competition, Future Improvement, Op.docx
shericehewat
 
Sheet1SWOT ScenarioCorp StrategyBusiness StrategyStrategy Implemen.docx
Sheet1SWOT ScenarioCorp StrategyBusiness StrategyStrategy Implemen.docxSheet1SWOT ScenarioCorp StrategyBusiness StrategyStrategy Implemen.docx
Sheet1SWOT ScenarioCorp StrategyBusiness StrategyStrategy Implemen.docx
maoanderton
 
Running head EXTERNAL ENVIRONMENT SCAN—APPLE .docx
Running head  EXTERNAL ENVIRONMENT SCAN—APPLE                    .docxRunning head  EXTERNAL ENVIRONMENT SCAN—APPLE                    .docx
Running head EXTERNAL ENVIRONMENT SCAN—APPLE .docx
joellemurphey
 

Similar a Summarizing Entity Descriptions for Effective and Efficient Human-centered Entity Linking (20)

Apple swot analysis 2016 (FREE)
Apple swot analysis 2016 (FREE)Apple swot analysis 2016 (FREE)
Apple swot analysis 2016 (FREE)
 
Apple manendra shukla
Apple manendra shuklaApple manendra shukla
Apple manendra shukla
 
Presentation #1 – Discussion Questions (each question must be an.docx
Presentation #1 – Discussion Questions (each question must be an.docxPresentation #1 – Discussion Questions (each question must be an.docx
Presentation #1 – Discussion Questions (each question must be an.docx
 
Writing Sample - Equity Research - AAPL
Writing Sample - Equity Research - AAPLWriting Sample - Equity Research - AAPL
Writing Sample - Equity Research - AAPL
 
Apple inc. Strategic Case Analysis
Apple inc. Strategic Case AnalysisApple inc. Strategic Case Analysis
Apple inc. Strategic Case Analysis
 
Apple CI Report 2015
Apple CI Report 2015Apple CI Report 2015
Apple CI Report 2015
 
The factors influencing the future business of apple
The factors influencing the future business of appleThe factors influencing the future business of apple
The factors influencing the future business of apple
 
3 P a g e Section 2 = Discussion Questions. Qu.docx
3  P a g e   Section 2 = Discussion Questions. Qu.docx3  P a g e   Section 2 = Discussion Questions. Qu.docx
3 P a g e Section 2 = Discussion Questions. Qu.docx
 
Apple Research Paper
Apple Research PaperApple Research Paper
Apple Research Paper
 
General Environment, Forces of Competition, Future Improvement, Op.docx
General Environment, Forces of Competition, Future Improvement, Op.docxGeneral Environment, Forces of Competition, Future Improvement, Op.docx
General Environment, Forces of Competition, Future Improvement, Op.docx
 
Apple Evolution
Apple EvolutionApple Evolution
Apple Evolution
 
APEX 5 Interactive Reports: Deep Dive and Upgrade Advice
APEX 5 Interactive Reports: Deep Dive and Upgrade AdviceAPEX 5 Interactive Reports: Deep Dive and Upgrade Advice
APEX 5 Interactive Reports: Deep Dive and Upgrade Advice
 
Apple Company Review, February 2017 from OLMA NEXT Ltd.
Apple Company Review, February 2017 from OLMA NEXT Ltd.Apple Company Review, February 2017 from OLMA NEXT Ltd.
Apple Company Review, February 2017 from OLMA NEXT Ltd.
 
Motivation
MotivationMotivation
Motivation
 
Apple Inc.
Apple Inc.Apple Inc.
Apple Inc.
 
Apple Inc.
Apple Inc.Apple Inc.
Apple Inc.
 
Sheet1SWOT ScenarioCorp StrategyBusiness StrategyStrategy Implemen.docx
Sheet1SWOT ScenarioCorp StrategyBusiness StrategyStrategy Implemen.docxSheet1SWOT ScenarioCorp StrategyBusiness StrategyStrategy Implemen.docx
Sheet1SWOT ScenarioCorp StrategyBusiness StrategyStrategy Implemen.docx
 
Apple Inc. Case Analysis
Apple Inc. Case AnalysisApple Inc. Case Analysis
Apple Inc. Case Analysis
 
Running head EXTERNAL ENVIRONMENT SCAN—APPLE .docx
Running head  EXTERNAL ENVIRONMENT SCAN—APPLE                    .docxRunning head  EXTERNAL ENVIRONMENT SCAN—APPLE                    .docx
Running head EXTERNAL ENVIRONMENT SCAN—APPLE .docx
 
Apple Inc. and it's implementation of IoT
Apple Inc. and it's implementation of IoTApple Inc. and it's implementation of IoT
Apple Inc. and it's implementation of IoT
 

Más de Gong Cheng

常识推理在地理自动答题中的需求分析
常识推理在地理自动答题中的需求分析常识推理在地理自动答题中的需求分析
常识推理在地理自动答题中的需求分析
Gong Cheng
 
HIEDS: A Generic and Efficient Approach to Hierarchical Dataset Summarization
HIEDS: A Generic and Efficient Approach to Hierarchical Dataset SummarizationHIEDS: A Generic and Efficient Approach to Hierarchical Dataset Summarization
HIEDS: A Generic and Efficient Approach to Hierarchical Dataset Summarization
Gong Cheng
 
Taking up the Gaokao Challenge: An Information Retrieval Approach
Taking up the Gaokao Challenge: An Information Retrieval ApproachTaking up the Gaokao Challenge: An Information Retrieval Approach
Taking up the Gaokao Challenge: An Information Retrieval Approach
Gong Cheng
 

Más de Gong Cheng (20)

Towards Content-Based Dataset Search - Test Collections and Beyond
Towards Content-Based Dataset Search - Test Collections and BeyondTowards Content-Based Dataset Search - Test Collections and Beyond
Towards Content-Based Dataset Search - Test Collections and Beyond
 
从元数据到内容——新一代知识图谱搜索引擎初探
从元数据到内容——新一代知识图谱搜索引擎初探从元数据到内容——新一代知识图谱搜索引擎初探
从元数据到内容——新一代知识图谱搜索引擎初探
 
知识图谱中的实体摘要:基于神经网络的方法
知识图谱中的实体摘要:基于神经网络的方法知识图谱中的实体摘要:基于神经网络的方法
知识图谱中的实体摘要:基于神经网络的方法
 
Generating Compact and Relaxable Answers to Keyword Queries over Knowledge Gr...
Generating Compact and Relaxable Answers to Keyword Queries over Knowledge Gr...Generating Compact and Relaxable Answers to Keyword Queries over Knowledge Gr...
Generating Compact and Relaxable Answers to Keyword Queries over Knowledge Gr...
 
知识图谱中的关联搜索
知识图谱中的关联搜索知识图谱中的关联搜索
知识图谱中的关联搜索
 
面向高考机器人的知识表示与推理初探
面向高考机器人的知识表示与推理初探面向高考机器人的知识表示与推理初探
面向高考机器人的知识表示与推理初探
 
知识图谱中的实体关联搜索
知识图谱中的实体关联搜索知识图谱中的实体关联搜索
知识图谱中的实体关联搜索
 
Semantic Data Retrieval: Search, Ranking, and Summarization
Semantic Data Retrieval: Search, Ranking, and SummarizationSemantic Data Retrieval: Search, Ranking, and Summarization
Semantic Data Retrieval: Search, Ranking, and Summarization
 
Semantic Web related top conference review
Semantic Web related top conference reviewSemantic Web related top conference review
Semantic Web related top conference review
 
Relatedness-based Multi-Entity Summarization
Relatedness-based Multi-Entity SummarizationRelatedness-based Multi-Entity Summarization
Relatedness-based Multi-Entity Summarization
 
Generating Illustrative Snippets for Open Data on the Web
Generating Illustrative Snippets for Open Data on the WebGenerating Illustrative Snippets for Open Data on the Web
Generating Illustrative Snippets for Open Data on the Web
 
常识推理在地理自动答题中的需求分析
常识推理在地理自动答题中的需求分析常识推理在地理自动答题中的需求分析
常识推理在地理自动答题中的需求分析
 
Efficient Algorithms for Association Finding and Frequent Association Pattern...
Efficient Algorithms for Association Finding and Frequent Association Pattern...Efficient Algorithms for Association Finding and Frequent Association Pattern...
Efficient Algorithms for Association Finding and Frequent Association Pattern...
 
Summarizing Semantic Data
Summarizing Semantic DataSummarizing Semantic Data
Summarizing Semantic Data
 
HIEDS: A Generic and Efficient Approach to Hierarchical Dataset Summarization
HIEDS: A Generic and Efficient Approach to Hierarchical Dataset SummarizationHIEDS: A Generic and Efficient Approach to Hierarchical Dataset Summarization
HIEDS: A Generic and Efficient Approach to Hierarchical Dataset Summarization
 
Taking up the Gaokao Challenge: An Information Retrieval Approach
Taking up the Gaokao Challenge: An Information Retrieval ApproachTaking up the Gaokao Challenge: An Information Retrieval Approach
Taking up the Gaokao Challenge: An Information Retrieval Approach
 
知识的摘要
知识的摘要知识的摘要
知识的摘要
 
Explass: Exploring Associations between Entities via Top-K Ontological Patter...
Explass: Exploring Associations between Entities via Top-K Ontological Patter...Explass: Exploring Associations between Entities via Top-K Ontological Patter...
Explass: Exploring Associations between Entities via Top-K Ontological Patter...
 
Facilitating Human Intervention in Coreference Resolution with Comparative En...
Facilitating Human Intervention in Coreference Resolution with Comparative En...Facilitating Human Intervention in Coreference Resolution with Comparative En...
Facilitating Human Intervention in Coreference Resolution with Comparative En...
 
Towards Exploratory Relationship Search: A Clustering-based Approach
Towards Exploratory Relationship Search: A Clustering-based ApproachTowards Exploratory Relationship Search: A Clustering-based Approach
Towards Exploratory Relationship Search: A Clustering-based Approach
 

Último

Uncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac FolorunsoUncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac Folorunso
Kayode Fayemi
 
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxChiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
raffaeleoman
 
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
Sheetaleventcompany
 

Último (20)

Presentation on Engagement in Book Clubs
Presentation on Engagement in Book ClubsPresentation on Engagement in Book Clubs
Presentation on Engagement in Book Clubs
 
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara ServicesVVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
VVIP Call Girls Nalasopara : 9892124323, Call Girls in Nalasopara Services
 
Uncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac FolorunsoUncommon Grace The Autobiography of Isaac Folorunso
Uncommon Grace The Autobiography of Isaac Folorunso
 
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptxMohammad_Alnahdi_Oral_Presentation_Assignment.pptx
Mohammad_Alnahdi_Oral_Presentation_Assignment.pptx
 
Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)Introduction to Prompt Engineering (Focusing on ChatGPT)
Introduction to Prompt Engineering (Focusing on ChatGPT)
 
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
Governance and Nation-Building in Nigeria: Some Reflections on Options for Po...
 
ANCHORING SCRIPT FOR A CULTURAL EVENT.docx
ANCHORING SCRIPT FOR A CULTURAL EVENT.docxANCHORING SCRIPT FOR A CULTURAL EVENT.docx
ANCHORING SCRIPT FOR A CULTURAL EVENT.docx
 
ICT role in 21st century education and it's challenges.pdf
ICT role in 21st century education and it's challenges.pdfICT role in 21st century education and it's challenges.pdf
ICT role in 21st century education and it's challenges.pdf
 
My Presentation "In Your Hands" by Halle Bailey
My Presentation "In Your Hands" by Halle BaileyMy Presentation "In Your Hands" by Halle Bailey
My Presentation "In Your Hands" by Halle Bailey
 
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptxChiulli_Aurora_Oman_Raffaele_Beowulf.pptx
Chiulli_Aurora_Oman_Raffaele_Beowulf.pptx
 
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort ServiceBDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
BDSM⚡Call Girls in Sector 93 Noida Escorts >༒8448380779 Escort Service
 
Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510Thirunelveli call girls Tamil escorts 7877702510
Thirunelveli call girls Tamil escorts 7877702510
 
SaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, YardstickSaaStr Workshop Wednesday w/ Lucas Price, Yardstick
SaaStr Workshop Wednesday w/ Lucas Price, Yardstick
 
Report Writing Webinar Training
Report Writing Webinar TrainingReport Writing Webinar Training
Report Writing Webinar Training
 
Causes of poverty in France presentation.pptx
Causes of poverty in France presentation.pptxCauses of poverty in France presentation.pptx
Causes of poverty in France presentation.pptx
 
lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.lONG QUESTION ANSWER PAKISTAN STUDIES10.
lONG QUESTION ANSWER PAKISTAN STUDIES10.
 
Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...
Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...
Busty Desi⚡Call Girls in Sector 51 Noida Escorts >༒8448380779 Escort Service-...
 
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdfThe workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
The workplace ecosystem of the future 24.4.2024 Fabritius_share ii.pdf
 
Dreaming Music Video Treatment _ Project & Portfolio III
Dreaming Music Video Treatment _ Project & Portfolio IIIDreaming Music Video Treatment _ Project & Portfolio III
Dreaming Music Video Treatment _ Project & Portfolio III
 
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
No Advance 8868886958 Chandigarh Call Girls , Indian Call Girls For Full Nigh...
 

Summarizing Entity Descriptions for Effective and Efficient Human-centered Entity Linking

  • 1. Summarizing Entity Descriptions for Effective and Efficient Human-centered Entity Linking Gong Cheng, Danyun Xu, Yuzhong Qu Websoft Research Group State Key Laboratory for Novel Software Technology Nanjing University, China
  • 2. Entity Linking (EL) But with the release of the iPhone 6 and the 6 Plus phablet, Apple has finally gone into big-screen territory, giving Samsung a challenge in the category that the company has been dominating for some time now. Text Knowledge Base iPhone 6 - type: Smartphone - ... Samsung Electronics - type: IT Company - ... Apple (Inc.) - type: Company - type: IT company - product: iPhone 5 - ... Apple (Fruit) - type: Fruit - genus: Malus - ... ? Candidate entities
  • 3. Human-centered EL is needed But with the release of the iPhone 6 and the 6 Plus phablet, Apple has finally gone into big-screen territory, giving Samsung a challenge in the category that the company has been dominating for some time now. Text Knowledge Base iPhone 6 - type: Smartphone - ... Samsung Electronics - type: IT Company - ... Apple (Inc.) - type: Company - type: IT company - product: iPhone 5 - ... Apple (Fruit) - type: Fruit - genus: Malus - ... ? Candidate entities • for defining gold standard, • for crowdsourced EL.
  • 4. entity description: set of property-value pairs (called features) But with the release of the iPhone 6 and the 6 Plus phablet, Apple has finally gone into big-screen territory, giving Samsung a challenge in the category that the company has been dominating for some time now. Text Knowledge Base iPhone 6 - type: Smartphone - ... Samsung Electronics - type: IT Company - ... Apple (Inc.) - type: Company - type: IT company - product: iPhone 5 - ... Apple (Fruit) - type: Fruit - genus: Malus - ... ? Candidate entities
  • 6. Short, extractive summaries are adequate for human-centered EL. Apple (Inc.) - type: Company - product: iPhone 5 Apple (Corps) - type: Company - product: Let It Be Apple (Fruit) - type: Fruit summary of k candidate entity descriptions: k subsets of features (subject to a length limit) ?… Apple
  • 7. Short, extractive summaries are adequate for human-centered EL. Apple (Inc.) - type: Company - product: iPhone 5 Apple (Corps) - type: Company - product: Let It Be Apple (Fruit) - type: Fruit ?… Apple summarizing entity descriptions  combinatorial optimization summary of k candidate entity descriptions: k subsets of features (subject to a length limit)
  • 8. Optimization goal (1) +characterizing power, -information overlap • Characterizing power of a feature (ch) ch(type: IT company) < ch(product: iPhone 5) Apple (Inc.) Samsung Electronics Apple (Inc.) - type: Company - type: IT company - product: iPhone 5 - ...
  • 9. Optimization goal (1) +characterizing power, -information overlap • Characterizing power of a feature (ch) ch(type: IT company) < ch(product: iPhone 5) Apple (Inc.) Samsung Electronics 𝑐ℎ 𝑓 = − log number of entities having 𝑓 number of all entities Apple (Inc.) - type: Company - type: IT company - product: iPhone 5 - ...
  • 10. Optimization goal (1) +characterizing power, -information overlap • Information overlap between features (ov) a) logical inference entailment = maximized ov ov(type: IT company, type: Company) = MAX b) string/numerical similarity Apple (Inc.) - type: Company - type: IT company - product: iPhone 5 - ...
  • 11. Optimization goal (1) +characterizing power, -information overlap • Information overlap between features (ov) a) logical inference entailment  maximized ov ov(type: IT company, type: Company) = MAX b) string/numerical similarity Apple (Inc.) - type: Company - type: IT company - product: iPhone 5 - ...
  • 12. Optimization goal (1) +characterizing power, -information overlap • Information overlap between features (ov) a) logical inference entailment  maximized ov ov(type: IT company, type: Company) = MAX b) string/numerical similarity ov = max{similarity between properties, similarity between values} ov(type: IT company, product: iPhone 5) = SMALL Apple (Inc.) - type: Company - type: IT company - product: iPhone 5 - ...
  • 13. Optimization goal (1) +characterizing power, -information overlap • Formulated as k Quadratic Knapsack Problems (QKP) weight of a feature: length profit of a pair of features: to maximize characterizing power to minimize information overlap
  • 14. Optimization goal (2): +differentiating power • Differentiating power of a pair of features (di) a) string/numerical dissimilarity di = property’s value uniqueness * dissimilarity between values di(type: IT company, type: Fruit) = SMALL*LARGE = MEDIUM (Single-valued properties are more useful.) b) logical inference entailment = minimized di di(type: IT company, type: Company) = MIN Apple (Inc.) - type: Company - type: IT company - product: iPhone 5 - ... Apple (Fruit) - type: Fruit - genus: Malus - ... Samsung Electronics - type: IT Company - ...
  • 15. Optimization goal (2): +differentiating power • Differentiating power of a pair of features (di) a) string/numerical dissimilarity di = dissimilarity between values * property’s value uniqueness di(type: IT company, type: Fruit) = LARGE*SMALL = MEDIUM (Single-valued properties are more useful.) b) logical inference entailment = minimized di di(type: IT company, type: Company) = MIN Apple (Inc.) - type: Company - type: IT company - product: iPhone 5 - ... Apple (Fruit) - type: Fruit - genus: Malus - ... Samsung Electronics - type: IT Company - ...
  • 16. Optimization goal (2): +differentiating power
      • Differentiating power of a pair of features (di)
          a) string/numerical dissimilarity
             di = dissimilarity between values * property's value uniqueness
             di(type: IT company, type: Fruit) = LARGE*SMALL = MEDIUM
             (Single-valued properties are more useful.)
          b) logical inference: entailment → minimized di
             di(type: IT company, type: Company) = MIN
      Apple (Inc.)
      - type: Company
      - type: IT company
      - product: iPhone 5
      - ...
      Apple (Fruit)
      - type: Fruit
      - genus: Malus
      - ...
      Samsung Electronics
      - type: IT Company
      - ...
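The differentiating-power measure can be sketched as follows. This is only an illustration under stated assumptions: the per-property statistics in `AVG_VALUES_PER_ENTITY` are made up, uniqueness is assumed to be the reciprocal of the average number of values per entity, `difflib.SequenceMatcher` stands in for the unspecified dissimilarity metric, features with different properties are assumed to score 0, and `genus: Pyrus` is an invented comparison value.

```python
from difflib import SequenceMatcher

# Hypothetical statistics: average number of values per entity (assumption).
AVG_VALUES_PER_ENTITY = {"type": 4.0, "genus": 1.0}

# Hypothetical subclass table standing in for logical inference (assumption).
SUBCLASS_OF = {"IT company": "Company"}


def uniqueness(prop: str) -> float:
    """Assumed definition: single-valued properties get the top score of 1."""
    return 1.0 / AVG_VALUES_PER_ENTITY.get(prop, 1.0)


def dissimilarity(a: str, b: str) -> float:
    """Stand-in string dissimilarity in [0, 1]."""
    return 1.0 - SequenceMatcher(None, a.lower(), b.lower()).ratio()


def is_subclass(child: str, ancestor: str) -> bool:
    while child in SUBCLASS_OF:
        child = SUBCLASS_OF[child]
        if child == ancestor:
            return True
    return False


def di(f, g) -> float:
    """Differentiating power of two features taken from different entities."""
    (p1, v1), (p2, v2) = f, g
    if p1 != p2:
        return 0.0  # assumption: only same-property features are compared
    # b) entailment between the values gives the minimum score
    if p1 == "type" and (is_subclass(v1, v2) or is_subclass(v2, v1)):
        return 0.0
    # a) value dissimilarity scaled by how unique the property's values are
    return dissimilarity(v1, v2) * uniqueness(p1)


type_pair = di(("type", "IT company"), ("type", "Fruit"))
genus_pair = di(("genus", "Malus"), ("genus", "Pyrus"))
entailed_pair = di(("type", "IT company"), ("type", "Company"))
```

Under these assumptions the multi-valued `type` pair scores lower than the single-valued `genus` pair, which mirrors the slide's remark that single-valued properties are more useful.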
  • 17. Optimization goal (2): +differentiating power
      • Formulated as a Quadratic Multidimensional Knapsack Problem (QMKP)
          weight of a feature: length
          profit of a pair of features: differentiating power
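One plausible reading of this formulation, written out for illustration only. The symbols are assumptions: F_e is the feature set of candidate entity e, x_i marks whether feature i is selected, w_i is its length, C is the per-summary length limit, and d_ij is the differentiating power of features i and j. Each candidate entity contributes one capacity constraint, which is what makes the knapsack multidimensional.

```latex
\begin{aligned}
\max_{x}\quad & \sum_{i < j} d_{ij}\, x_i x_j \\
\text{s.t.}\quad & \sum_{i \in F_e} w_i x_i \le C, \quad e = 1, \dots, k, \\
& x_i \in \{0, 1\} \quad \text{for all features } i.
\end{aligned}
```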
  • 18. Optimization goal (3): +relevance to context
      • Relevance of a feature to the context of entity mention
          • cosine similarity in the class vector model (cs)
              Vector(context) = {Smartphone, IT company}
              Vector(type: Fruit) = {Fruit}
              Vector(product: iPhone 5) = {Smartphone}
              cs(context, product: iPhone 5) = HIGH
          • class weighting: class frequency – inverse instance frequency (CF-IIF)
      Text
      But with the release of the iPhone 6 and the 6 Plus phablet, Apple has finally gone into big-screen territory, giving Samsung a challenge in the category that the company has been dominating for some time now.
      Knowledge Base
      iPhone 6
      - type: Smartphone
      - ...
      Samsung Electronics
      - type: IT Company
      - ...
      Candidate entities (?)
      Apple (Inc.)
      - type: Company
      - type: IT company
      - product: iPhone 5
      - ...
      Apple (Fruit)
      - type: Fruit
      - genus: Malus
      - ...
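A small sketch of cosine similarity over weighted class vectors. The slides only name the CF-IIF weighting scheme, so the TF-IDF-style formula used here is an assumption, and the instance counts in `INSTANCES_PER_CLASS` and `TOTAL_INSTANCES` are made-up numbers.

```python
import math
from collections import Counter

# Hypothetical knowledge-base statistics (assumption, not from the slides).
INSTANCES_PER_CLASS = {"Smartphone": 500, "IT company": 2_000, "Fruit": 1_000}
TOTAL_INSTANCES = 100_000


def cf_iif(classes):
    """Weight each class by class frequency times inverse instance frequency.

    Assumed TF-IDF-style formula; the slides do not give the definition.
    """
    counts = Counter(classes)
    return {
        c: n * math.log(TOTAL_INSTANCES / INSTANCES_PER_CLASS[c])
        for c, n in counts.items()
    }


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Class vectors taken from the slide's example.
context = cf_iif(["Smartphone", "IT company"])
cs_iphone = cosine(context, cf_iif(["Smartphone"]))  # product: iPhone 5
cs_fruit = cosine(context, cf_iif(["Fruit"]))        # type: Fruit
```

With these numbers, `product: iPhone 5` shares a class with the context and scores high, while `type: Fruit` shares none and scores zero.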
  • 23. Optimization goal (3): +relevance to context
      • Solved by k Maximal Marginal Relevance (MMR) frameworks
          • Features are iteratively selected.
          • In each iteration, candidate features are re-ranked by
              • relevance to context
              • dissimilarity to selected features
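The iterative selection described on this slide can be sketched as a greedy MMR loop with a length budget. The trade-off parameter `lam`, the toy relevance scores, the same-property similarity function, and the 40-character budget are all assumptions chosen for illustration; the slides do not give these details.

```python
def feature_length(f) -> int:
    """Length of a feature rendered as 'property: value'."""
    return len(f"{f[0]}: {f[1]}")


def mmr_select(features, relevance, similarity, budget, lam=0.7):
    """Greedy MMR: repeatedly pick the feature with the best trade-off between
    relevance to context and redundancy with already selected features."""
    selected, used = [], 0
    candidates = list(features)
    while candidates:
        # Drop candidates that no longer fit into the remaining budget.
        candidates = [f for f in candidates if used + feature_length(f) <= budget]
        if not candidates:
            break

        def score(f):
            redundancy = max((similarity(f, s) for s in selected), default=0.0)
            return lam * relevance[f] - (1 - lam) * redundancy

        best = max(candidates, key=score)
        selected.append(best)
        used += feature_length(best)
        candidates.remove(best)
    return selected


# Toy inputs (hypothetical scores).
features = [("type", "Company"), ("type", "IT company"), ("product", "iPhone 5")]
relevance = {features[0]: 0.3, features[1]: 0.9, features[2]: 0.8}


def same_property(f, g) -> float:
    """Toy similarity: features sharing a property count as fully similar."""
    return 1.0 if f[0] == g[0] else 0.0


summary = mmr_select(features, relevance, same_property, budget=40)
```

In this toy run, `type: IT company` is chosen first for its relevance, `product: iPhone 5` follows because it is not redundant with it, and `type: Company` no longer fits the budget.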
  • 24. Optimization goal (1+2+3)
      • Formulated as a Quadratic Multidimensional Knapsack Problem (QMKP)
  • 25. Experiments: data sets
      • Text corpora (with entity mentions linked to Wikipedia)
          • AQUAINT
          • IITB
      • Knowledge base
          • DBpedia
      • Gold-standard links
          • entity mentions → Wikipedia articles → DBpedia entities
  • 26. Experiments: EL tasks
      Text
      ..., Apple has finally gone into big-screen territory, ...
      Candidate entities (?)
      Apple (Inc.)
      - type: Company
      - product: iPhone 5
      Apple (Corps)
      - type: Company
      - product: Let It Be
      Apple (Fruit)
      - type: Fruit
      • 1 target entity
          • gold-standard
      • 2 (very challenging) noise entities
          • sharing a common name with the target entity, obtained from Wikipedia's disambiguation pages
  • 27. Experiments: approaches
      • Proposed approaches
          • CHR: +characterizing power, -information overlap
          • DFF: +differentiating power
          • CNT: +relevance to context
          • COMB: CHR+DFF+CNT
      • Baseline approaches
          • DESC: returns entire entity descriptions
          • RELIN: a state-of-the-art entity summarization approach for generic purposes
      • average length of entity descriptions: 680 characters
      • length limit for summaries: 100 characters (14.7%)
  • 28. Experiments: extrinsic evaluation
      • COMB is the only approach that achieved the following statistically significant results on both data sets:
          • accuracy (% of correct answers): COMB = DESC
          • time: COMB < DESC (22-23% faster)
  • 29. Experiments: intrinsic evaluation
      • Statistically significant results on both data sets:
          • human ratings: COMB > CHR > other approaches
  • 30. Future work
      • More extensive experiments
          • to test with not-in-the-list
      • Summaries for automatic EL