Harvesting Knowledge from Social Networks: Extracting Typed Relationships among Entities

Andrea Caielli, Marco Brambilla, Stefano Ceri, Florian Daniel
marco.brambilla@polimi.it
@marcobrambi
SoWeMine Workshop @ ICWE 2017, Rome, Italy

Agenda
(1) Context
(2) Objectives
(3) Method
(4) Experiments and Validation
(5) Visualization and Exploration
(6) Conclusions

Ontology is the philosophical study of
the nature of being, becoming,
existence or reality
and the basic categories of being and their
relations.

Formalizing new knowledge is hard
Only high frequency emerges
The long tail challenge

Sourcing the Long Tail
Famous Emerging
…

Objective
Extraction of relationships
among entities
Reconstruct a typed graph of entities & relationships
Represent the knowledge contained in social data
No need for a-priori domain knowledge

Knowledge Enrichment Setting
[Figure: graph with two layers. Instances: High Frequency (HF) Entities 1-5 and Low Frequency (LF) Entities 1-4, with Property1 and Property2. Types: Type1, Type11, Type111, Type2, linked to instances by <<instanceof>> edges. Unknown elements are marked "??". Legend: Seed Entity, Seed Type, Type of interest, Expert inputs.]
Enrichment problems
Relations HF - LF entities
Relations LF - LF entities
Typing of LF entities
Extraction of new LF entities
Finding attribute values

Challenge and Innovation
Highly unstructured social data
(tweets and Facebook posts)
No reliable grammar structures

Analysis Pipeline
(0) Preprocessing
(1) Entity Extraction
(2) Relationship Extraction
(3) Relationship Aggregation
(4) Relationship Typing
(1) Evolution of work presented in:
M. Brambilla, S. Ceri, E. Della Valle, R. Volonterio, and F. Acero Salazar.
“Extracting Emerging Knowledge from Social Media”, WWW 2017.

(0) Preprocessing
Text cleaning and enrichment
+ Traditional text preprocessing (stemming, …)
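The slide does not list the individual cleaning steps, so the following is only a minimal sketch of what cleaning a tweet or post might look like. The specific rules (dropping URLs, keeping mention and hashtag text, splitting camel-case hashtags) and the name `clean_post` are assumptions for illustration; stemming is left out.

```python
import re

# Hypothetical cleaning rules; the slides do not specify the actual ones.
URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@(\w+)")
HASHTAG_RE = re.compile(r"#(\w+)")
CAMEL_RE = re.compile(r"(?<=[a-z])(?=[A-Z])")  # boundary inside "BlackSails"


def clean_post(text: str) -> str:
    """Remove URLs, keep mention/hashtag text, and normalize whitespace."""
    text = URL_RE.sub(" ", text)
    # Keep the handle text: it may name an entity of interest.
    text = MENTION_RE.sub(r"\1", text)
    # Drop '#' and split camel-case hashtags into separate words.
    text = HASHTAG_RE.sub(lambda m: CAMEL_RE.sub(" ", m.group(1)), text)
    return " ".join(text.split())


print(clean_post("Watch #BlackSails tonight! http://t.co/abc @History"))
```

Splitting hashtags before entity extraction is one plausible way to expose entity names that would otherwise stay hidden inside a single token.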

(1) Entity Extraction
Entity identification and semantic typing
Exploiting:
Stanford CoreNLP NER
Dandelion API

(2) Relationship Extraction
Baseline with Stanford OpenIE for triple extraction:
Several issues:
- Meaningless relations
- Wrong relations
- Multiple relations

(3) Relationship Aggregation
Sails fans. Season 2 airs on May 24th on History on D Stv Jag Comms
Too many answers
for the same question!
Empirical rules
{"entity1":"Season 2",
"relationship":"air on",
"entity2":"May 24th"}
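The empirical rules themselves are not spelled out on the slide. As a rough sketch of the idea, the snippet below reduces several candidate triples for one post to a single triple per entity pair. Both rules used here (keep only triples whose two arguments are recognized entities, then prefer the shortest relationship phrase) are assumptions, and the candidate triples are made up for illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    entity1: str
    relationship: str
    entity2: str


def aggregate(candidates, entities):
    """Keep one triple per (entity1, entity2) pair.

    Hypothetical rules, not the authors' actual ones:
    1. both arguments must be recognized entities;
    2. among competing triples, the shortest relationship phrase wins.
    """
    best = {}
    for t in candidates:
        if t.entity1 not in entities or t.entity2 not in entities:
            continue
        key = (t.entity1, t.entity2)
        current = best.get(key)
        if current is None or len(t.relationship.split()) < len(current.relationship.split()):
            best[key] = t
    return list(best.values())


# Made-up candidate triples for the example tweet.
candidates = [
    Triple("Season 2", "air on May 24th on", "History"),
    Triple("Season 2", "air", "May 24th on History"),
    Triple("Season 2", "be air on", "May 24th"),
    Triple("Season 2", "air on", "May 24th"),
]
entities = {"Season 2", "May 24th"}

result = aggregate(candidates, entities)
print(result[0]._asdict())
```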

(4) Relationship Typing (A): Synonyms
Exploiting synsets based on WordNet 3.1
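A minimal sketch of synonym-based typing, assuming relationships are single verbs. The small hand-written `SYNONYM_SETS` table is only a stand-in for WordNet synsets, and choosing the alphabetically first member as the group label is an arbitrary illustrative choice. The triples are made up.

```python
from collections import defaultdict

# Stand-in for WordNet synsets (illustrative, not taken from WordNet 3.1).
SYNONYM_SETS = [
    {"begin", "start", "commence"},
    {"watch", "view", "see"},
    {"love", "adore"},
]


def canonical(verb: str) -> str:
    """Return a label for the verb's synonym group (alphabetically first member)."""
    for group in SYNONYM_SETS:
        if verb in group:
            return min(group)
    return verb  # verbs without a known group keep their own label


def group_by_synonyms(triples):
    """Group (entity1, relationship, entity2) triples by synonym label."""
    grouped = defaultdict(list)
    for e1, rel, e2 in triples:
        grouped[canonical(rel)].append((e1, e2))
    return dict(grouped)


# Made-up triples for illustration.
triples = [
    ("Season 2", "start", "May 24th"),
    ("Season 3", "begin", "June"),
    ("fans", "love", "Teen Wolf"),
]
groups = group_by_synonyms(triples)
print(groups)
```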

(4) Relationship Typing (B): Matching Types

(4) Relationship Typing (C): Linguistics
Based on VerbNet
Groupings of verbs based on syntactic and semantic properties
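To illustrate typing by verb class, the sketch below counts how often each class occurs among extracted relationship verbs. The `VERB_CLASSES` table and its class labels are illustrative placeholders, not actual VerbNet class identifiers; a verb may belong to more than one class, so each verb maps to a list.

```python
from collections import Counter

# Placeholder verb-to-class table (labels are NOT real VerbNet class ids).
VERB_CLASSES = {
    "air": ["communication"],
    "announce": ["communication"],
    "start": ["aspectual"],
    "begin": ["aspectual"],
}


def class_occurrences(verbs):
    """Count occurrences of each verb class over a list of relationship verbs."""
    counts = Counter()
    for verb in verbs:
        # Verbs missing from the table are counted separately.
        for verb_class in VERB_CLASSES.get(verb, ["unclassified"]):
            counts[verb_class] += 1
    return counts


counts = class_occurrences(["air", "announce", "start", "kill"])
print(counts.most_common())
```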

Experiments
TV Series: Black Sails, Teen Wolf, Vikings
Milan Fashion Week
Rugby games

Domains and quality of results: summary

Relationships and Verb Classes

Example: Teen Wolf
[Chart: "Teen Wolf Synonyms Classes"; y-axis: Occurrences, 0 to 800]

Example: Teen Wolf
[Chart: "Teen Wolf Synonyms Classes"; y-axis: Occurrences, 0 to 800]
[Chart: "Teen Wolf VerbNet Classes"; axis: Occurrences]

Overall Quality Indexes of Entity and Relationships Extraction

Motivation
The resulting semantic models are extremely large and hard to interpret
Example: Black Sails collection, containing 1243 entities and 2025 relations.

Exploration
Visualization
Filtering
Navigation

Exploration
Visualization
RELATIONSHIP Filtering
Navigation
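A minimal sketch of relationship filtering, assuming the extracted model is stored as a list of (source, relationship type, target) edges. The function name, the edge representation, and the sample edges are all assumptions for illustration; the actual exploration tool is not described in the slides.

```python
def filter_by_relationship(edges, keep_types):
    """Return the nodes and edges that remain when only some relationship types are kept."""
    kept = [edge for edge in edges if edge[1] in keep_types]
    # Only entities touched by a surviving edge stay visible.
    nodes = {node for source, _, target in kept for node in (source, target)}
    return nodes, kept


# Made-up typed edges for illustration.
edges = [
    ("Season 2", "air on", "May 24th"),
    ("fans", "love", "Black Sails"),
    ("Season 2", "belong to", "Black Sails"),
]
nodes, kept = filter_by_relationship(edges, {"air on", "belong to"})
print(sorted(nodes))
```

Filtering on relationship type shrinks the node set as well, which is one way to make a graph with thousands of relations readable.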

Examples
Milano Fashion Week
Generated graph

Examples
Milano Fashion Week
Generated graph

Conclusions
Extraction of relevant emerging relationships is feasible even for extremely unstructured and informal content (social media)
Still a long way to perfect extraction:
• N-ary relations
• Time-dependency
• Poor typing of entities in ontologies

THANKS!
QUESTIONS?
Marco Brambilla @marcobrambi marco.brambilla@polimi.it
http://datascience.deib.polimi.it http://home.deib.polimi.it/marcobrambi
