Knowledge bases like DBpedia, Yago or Google's Knowledge
Graph contain huge amounts of ontological knowledge harvested from
(semi-)structured, curated data sources, such as relational databases or
XML and HTML documents. Yet, the Web is full of knowledge that is
not curated and/or structured and, hence, not easily indexed, for
example social data. Most work so far in this context has been dedicated
to the extraction of entities, i.e., people, things or concepts. This poster
describes our work toward the extraction of relationships among entities.
The objective is to reconstruct a typed graph of entities and
relationships that represents the knowledge contained in social data,
without the need for a-priori domain knowledge. Experiments with real
datasets show promising performance across a variety of domains.
The key distinguishing feature of the work is its focus on highly
unstructured social data (tweets and Facebook posts) without reliable
grammar structures. Traditional relation extraction approaches, whether
supervised, semi-supervised or unsupervised, commonly assume the
availability of grammatically correct language corpora.
Harvesting Knowledge from Social Networks: Extracting Typed Relationships among Entities
1. Harvesting Knowledge from Social Networks: Extracting Typed Relationships among Entities
Andrea Caielli, Marco Brambilla, Stefano Ceri, Florian Daniel
marco.brambilla@polimi.it
@marcobrambi
SoWeMine Workshop @ ICWE 2017, Rome, Italy
8. Objective
Extraction of relationships among entities
Reconstruct a typed graph of entities & relationships
Represent the knowledge contained in social data
No need for a-priori domain knowledge
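The typed graph named in the objective can be pictured as a small data structure. This is only a minimal sketch under assumptions: the class names `TypedGraph` and `Relationship` and their fields are hypothetical and do not come from the slides. Types are optional because, per the objective, they are not known a priori.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Relationship:
    """A directed, labeled edge between two entities (hypothetical structure)."""
    source: str
    label: str
    target: str
    rel_type: Optional[str] = None  # unknown until a typing step assigns it


@dataclass
class TypedGraph:
    """Entities mapped to an optional type, plus the relationships among them."""
    entity_types: Dict[str, Optional[str]] = field(default_factory=dict)
    relationships: List[Relationship] = field(default_factory=list)

    def add_entity(self, name: str, entity_type: Optional[str] = None) -> None:
        # Never overwrite a known type with "unknown" (None).
        if entity_type is not None or name not in self.entity_types:
            self.entity_types[name] = entity_type

    def add_relationship(self, source: str, label: str, target: str,
                         rel_type: Optional[str] = None) -> None:
        # Make sure both endpoints exist as (possibly untyped) entities.
        self.add_entity(source)
        self.add_entity(target)
        self.relationships.append(Relationship(source, label, target, rel_type))
```

Entities added only as relationship endpoints stay untyped until a later step assigns a type.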
9. Knowledge Enrichment Setting
[Figure: knowledge enrichment setting. Instances are split into High Frequency Entities (HF Entity1 to HF Entity5) and Low Frequency Entities (LF Entity1 to LF Entity4) and are linked to Types (Type1, Type11, Type111, Type2) through <<instanceof>> edges; instances carry attributes (Property1, Property2). Legend: Seed Entity, Seed Type, Type of interest, Expert inputs, Enrichment problems.]
Enrichment problems:
- Relations HF - LF entities
- Relations LF - LF entities
- Typing of LF entities
- Extraction of new LF entities
- Finding attribute values
14. Analysis Pipeline
(0) Preprocessing
(1) Entity Extraction
(2) Relationship Extraction
(3) Relationship Aggregation
(4) Relationship Typing
(1) Evolution of work presented in:
M. Brambilla, S. Ceri, E. Della Valle, R. Volonterio, and F. Acero Salazar.
“Extracting Emerging Knowledge from Social Media”, WWW 2017.
18. (2) Relationship Extraction
Baseline with Stanford OpenIE for triple extraction:
Several issues:
- Meaningless relations
- Wrong relations
- Multiple relations
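One of the issues above, meaningless relations, can be illustrated with a simple post-filter over extracted triples. This is a minimal sketch, not the authors' method: it does not call Stanford OpenIE, it assumes triples are already available as dictionaries with the keys `entity1`, `relationship` and `entity2`, and the `VAGUE_RELATIONS` list and both function names are hypothetical. Detecting wrong relations would need more than such surface checks.

```python
from typing import Dict, List

Triple = Dict[str, str]

# Hypothetical list of relation phrases treated as uninformative.
VAGUE_RELATIONS = {"be", "is", "have", "has", "do"}


def is_meaningful(triple: Triple) -> bool:
    """Return True if the triple passes basic sanity checks."""
    e1 = triple["entity1"].strip()
    e2 = triple["entity2"].strip()
    rel = triple["relationship"].strip().lower()
    # Reject triples with an empty component.
    if not e1 or not e2 or not rel:
        return False
    # Reject self-relations such as ("Season 2", "air on", "season 2").
    if e1.lower() == e2.lower():
        return False
    # Reject relation phrases that carry no specific meaning.
    return rel not in VAGUE_RELATIONS


def filter_triples(triples: List[Triple]) -> List[Triple]:
    """Keep only the triples that pass is_meaningful."""
    return [t for t in triples if is_meaningful(t)]
```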
19. (3) Relationship Aggregation
Sails fans. Season 2 airs on May 24th on History on D Stv Jag Comms
Too many answers for the same question!
Empirical rules
{"entity1":"Season 2",
"relationship":"air on",
"entity2":"May 24th"}
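The slide does not spell out the empirical rules, so the following is only one assumed rule for collapsing several candidate relationships into a single answer: a majority vote per entity pair. The function name `aggregate` is hypothetical; the triple keys follow the JSON shown above.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Triple = Dict[str, str]


def aggregate(triples: List[Triple]) -> List[Triple]:
    """Keep one relationship per (entity1, entity2) pair by majority vote.

    Majority voting is an assumed rule chosen for illustration only.
    """
    votes: Dict[Tuple[str, str], Counter] = defaultdict(Counter)
    for t in triples:
        votes[(t["entity1"], t["entity2"])][t["relationship"]] += 1

    result: List[Triple] = []
    for (e1, e2), counter in votes.items():
        # On a tie, the relationship seen first wins.
        relationship, _count = counter.most_common(1)[0]
        result.append({"entity1": e1, "relationship": relationship, "entity2": e2})
    return result
```

With two candidates saying "air on" and one saying "be on" for the pair ("Season 2", "May 24th"), the function keeps the single triple with "air on".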
41. Conclusions
Extraction of relevant emerging relationships is feasible
even for extremely unstructured and informal content (social media)
Still a long way to perfect extraction:
• N-ary relations
• Time-dependency
• Poor typing of entities in ontologies
42. THANKS!
QUESTIONS?
Andrea Caielli, Marco Brambilla, Stefano Ceri, Florian Daniel
Harvesting Knowledge from Social Networks: Extracting Typed Relationships among Entities
Marco Brambilla @marcobrambi marco.brambilla@polimi.it
http://datascience.deib.polimi.it http://home.deib.polimi.it/marcobrambi