Improving Automatic Semantic Tag Recommendation through Fuzzy Ontologies

Improving Automatic Semantic Tag Recommendation
through Fuzzy Ontologies

Panos Alexopoulos, Manolis Wallace

7th International Workshop Semantic and Social Media Adaptation and
Personalization

Luxembourg, December 3-4, 2012

Agenda

 Introduction
 Problem Definition and Paper
Focus
 Approach Overview and Rationale

 Proposed Framework
 Tagging Evidence Model
 Tagging Process

 Framework Evaluation
 Evaluation Process
 Evaluation Results

 Conclusions and Future Work

2

Introduction
Problem Definition
Semantic Tagging

● Semantic tagging involves identifying and assigning to texts appropriate entities that
reflect what the document actually talks about.

● One important challenge in this task is the correct distinction between the entities that
play a central role to the document’s meaning and those that are just complementary
to it.

● For example, consider the following text:

● “Annie Hall is a much better movie than Deconstructing Harry, mainly
because Alvy Singer is such a well formed character and Diane Keaton gives
the performance of her life”.

● The text mentions two films, yet the one it actually talks about is only “Annie
Hall”, meaning that only this is an appropriate tag.

3

Introduction
Problem Definition
Semantic Tagging

● A second challenge is the inference of appropriate tags even when these are not
explicitly mentioned within the text.

● For example:

● “In June 1863, Colonel James Montgomery commanded a brigade in operations
along the coast resembling his earlier Jayhawk raids. The most famous of his
controversial operations was the Raid at Combahee Ferry in which 800 slaves
were liberated with the help of Harriett Tubman”.

● The text describes a historical battle which took place in Beaufort County,
South Carolina.

● This means that this location is an important geographical tag for the text, yet it
is not explicitly mentioned within it.

4

Introduction
Paper Focus

● In a previous work we have already proposed a framework for semantic tagging
through the exploitation of domain ontologies.

● The ontologies describe the domain(s) of the texts to be tagged and their entities
serve as a source of possible tags for them.

● The key idea is that a given ontological entity is more likely to represent the text’s
meaning (and thus be an accurate tag) when there are many ontologically related
to it entities in the text.

● In this paper we revise and extend the above framework so as to enable it to exploit
also fuzzy ontological information.

● Our assumption is that the fuzziness that may characterize some of the ontology’s
relations can increase the evidential power of its entities and consequently the
effectiveness of the tag recommendation process

5

Proposed Framework
Approach Overview and Rationale

● Our existing framework targets the task of semantic tagging based on the intuition
that a given ontological entity is more likely to represent the meaning the text when
there are many ontologically related to it entities in the text.

● E.g. in the example text the entities “Alvy Singer” and “Diane Keaton” indicate
that the text is about the film “Annie Hall”.

● That is because Alvy Singer is a character of this film and Diane Keaton an actor of
it.

● All these entities and the relations between them are derived from one or more
domain ontologies.

6

Proposed Framework

● The extension we propose to this framework has to do with considering, where
possible, fuzzy relations between entities rather than crisp ones.

● Fuzzy relations allow the assignment of truth degrees to vague ontological relations
in an effort to quantify their vagueness.

● E.g. Instead of ‘‘Annie Hall is a comedy” we may say that “Annie Hall is a
comedy to a degree of 0.7”

● Similarly, we may say that “Woody Allen is an expert director at human
relations to a degree of 0.8”.

● Thus, by using a fuzzy ontology one can represent useful semantic information for
the tag recommendation task in a higher level of granularity than with a crisp
ontology.

7

Proposed Framework

● For example, in the film domain, instead of having just the relation
hasPlayedInFilm it is more useful to have the fuzzy relation
wasAnImportantActorInFilm.

● E.g. “Robert Duvall was an important actor in Apocalypse Now to a degree of
0.6”.

● To see why this is useful consider the text “Robert Duvall’s brilliant performance in
the film showed that his choice by Francis Ford Copola was wise”.

● If Duvall and Copola have collaborated in more than one film but in only one of
them Duval had a major role (as captured by the fuzzy degree of his relation to the
film) then this film is more likely to be the subject of this text.

8

Proposed Framework
Framework Components

● Our proposed framework assumes the availability of a fuzzy ontology for the domain
of the texts to be tagged and defines two components:

● A Tag Fuzzy Ontological Evidence Model that contains entities that may
serve as tag-related evidence for the application scenario and domain at hand.

● Each entity is assigned evidential power degrees which denote its
usefulness as evidence for the tag recommendation task.

● A Tag Recommendation Process that uses the evidence model to determine,
for a given text, the ontological entities that potentially represent its content.

● A confidence score for each entity is used to denote the most probable
tags.

9

Proposed Framework
Tagging Evidence Model

● Defines for each ontology entity which other instances and to what extent should be
used as evidence towards the correct determination of the texts’ tags.

● It consists of entity pairs where a particular entity provides quantified evidence for a
another one.

10

Proposed Framework
Evidence Model Construction

● Construction of the evidence model depends on the characteristics of the domain and
the texts.

● The first step of the construction is manual and involves:

● The identification of the concepts whose instances are expected to be used as
tags (e.g. military conflicts, films etc.)

● The determination, for each of these concepts, of the related to them concepts
whose instances may serve as tag evidence:

● For example, in texts that review films, some concepts whose instances
may act as tag evidence are related directors, actors and characters.

● The identification, for each pair of evidence and target concept, of the fuzzy
relation paths that links them.

11

Proposed Framework

● The result of this first step is a tag evidence concept mapping like the following:

● This mapping is typically small so its manual construction is not difficult.

12

Proposed Framework

● Based on such mappings, the second step of the construction is automatic and
involves the generation of the tag-evidence entity pairs along with a tag evidential
strength.

● This strength is:
● Proportional to the fuzzy degree of the relation linking the evidence entity with
the tag.
● Inversely proportional to the evidential entity’s own ambiguity as well as to the
number and fuzzy degrees of the other tags it provides evidence for.

● For example, “Woody Allen” provides evidence for the film “Annie Hall” to a strength
of 0.02 because he has directed many other films while the character “Alvy Singer”
has evidential strength of 1 as it appears only on this film.

13

Proposed Framework
Tag Recommendation Process

● Step 1: We extract from the text the terms that possibly refer to tag entities.

● Step 2: We extract from the text the terms that possibly refer to evidential entities

● Step 3: We consider as candidate tag entities not only those found within the text but
practically all those that are related to instances of the evidential concepts in the
ontology.

● E.g. If we find the term “Woody Allen” then all his films are candidate tag
entities.

● Step 4: Using the evidence model of the previous slide we compute for each
candidate tag entity the confidence that it actually represents the text’s meaning.

● Note: The evidence model is assumed to have been calculated offline and stored in
an index, so as to make the above process more efficient.

14

Framework Evaluation
Evaluation Process
Description

● Two tagging scenarios:
● Film reviews.
● Texts describing military conflicts.

● Fuzzy ontologies for both domains, based on the manual fuzzification of a small
portion of DBPedia and Freebase semantic data.

● Effectiveness was measured by determining the number of correctly tagged texts,
namely texts whose highest ranked tags were the correct ones

15

Evaluation Process
Film Reviews Scenario Miltary Conflicts Scenario

● 100 texts describing 20 distinct films ● 100 texts describing 100 miltary
that were similar to each other in terms conflicts that were similar to each other
of genre, actors and directors and thus in terms of participants and places
more difficult to distinguish between
them in a given review.

● Fuzzy Film Ontology: ● Fuzzy Conflict Ontology:

● Concepts: Film, Actor, Director, ● Concepts: Location, Military
Character Conflict, Military Person

● Relations: ● Relations:
wasAnImportantActorInFilm, tookPlaceNearLocation,
isFamousForDirectingFilm, wasAnImportantPartOfConflict,
wasCharacterInFilm. playedMajorRoleInConflict,
isNearToLocation

16

Evaluation Results

● We measured tagging effectiveness by determining the number of correctly tagged
texts, namely texts whose highest ranked films were the correct ones.

● For comparison purposes, we performed the same process using a crisp version of
the ontologies.

● Results:

17

Conclusions and Future Work
Key Points

● We proposed a novel framework that exploits fuzzy semantic information for
automatically generating semantic tags for text documents

● This had two challenges:
● Distinguishing correctly between the entities that play a central role to the
document’s meaning and those that are just complementary to it.
● Inferring appropriate tags even when these are not explicitly mentioned within
the text.

● Our approach has been based on the customized utilization of fuzzy domain-
specific ontological relations for extracting and evaluating tag “evidence” from within
the text

● The added value that the consideration and exploitation of fuzziness brought to the
tag recommendation task was experimentally verified through experiments in
different domains

18

Conclusions and Future Work
Framework Extensions

● One important obstacle for the wider applicability of our approach is the bottleneck of
acquiring (through development or reuse) the required fuzzy ontological information
for the domain at hand.

● For that reason, our future work will focus on determining automated methods for
fuzzifying crisp ontological facts:

● Data mining
● Social network analysis
● Crowdsourcing

19

Thank you!
Contact iSOCO

Dr. Panos Alexopoulos
Senior Researcher
palexopoulos@isoco.com

Questions?

Barcelona Madrid Pamplona Valencia
Tel +34 935 677 200 Tel +34 913 349 797 Tel +34 948 102 408 Tel +34 963 467 143
Edificio Testa A Av. del Partenón, 16-18, 1º7ª Parque Tomás Oficina 107
C/ Alcalde Barnils, 64-68 Campo de las Naciones Caballero, 2, 6º-4ª C/ Prof. Beltrán Báguena, 4
St. Cugat del Vallès 28042 Madrid 31006 Pamplona 46009 Valencia
08174 Barcelona

20

Improving Automatic Semantic Tag Recommendation through Fuzzy Ontologies

Recomendados

Recomendados

Más contenido relacionado

Similar a Improving Automatic Semantic Tag Recommendation through Fuzzy Ontologies

Similar a Improving Automatic Semantic Tag Recommendation through Fuzzy Ontologies (17)

Más de Panos Alexopoulos

Más de Panos Alexopoulos (9)

Último

Último (20)

Improving Automatic Semantic Tag Recommendation through Fuzzy Ontologies