Enhancing the Navigability of Social Tagging Systems with Tag Taxonomies
1. Enhancing the Navigability of Social Tagging Systems
with Tag Taxonomies
Christoph Trattner & Christian Körner & Denis Helic
KMI, TU Graz
September 8, 2011
Christoph Trattner & Christian Körner & Denis Helic (KMI, TU Graz), Enhancing the Navigability of Social Tagging Systems with Tag Taxonomies, September 8, 2011, 1 / 26
2. Introduction
“Tagging gained tremendously in popularity over the past few years”
3. Introduction
Figure: Tags on Flickr
4. Introduction
Figure: Tags on Amazon
5. Introduction
Figure: Tags on LastFM
6. Introduction
What we also like about tags, apart from the fact that they represent
a cheap and light-weight alternative to common keyword-based
semantic enrichment, is that they allow us to build tools to
explore or navigate an information system in a light-weight and
concept-driven manner.
A popular example of such a tool is the tag taxonomy!
7. Introduction
Q: What is a tag taxonomy?
A: A tool that allows us to navigate information items in an
information system in a concept driven and hierarchical manner.
Figure: Tag Taxonomy
8. Introduction
Popular examples of tag taxonomy induction algorithms are:
The graph based approach of Heymann (Heymann et al. 2009)
Affinity Propagation (Lerman et al. 2010)
Hierarchical K-Means (Dhillon et al. 2001)
9. Why is the usefulness of tag taxonomies for navigation limited?
Recent research on tagging has also shown that tag-based navigation
has its limitations (Helic et al. 2010).
The basic problem with tagging is that people do not
apply tags to all resources of an information system in a
uniform manner.
10. Why is the usefulness of tag taxonomies for navigation limited?
It was observed (H. Halpin et al. 2007) that the tag distribution
of almost all tagging systems follows a power-law function, i.e. a small
number of tags each refer to a very large number of resources, while most
tags refer to only a few.
Figure: Tag distributions for (a) Austria-Forum, (b) BibSonomy and (c) CiteULike.
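A tag distribution of the kind plotted above can be computed by counting, for every tag, the number of resources it is applied to. The following is a minimal sketch, not the authors' code; the function name and the `tags_of` input format (a dict from resource id to a set of tags) are assumptions made for illustration.

```python
from collections import Counter


def tag_distribution(tags_of):
    """Return (tag, resource count) pairs, most frequent tag first.

    tags_of is a hypothetical input format: resource id -> set of tags.
    """
    counts = Counter(tag for tags in tags_of.values() for tag in set(tags))
    # Sort by descending count; break ties alphabetically for stable output.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


example = {"a": {"blog", "web"}, "b": {"blog"}, "c": {"blog", "design"}}
print(tag_distribution(example))
```

Plotting the counts against their rank on log-log axes is one common way to check whether the distribution looks like a power law.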
11. Why is the usefulness of tag taxonomies for navigation limited?
Hence, to navigate from one resource to another in an
information system with the help of a tag taxonomy, the user would, in the
worst case, have to click a very large number of times to reach a desired
target resource.
Figure: Result list of the tag “blog” in the bookmarking system Delicious.
12. Why is the usefulness of tag taxonomies for navigation limited?
To help the user also navigate to the resources of a tagging system
in an efficient manner, we developed the approach of so-called
tag-resource taxonomies.
[Two example trees, each rooted at "Car" with inner nodes "Tire" and "Motor" and leaves such as Mercedes, VOLVO, VW and BMW: (a) Tag Taxonomy, (b) Tag-Resource Taxonomy]
Figure: Tag Taxonomy vs. Tag-Resource Taxonomy.
The beauty of such tag-resource hierarchies is that the result lists are
limited to a certain branching factor b and the maximum number of clicks
is bounded by log_b(n), where n is the number of resources.
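The logarithmic bound can be checked with a few lines of arithmetic. This is a small sketch under two assumptions: the logarithm is taken to base b, and n = 19,430 is the resource count reported for the Austria-Forum dataset in the collision-rate table. The function name `max_clicks` is made up for illustration.

```python
import math


def max_clicks(n, b):
    """Depth of a balanced hierarchy with branching factor b over n resources."""
    return math.log(n, b)


n = 19_430  # assumed: the n reported for the Austria-Forum taxonomies
for b in (2, 5, 10):
    print(b, round(max_clicks(n, b), 1))
# prints:
# 2 14.2
# 5 6.1
# 10 4.3
```

These values appear to agree with the Austria-Forum entries for T_res in the mean-click table of this deck.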
13. Why is the usefulness of tag taxonomies for navigation limited?
Sample calculations of the maximum number of clicks in a tag taxonomy
vs. a tag-resource taxonomy for three different tagging datasets with
branching factor b = 10.

                      Austria-Forum   BibSonomy   CiteULike
max{click(T_tag)}     184             5,278       20,799
max{click(T_res)}     6.1             7.7         8.5

Table: Tag Taxonomy vs. Tag-Resource Taxonomy.
14. Why is the usefulness of tag taxonomies for navigation limited?
Sample calculations of the mean number of clicks in a tag taxonomy
vs. a tag-resource taxonomy for three different tagging datasets with
branching factors b = 2, 5 and 10.

                       b    Austria-Forum   BibSonomy   CiteULike
mean{click(T_res)}     2    14.2            17.8        19.8
mean{click(T_tag)}     2    29.5            22.4        30.7
mean{click(T_res)}     5    6.1             7.6         8.5
mean{click(T_tag)}     5    11.6            9.2         12.3
mean{click(T_res)}     10   4.3             5.3         5.9
mean{click(T_tag)}     10   6.4             5.6         7.3

Table: Tag Taxonomy vs. Tag-Resource Taxonomy.
15. Creating Tag-Resource Taxonomies
“How do we create tag-resource hierarchies?”
16. Creating Tag-Resource Taxonomies
The first step in creating a tag-resource hierarchy is to create a
resource hierarchy out of a tagging dataset.
1. Compute the degree centrality of each resource in the tagging
dataset and take the most general resource as the root.
2. Compute the cosine similarity between the root and all resources
that are related to it.
3. Re-rank these resources according to their cosine * centrality values.
4. Attach at most b resources as children to the root.
5. Set the next child as the root and go to step 2.
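The five steps above can be sketched as follows. This is an illustrative reading, not the authors' implementation. Assumptions: two resources are "related" when they share at least one tag, degree centrality is the number of related resources, the "most general" resource is the one with the highest centrality, tag sets are treated as binary vectors for the cosine, and children are processed in breadth-first order. All names (`build_resource_hierarchy`, `tags_of`) are hypothetical.

```python
from collections import deque
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two tag sets viewed as binary vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / sqrt(len(a) * len(b))


def build_resource_hierarchy(tags_of, b):
    """Return (root, {parent: [children]}) for a dict resource -> set of tags."""
    # Step 1: degree centrality, assumed here to be the number of other
    # resources sharing at least one tag; the most central resource is the root.
    neighbours = {
        r: {s for s in tags_of if s != r and tags_of[r] & tags_of[s]}
        for r in tags_of
    }
    centrality = {r: len(n) for r, n in neighbours.items()}
    root = max(sorted(tags_of), key=centrality.get)

    children = {root: []}
    queue = deque([root])
    while queue:
        parent = queue.popleft()
        # Step 2: consider related resources that are not yet in the hierarchy.
        candidates = [r for r in sorted(neighbours[parent]) if r not in children]
        # Step 3: rank by cosine similarity times centrality (highest first).
        candidates.sort(
            key=lambda r: cosine(tags_of[parent], tags_of[r]) * centrality[r],
            reverse=True,
        )
        # Step 4: attach at most b children.
        for child in candidates[:b]:
            children[parent].append(child)
            children[child] = []
            queue.append(child)  # Step 5: each child later becomes the root.
    return root, children
```

Resources that share no tag with any placed resource stay outside the hierarchy in this sketch; how the original algorithm treats them is not stated on the slide.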
17. Creating Tag-Resource Taxonomies
To generate the actual tag-resource taxonomy, we developed a hierarchical
labeling algorithm. The algorithm works as follows:
1. Traverse the resource taxonomy in left-order and calculate a
co-occurrence vector for the currently processed resource.
2. Remove all tags from the co-occurrence vector that are not in the tag set
of the currently processed resource.
3. Try to apply the most general tag of the co-occurrence vector. If the
candidate tag has already been applied to one of the parent resources
of the currently processed resource, take the next candidate tag from
the co-occurrence vector.
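The labeling steps above might look roughly like the sketch below. It is a simplified stand-in, not the authors' algorithm: instead of a full co-occurrence vector it ranks a resource's own tags by an assumed generality measure (the number of resources carrying the tag), it reads "left-order" as a left-to-right pre-order traversal, and it assumes that when every candidate is already used on the path the most general tag is applied anyway, which produces a collision. The names `label_hierarchy`, `children` and `tags_of` are hypothetical.

```python
from collections import Counter


def label_hierarchy(root, children, tags_of):
    """Assign one tag to every resource; returns {resource: tag}.

    Assumes every resource has at least one tag.
    """
    # Assumed generality measure: number of resources a tag is applied to.
    generality = Counter(t for tags in tags_of.values() for t in tags)
    labels = {}
    stack = [(root, frozenset())]  # (resource, labels used by its ancestors)
    while stack:
        res, used = stack.pop()
        # Steps 1-2: candidates are restricted to the resource's own tag set,
        # ordered from most to least general (ties broken alphabetically).
        ranked = sorted(tags_of[res], key=lambda t: (-generality[t], t))
        # Step 3: take the first candidate not yet used by an ancestor.
        free = [t for t in ranked if t not in used]
        labels[res] = free[0] if free else ranked[0]  # fallback = collision
        # Push children in reverse so the leftmost child is visited first.
        for child in reversed(children.get(res, [])):
            stack.append((child, used | {labels[res]}))
    return labels


tags_of = {"r1": {"car"}, "r2": {"car", "bmw"},
           "r3": {"car", "bmw", "x5"}, "r4": {"bmw"}}
children = {"r1": ["r2"], "r2": ["r3", "r4"]}
print(label_hierarchy("r1", children, tags_of))
```

In this toy input, resource r4 only carries the tag "bmw", which its parent already uses, so the trail car > bmw > bmw arises.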
18. Evaluating Tag-Resource Taxonomies
To evaluate our approach, we conducted three different experiments.
As the dataset for our analysis, we used a tagging dataset from a large
wiki-based information system called the Austria-Forum.
19. Evaluating Tag-Resource Taxonomies
Since our taxonomy induction algorithm is not 100% free of
collisions, we conducted a simple experiment in which we measured the
number of collisions that occur during the labeling process.
Example of a collision: car > bmw > bmw
For that purpose we generated three different tag-resource
taxonomies with branching factors b = 2, 5 and 10
and investigated the collision rate.

Name    b    n        CR (%)
Res2    2    19,430   0.1
Res5    5    19,430   0.2
Res10   10   19,430   0.2

Table: Collision Rates (CR) for different resource taxonomies with different
branching factor b.
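One way a collision rate could be measured is sketched below. The slides do not define the rate precisely, so this assumes it is the share of resources whose label repeats a label already used by one of their ancestors; the function name and the `children`/`labels` inputs are hypothetical.

```python
def collision_rate(root, children, labels):
    """Share of resources whose label already occurs on the path from the root."""
    collisions, total = 0, 0
    stack = [(root, frozenset())]  # (resource, labels seen on its ancestors)
    while stack:
        res, seen = stack.pop()
        total += 1
        if labels[res] in seen:
            collisions += 1
        for child in children.get(res, []):
            stack.append((child, seen | {labels[res]}))
    return collisions / total


children = {"r1": ["r2"], "r2": ["r3", "r4"]}
labels = {"r1": "car", "r2": "bmw", "r3": "x5", "r4": "bmw"}
print(collision_rate("r1", children, labels))  # prints 0.25
```

Here one of four resources (r4, on the trail car > bmw > bmw) collides, giving a rate of 0.25.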
20. Evaluating Tag-Resource Taxonomies
In the second experiment we measured the semantic structure of the
tag-resource taxonomy compared to popular tag taxonomy induction
algorithms such as Heymann, K-Means, Affinity Propagation and
Co-Occurrence.
As measures for this experiment we used Taxonomic Recall/Precision and
Taxonomic Overlap.
As ground truth we used the GermaNet ontology.
For the experiment we again generated three different tag-resource
taxonomies with different branching factors b.
21. Evaluating Tag-Resource Taxonomies
[Bar chart: Taxonomic F-Measure and Taxonomic Overlap (y-axis: Count, 1 = 100%, ranging from 0 to 0.4) for Res2, Res5, Res10, Deg/Cooc, Aff. Prop, K-Means and Heymann]
Figure: Results of the semantic evaluation of the three generated tag-resource
taxonomies Res2, Res5 and Res10.
22. Evaluating Tag-Resource Taxonomies
In the third and last experiment a user study was conducted to
evaluate whether our approach is also useful for humans and could be
used in a practical setting.
As a gold standard to compare our approach against, we used the best
tag taxonomy induction algorithm known so far (Deg/Cooc).
To measure the performance of our approach, we invited 9 test users
to judge 200 tag trails extracted from both hierarchies.
23. Evaluating Tag-Resource Taxonomies
To ensure that the users would not know which hierarchy a trail came
from, we shuffled the trails uniformly at random.
To evaluate a trail, we asked our test users to start from
the leftmost concept and to move on to the rightmost concept in
the trail.
The evaluation schema given to the users was the following:

Classification   Description
Correct          Correct hierarchy relation
Related          Correct relation, but not hierarchical or reverse hierarchical
Equivalent       Synonym
Not Related      The relations do not have anything to do with each other
Unknown          The evaluator does not recognize the meaning of the tag(s)

Table: Classification Labels for the User Evaluation.
24. Evaluating Tag-Resource Taxonomies
The user study showed that our approach performs comparably to a
Deg/Cooc tag taxonomy: fewer relations were judged Correct (27.3% vs. 33.2%),
but more were judged Related (36.2% vs. 27.3%).

Name         b    Correct (%)   Related (%)   Equivalent (%)   Not Related (%)   Unknown (%)
Deg/Cooc10   10   33.2          27.3          13               21.9              5.1
Res10        10   27.3          36.2          12.3             19.8              4.2

Table: Results of the empirical analysis of the tag-resource taxonomy with
branching factor b = 10 compared to a Deg/Cooc tag taxonomy with branching
factor b = 10.
25. Summary
We showed that tag taxonomies are in general not well suited for
finding resources in a small number of clicks.
To tackle that issue we introduced the novel approach of so-called
tag-resource hierarchies.
We illustrated in theory that with a tag-resource
taxonomy it is possible to navigate to resources efficiently.
In addition to these findings, we introduced an algorithm to generate
such hierarchies and presented a number of experiments which
showed that tag-resource taxonomies perform on a semantic level
nearly as well as, or even better than, popular tag taxonomy approaches.
26. End of presentation
Thank you very much for your attention!
Christoph Trattner (ctrattner@iicm.edu)