KOM – Multimedia Communications Lab
Image sources: www.icon-finder.com, www.flickr.com, www.delicious.com, www.crokodil.de

Folksonomies

Definition of a folksonomy, adapted from [HJSS06]:
Users 𝑈
Resources 𝑅
Tags 𝑇
Tag assignment relation 𝑇𝐴𝑆 ⊆ 𝑈 × 𝑅 × 𝑇

[Figure: example folksonomy; one tag assignment is highlighted, e.g. the user "Bob" tagging a resource with "sugar loaf"]
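The definition above can be sketched as a data structure. This is a minimal illustration, not code from the thesis; the URL and the second user are hypothetical examples.

```python
# A folksonomy as a set of (user, resource, tag) triples, i.e. the
# tag assignment relation TAS ⊆ U × R × T from [HJSS06].
# The entities below are illustrative only.
TAS = {
    ("Bob", "http://example.org/recipe", "sugar loaf"),   # one tag assignment
    ("Bob", "http://example.org/recipe", "baking"),
    ("Alice", "http://example.org/recipe", "sugar loaf"),
}

# The sets U, R, T are the projections of TAS onto its components.
U = {u for u, _, _ in TAS}
R = {r for _, r, _ in TAS}
T = {t for _, _, t in TAS}

def resources(u, t):
    """R(u, t): the resources that user u has tagged with tag t."""
    return {r for (u2, r, t2) in TAS if u2 == u and t2 == t}
```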
Ranking Resources in Folksonomies

Task of ranking resources: "Rank resources such that they are in descending order of relevance towards an information need."

[Figure, adapted from [Bog09]: two ranking scenarios. Interests match: a user is given as query-entity ("more like this"). Guided search: a tag is given as query-entity ("find me a resource").]
How to Actually Rank in Folksonomies?

FolkRank [HJSS06] is the state-of-the-art graph-based approach, based on PageRank's random surfer [PBMW99]: "How probable is it that I go to B, being at A?" This estimates authority. A restart (biased jump) with probability α = 1/3 describes the context and estimates relevance. The resulting ranking is 𝒗.

[Figure: tripartite folksonomy graph over the entities fcbarcelona.com, messi, barca, and barcaFan, with edge weights |𝑅(𝑢, 𝑡)| and transition probabilities such as 1/5, 1/4, 3/5, 1/3, 1/2; the resulting ranking assigns e.g. 45%, 29%, 16%, 10%]
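The biased random walk on the slide can be sketched as follows. This is a simplified illustration of a preference-biased PageRank, not the full FolkRank algorithm (which additionally combines runs with and without preference); the graph and entity names are hypothetical.

```python
# Power iteration for a random surfer with a biased restart ("jump")
# on a folksonomy graph. `edges` maps directed (a, b) pairs to weights;
# for the undirected folksonomy graph, include both directions.
def biased_pagerank(edges, preference, alpha=1/3, iters=50):
    nodes = {n for e in edges for n in e}
    # total outgoing weight per node, for row normalisation
    out = {n: sum(w for (a, b), w in edges.items() if a == n) for n in nodes}
    rank = {n: 1 / len(nodes) for n in nodes}
    for _ in range(iters):
        # with probability alpha, jump to the preference (context) vector
        new = {n: alpha * preference.get(n, 0.0) for n in nodes}
        # with probability 1 - alpha, follow an edge proportional to its weight
        for (a, b), w in edges.items():
            if out[a]:
                new[b] += (1 - alpha) * rank[a] * w / out[a]
        rank = new
    return rank
```

A user given as query-entity would receive all preference mass, so entities close to that user rank highly.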
Challenges of FolkRank

Assumptions about the folksonomy structure are violated:
Concept drift
Ambiguity (e.g. the tags "football" and "soccer")
Multi-facetedness of entities (e.g. IJCAI-Proceedings.pdf concerns both the topic Artificial Intelligence and the location Barcelona)
Including quality attributes of a resource: authority signals (e.g. PageRank on the Web) and hub signals

[Figure: tag assignments around IJCAI-Proceedings.pdf illustrating the ambiguous tag "AI" (topic) and the multi-faceted tag "Barcelona" (location), with unknown edge weights marked "?"]
Proposed Approaches

IncentiveScore: addresses concept drift
InteliScore: addresses concept drift
VSScore: addresses the extensive description of resources and the query-entity
HITSonomy: addresses the inclusion of quality attributes of resources
HITSonomy

Inspired by HITS [Kle99]. FolkRank 'thinks' unidirectionally: "How probable is it that I go to B, being at A?" HITSonomy 'thinks' bidirectionally and additionally asks: "How probable is it that I came from B, being at A?" The restart again describes the context. One score estimates relevance & authority, the other relevance & hub quality; the combined scores yield the ranking 𝒗.

[Figure: an edge between A and B with weights 1 and 2; forward transition probabilities 1/3 and 2/3 versus backward probabilities 2/4 and 2/4]
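The HITS inspiration behind HITSonomy can be sketched as the classic mutually reinforcing authority/hub iteration, with the two scores combined by a weighted arithmetic mean as mentioned on the evaluation slides. This is a sketch of plain HITS, not the thesis' exact algorithm (it omits edge weights and the context restart).

```python
# Classic HITS iteration [Kle99]: authority sums the hub scores of
# in-neighbours, hub sums the authority scores of out-neighbours.
def hits(edges, iters=50):
    nodes = {n for e in edges for n in e}
    auth = {n: 1.0 for n in nodes}
    hub = {n: 1.0 for n in nodes}
    for _ in range(iters):
        auth = {n: sum(hub[a] for (a, b) in edges if b == n) for n in nodes}
        norm = sum(auth.values()) or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        hub = {n: sum(auth[b] for (a, b) in edges if a == n) for n in nodes}
        norm = sum(hub.values()) or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    return auth, hub

def combined(auth, hub, beta=0.5):
    """Weighted arithmetic mean of authority and hub score."""
    return {n: beta * auth[n] + (1 - beta) * hub[n] for n in auth}
```

The weight `beta` corresponds to the mean-weighting parameter evaluated on the HITSonomy parameter slide.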
VSScore

Idea: port the ranking task to the vector space model [MRS08] used in text retrieval, where a term (usually) represents a semantic concept.
Problem: no content information about the resources is available (in this work).
Solution: entities in a folksonomy can themselves be viewed as semantic concepts. Represent a resource's content by its context, and represent any entity (e.g. the query-entity) by its context as well; rank by a similarity measure δ.

[Figure: example context vectors over concepts such as "Barcelona" and "Cowboys" for the entities barca, barcaFan, and dallascowboys.com, e.g. (1, …, 0), (0.8, …, 0.3), (0.2, …); similar contexts ("Barcelona… Messi… FCB…" vs. "Dallas… Cowboys… Football…") yield similar vectors]
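The context-vector idea can be sketched as follows. This is an illustrative toy (context counted over tags only, similarity δ taken to be cosine similarity); the entity names follow the slide's example, and the user "texan" is hypothetical.

```python
import math

# Represent an entity by its context: here, counts of the tags it
# co-occurs with in tag assignments (user, resource, tag).
def context_vector(entity, tas):
    vec = {}
    for u, r, t in tas:
        if entity in (u, r):
            vec[t] = vec.get(t, 0) + 1
    return vec

# One possible choice for the similarity measure δ: cosine similarity.
def cosine(v, w):
    dot = sum(v.get(k, 0) * w.get(k, 0) for k in set(v) | set(w))
    nv = math.sqrt(sum(x * x for x in v.values()))
    nw = math.sqrt(sum(x * x for x in w.values()))
    return dot / (nv * nw) if nv and nw else 0.0
```

Resources are then ranked by δ between their context vector and the query-entity's context vector.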
Evaluation Setup

Corpus: BibSonomy.

Methodologies: LeavePostOut [JMH+07] and LeaveRTOut.
Assumption: "A tag assignment indicates relevance of a resource towards the information need represented by a user or a tag."
Post: all tag assignments between a user and a resource.
RT: all tag assignments between a tag and a resource.
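The LeavePostOut split can be sketched as below: remove all tag assignments of one (user, resource) post, then check whether ranking with that user as query-entity retrieves the removed resource. A minimal sketch with hypothetical entity names, not the evaluation framework itself.

```python
# Hold out one "post": every tag assignment between the given user
# and resource. The ranking algorithm is trained on the remainder and
# judged on whether it recovers the held-out resource for that user.
def leave_post_out(tas, user, resource):
    held_out = {(u, r, t) for (u, r, t) in tas if u == user and r == resource}
    training = tas - held_out
    return training, held_out
```

LeaveRTOut works analogously with all assignments between a tag and a resource.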
Evaluation Parameters: FolkRank

LeavePostOut; given a user as query-entity, find resources.
Parameter: restart probability.
Evaluation Parameters: HITSonomy

LeavePostOut; given a user as query-entity, find resources.
Parameters: restart probability; weighting of the arithmetic mean of authority and hub score.
Evaluation Results

LeavePostOut, 1 out; given a user as query-entity, find resources.
HITSonomy and VSScore are significantly more effective than FolkRank (Wilcoxon signed rank test on average precision).
Evaluation Results

LeavePostOut, 33% out; given a user as query-entity, find resources.
HITSonomy and VSScore are significantly more effective than FolkRank (Wilcoxon signed rank test on average precision).
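The per-query measure behind these comparisons can be sketched as follows: average precision over one ranked list, given the set of relevant (held-out) resources. A minimal illustration, not the evaluation code; the per-query values would then be compared pairwise, e.g. with `scipy.stats.wilcoxon`.

```python
# Average precision: mean of precision@k over the ranks k at which a
# relevant item appears, normalised by the number of relevant items.
def average_precision(ranked, relevant):
    hits, total = 0, 0.0
    for i, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / i
    return total / len(relevant) if relevant else 0.0
```

Since such per-query values are typically not normally distributed, a paired non-parametric test (Wilcoxon signed rank) is appropriate rather than a t-test.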
Conclusion

HITSonomy and VSScore can beat the state-of-the-art in different resource ranking tasks, depending on LeavePostOut/LeaveRTOut, i.e. on the conditions of the query-entity. The other proposed algorithms do not perform as well.

Most pairwise statistical significance comparisons won:

Methodology     | Interests match                                   | Guided search
LeavePostOut    | HITSonomy                                         | HITSonomy
LeaveNPostsOut  | HITSonomy                                         | HITSonomy
LeaveRTOut      | FolkRank, HITSonomy, IncentiveScore, InteliScore  | VSScore
LeaveNRTsOut    | FolkRank, HITSonomy, IncentiveScore               | HITSonomy, VSScore
Contributions

Taxonomy for graph-based scoring/ranking algorithms.
Presented, implemented, and evaluated algorithms for ranking in folksonomies.
AInheritScore and Ascore for ranking in folksonomies extended by activities; analysis for the CROKODIL application scenario.
Disambiguation algorithms (Tag Assignment Context, Post Context), not evaluated.
Various other ideas for ranking described.
Tag type labeling of the evaluation corpus.
Graph-based ranking framework.
Future Work

Parameterization of the proposed algorithms.
Variants of the ranking task, e.g. VSScore using the HITSonomy result as context description.
Evaluation: creation of further corpora.
Efficient computation.
Explainability.
Preprocessing of the folksonomy corpus.
…
Bibliography
[Bog09] T. Bogers. Recommender Systems for Social Bookmarking. PhD Thesis, Tilburg University,
2009.
[BSB+08] D. Böhnstedt, P. Scholl, B. Benz, C. Rensing, R. Steinmetz, and B. Schmitz. Einsatz
persönlicher Wissensnetze im Ressourcen-basierten Lernen. In Proceedings of the 6th
e-Learning Fachtagung Informatik, pages 113–124, 2008.
[HJSS06] A. Hotho, R. Jäschke, C. Schmitz, and G. Stumme. Information Retrieval in Folksonomies:
Search and Ranking. In Proceedings of the 3rd European Semantic Web Conference on the
Semantic Web: Research and Applications, pages 411–426, 2006.
[JMH+07] R. Jäschke, L. Marinho, A. Hotho, L. Schmidt-Thieme, and G. Stumme. Tag Recommendations
in Folksonomies. In Proceedings of PKDD 2007, pages 506–514, 2007.
[Kle99] J. Kleinberg. Authoritative Sources in a Hyperlinked Environment. Journal of the ACM,
46:604–632, 1999.
[MRS08] C. Manning, P. Raghavan, and H. Schütze. Introduction to Information Retrieval. Cambridge
University Press, 2008.
[PBMW99] L. Page, S. Brin, R. Motwani, and T. Winograd. The PageRank Citation Ranking: Bringing
Order to the Web. Technical Report 1999-66, Stanford InfoLab, 1999.
Speaker notes

Name CROKODIL as the application scenario in which this thesis was done.
Explain information need and relevance briefly.
Explain graph creation briefly; give an example.
Explain how relevance and authority are determined.
Give the principal idea of IncentiveScore and InteliScore.
Explain LeavePostOut and LeaveRTOut, and give a brief example for the different ranking tasks (interests match, guided search).
Recall the biased jump from the example.
How to combine the authority & relevance and the hub & relevance scores?
Explain the violin plot.
Average precisions are not normally distributed, hence no t-test.
About a third of a user's resources are thus removed.
E.g. VSScore with the HITS ranking as context description, or VSScore with the context described by an external corpus.
Ranking for tag recommendation, for example.
Evaluation in the CROKODIL scenario to determine the true utility for activities (learning task).
A CROKODIL corpus would be great in order to have a true assessment of tag types, as manual labeling is cumbersome.
Efficient computation is usually important when creating a ranking: VSScore is slow, or its results have to be stored.
Scrutability can be desirable.