Ranking the Linked Data: the case of DBpedia - ICWE 2010
1. 10th International Conference on Web Engineering, Vienna
July 5-9, 2010
Roberto Mirizzi (1), Azzurra Ragone (1,2),
Tommaso Di Noia (1), Eugenio Di Sciascio (1)
(1) Politecnico di Bari
Via Orabona, 4
70125 Bari (ITALY)
(2) University of Trento
Via Sommarive, 14
38100 Trento (ITALY)
Outline
• Tags are all around
• NOT (Not Only Tag): what is it?
• NOT: a look behind the curtains
– Ranking of RDF resources: a hybrid approach
• Evaluation
• Conclusion and Future Work
Tagging: two faces
• Annotation phase
• Retrieval phase
Problems with annotation
• Users must insert as many tags as possible (time-consuming):
– different versions of the same tag, to catch all the
possible searches
– multilingual tags
Problem with retrieval
• Exact (syntactic) match among tags: "web
service" is different from "web services",
"webservices", …
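The retrieval problem can be illustrated with a minimal Python sketch. The function name `exact_match` and the sample tags are hypothetical and only mirror the example on this slide; this is not code from NOT.

```python
def exact_match(query_tag: str, resource_tags: list[str]) -> bool:
    """Return True only if the query string is literally one of the tags."""
    return query_tag in resource_tags


# A resource annotated with two spellings of the same concept (made-up data).
tags = ["web services", "webservices"]

# The singular form is a different string, so a purely syntactic lookup fails.
singular_found = exact_match("web service", tags)
plural_found = exact_match("web services", tags)
```

With purely syntactic matching, the annotator has to anticipate every spelling a searcher might type, which is the annotation problem described earlier.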
Why not use semantic tags?
• Plugged into the Web 3.0
• Disambiguation
• Relations among tags
• Machine understandable
NOT: Not Only Tag
http://sisinflab.poliba.it/not-only-tag/
Smarter tagging
• Annotation phase
• Retrieval phase
What is behind NOT?
• DBpedia graph exploration
• Computation of a similarity value between each
pair of RDF resources using external
information sources (search engines,
bookmarking systems)
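The graph exploration step can be sketched as a depth-limited breadth-first visit. This is only an illustration under stated assumptions: the toy adjacency dictionary stands in for a DBpedia fragment, and the names `TOY_GRAPH`, `explore` and `max_depth` are hypothetical, not taken from the system.

```python
from collections import deque

# Toy stand-in for a fragment of the DBpedia graph (hypothetical edges).
TOY_GRAPH = {
    "Web_service": ["SOAP", "REST"],
    "SOAP": ["XML"],
    "REST": ["HTTP"],
    "XML": [],
    "HTTP": ["Internet"],
    "Internet": [],
}


def explore(graph, root, max_depth):
    """Return the resources reachable from root within max_depth hops, in BFS order."""
    seen = {root}
    queue = deque([(root, 0)])
    order = []
    while queue:
        node, depth = queue.popleft()
        order.append(node)
        if depth == max_depth:
            continue  # do not expand nodes that sit on the depth limit
        for neighbour in graph.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, depth + 1))
    return order


reachable = explore(TOY_GRAPH, "Web_service", max_depth=2)
```

Every pair of resources met during the visit would then be a candidate for the similarity computation based on the external information sources.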
Functional Architecture
[Figure: functional architecture of the system]
• Labeled components: Back-end, Query engine, Storage, Cloud Generator, GUI, DBpedia Lookup Service, Graph Explorer, SPARQL, Context Analyzer, Ranker
• External information sources: Delicious, Yahoo!, Bing, Google
Offline computation
1. Linked Data graph exploration
2. Rank nodes exploiting external information
3. Store results as pairs of nodes together with their similarity
Runtime search
1. Start typing a tag
2. Query the system for relevant tags (corresponding to DBpedia resources)
3. Show the semantic tag cloud
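The hand-off between offline computation and runtime search can be sketched as follows. The stored pairs, the similarity values and the helper names `complete` and `related` are all hypothetical; the slides do not describe the actual storage schema or query engine.

```python
# Offline result: similarity stored per pair of nodes (values are made up).
STORED = {
    ("Web_service", "SOAP"): 0.9,
    ("Web_service", "REST"): 0.8,
    ("SOAP", "XML"): 0.7,
}


def complete(prefix, labels):
    """Runtime step 1: resources whose label starts with the typed prefix."""
    p = prefix.lower()
    return sorted(label for label in labels if label.lower().startswith(p))


def related(resource, stored, top_k=10):
    """Runtime step 2: resources paired with the given one, most similar first."""
    scored = []
    for (a, b), similarity in stored.items():
        if a == resource:
            scored.append((b, similarity))
        elif b == resource:
            scored.append((a, similarity))
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


labels = {resource for pair in STORED for resource in pair}
suggestions = complete("we", labels)
cloud = related("Web_service", STORED)
```

The ranked list in `cloud` is what a tag cloud generator could render in runtime step 3, for example by mapping similarity to font size.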
Evaluation
We evaluate five different algorithms:
1. DBpediaRanker
2. DBpediaRanker minus Wikipedia info
3. DBpediaRanker minus external information sources
4. Co-occurrence
5. Similarity Distance
\[
\mathrm{coOcc}(r_1, r_2) = \frac{f(r_1, r_2)}{f(r_1) + f(r_2) - f(r_1, r_2)}
\]
\[
\mathrm{ngd}(r_1, r_2) = \frac{\max\{\log f(r_1), \log f(r_2)\} - \log f(r_1, r_2)}{\log N - \min\{\log f(r_1), \log f(r_2)\}}
\]
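The two baseline measures, coOcc and ngd, can be sketched in Python. Several things here are assumptions, since the slide does not define its symbols: f(r) is read as the number of results an information source returns for resource r, f(r1, r2) as the number of results for both together, and N as the total number of indexed items. The coOcc denominator is read as f(r1) + f(r2) - f(r1, r2), and ngd follows the usual Normalized Google Distance form. The zero-count handling is a convenience of this sketch.

```python
import math


def co_occurrence(f1, f2, f12):
    """Jaccard-style co-occurrence: joint count over the size of the union."""
    denominator = f1 + f2 - f12
    return f12 / denominator if denominator > 0 else 0.0


def ngd(f1, f2, f12, n):
    """Normalized distance between two terms; 0 means they always co-occur."""
    if f1 == 0 or f2 == 0 or f12 == 0:
        return math.inf  # assumption: no co-occurrence is treated as maximal distance
    log1, log2, log12 = math.log(f1), math.log(f2), math.log(f12)
    return (max(log1, log2) - log12) / (math.log(n) - min(log1, log2))


# Made-up counts, for illustration only.
similarity = co_occurrence(100, 50, 25)
distance = ngd(1000, 100, 10, 1_000_000)
```

Note that coOcc is a similarity (higher means more related) while ngd is a distance (lower means more related), so the two rank candidate pairs in opposite directions.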
Evaluation (II)
http://sisinflab.poliba.it/evaluation
50 volunteers
Researchers in the ICT area
244 votes collected (on average about 5
votes per user)
Time to vote: 1 min 40 s
Evaluation (III)
http://sisinflab.poliba.it/evaluation/data
3.91 - Good
Conclusion
• NOT *is* useful in the annotation phase:
– Suggestions of semantically related tags
– Tag enrichment
• NOT *is* useful in the retrieval phase:
– Semantic match among tags
Impakt Revolution
http://sisinflab.poliba.it/impakt-revolution/
Inspiration: Google Wonder Wheel
Exploratory Search in Google…
…nice, but there is no “semantics” in it.
You cannot discover new knowledge by exploiting the meaning of a term (keyword/tag/query).
SWOC: Semantic Wonder Cloud
http://sisinflab.poliba.it/semantic-wonder-cloud/index/
Q&A
a.ragone@poliba.it
Thanks for being here on Friday! :-)
http://sisinflab.poliba.it/not-only-tag/
http://sisinflab.poliba.it/semantic-wonder-cloud/index/
http://sisinflab.poliba.it/impakt-revolution/
Conclusion
• NOT: a tool for smarter tagging
• Ranking algorithm for RDF graphs
Future work
• Test our algorithms on different domains
• Extract more fine-grained contexts
• Enrich the extracted context also using relevant properties
• Integrate our approach with existing real-world systems
• Use the core system to automatically extract relevant tags
(concepts) from a document (or from a collection of
documents), exploiting tools for named-entity extraction