Web Science Synergies: Exploring Web Knowledge through the Semantic Web
1. Exploring Web Data & Knowledge through the Semantic Web
Dr. Stefan Dietze
L3S Research Center
Stefan Dietze
27/11/13
2. Pluto & the seven Dwarfs?
Pluto the dwarf planet?
„…solar system… #pluto“
3. “A little semantics goes a long way” (J. Hendler [1])
Semantic Web
Adding meaning through shared vocabularies and schemas (e.g. DBpedia)
W3C standards RDF & SPARQL for data & knowledge representation and querying
Persistent URIs to reference & interlink data on the Web
Figure: RDF graph linking the text „…solar system… #pluto“ to dbp:Pluto. Nodes: yago:AstronomicalObjects, dbp:CelestialBody, dbp:Pluto, dbp:SolarSystem, dbp:Pluto(mythology), dbp:DwarfPlanetPluto. Edge labels: typeOf (twice), dwarfPlanetOf, redirectOf, namedAfter.
[1] Hendler, J., The Dark Side of the Semantic Web, IEEE Intelligent Systems, Jan/Feb 2007
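To make the idea of triples and persistent identifiers concrete, the minimal sketch below stores a few statements about Pluto as (subject, predicate, object) tuples and answers a wildcard pattern, which is roughly what a single SPARQL triple pattern does. The identifiers come from the slide's graph, but the edge directions are an assumed reading of that graph, and `match` is a hypothetical helper, not part of any RDF library.

```python
# Assumed reading of the slide's graph: each statement is a
# (subject, predicate, object) triple using the prefixed names shown there.
TRIPLES = [
    ("dbp:Pluto", "typeOf", "dbp:CelestialBody"),
    ("dbp:Pluto", "typeOf", "yago:AstronomicalObjects"),
    ("dbp:Pluto", "dwarfPlanetOf", "dbp:SolarSystem"),
    ("dbp:Pluto", "namedAfter", "dbp:Pluto(mythology)"),
    ("dbp:DwarfPlanetPluto", "redirectOf", "dbp:Pluto"),
]


def match(triples, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        (ts, tp, to)
        for (ts, tp, to) in triples
        if (s is None or ts == s)
        and (p is None or tp == p)
        and (o is None or to == o)
    ]


# Roughly what the triple pattern "dbp:Pluto typeOf ?type" would ask for.
pluto_types = [o for (_, _, o) in match(TRIPLES, s="dbp:Pluto", p="typeOf")]
print(pluto_types)
```

Because every node is an identifier rather than a bare string like "pluto", the dwarf planet and the mythological figure stay distinct even though they share a name.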
4. Semantic Web / Linked Data
De-facto standard for sharing data on the Web
Vision: well connected graph of open Web data
350+ datasets and 32 billion triples in LOD Cloud alone
Other „incarnations“: Google Knowledge Graph, Facebook Open Graph, http://schema.org, BBC Programmes
„HTTP-accessibility“ (SPARQL, URI-dereferencing)
„Structure“ & „Semantics“ (=> shared/linked vocabularies)
„Interlinked“
„Persistent“
Figure: example vocabularies: FOAF, DBpedia Ontology, Geo Ontology, Gene Ontology, Dublin Core, BIBO
5. That’s awesome, but...
Hm, really?
…why are there so few datasets actually used?
Data reuse and in-links focused on trusted „reference graphs“ such as DBpedia (i.e. Wikipedia)
Long tail of LD datasets which are neither reused nor linked to (LOD Cloud alone consists of 300+ datasets)
Explanations?
„HTTP-accessibility“ (SPARQL, URI-dereferencing)
„Structure“ & „Semantics“ (=> shared/linked vocabularies)
„Interlinked“
„Persistent“
6. Open data is more diverse than we think
Accessibility of datasets?
Less than 50% of all SPARQL endpoints actually responsive at a given point in time
“THE” SPARQL protocol? No, but many variants & subsets
…
Figure: SPARQL endpoint availability over time [Buil-Aranda et al 2013]
Shared vocabularies & schemas, but:
…still very heterogeneous [d’Aquin, WebSci13]
…data partially messy and not conformant (RDFS, schemas) [HoganJWS2012]
…even widely used reference datasets such as DBpedia noisy [Paulheim2013]
Figure: Co-occurrence graph of data types in 146 datasets: 144 vocabularies, 588 highly overlapping types, 719 properties
SPARQL Web-Querying Infrastructure: Ready for Action?, Carlos Buil-Aranda, Aidan Hogan, Jürgen Umbrich, Pierre-Yves Vandenbussche, International Semantic Web Conference 2013 (ISWC2013).
Assessing the Educational Linked Data Landscape, D’Aquin, M., Adamou, A., Dietze, S., ACM Web Science 2013 (WebSci2013), Paris, France, May 2013.
An empirical survey of Linked Data conformance, Hogan, A., Umbrich, J., Harth, A., Cyganiak, R., Polleres, A., Decker, S., Journal of Web Semantics 14: pp. 14–44, 2012.
Type Inference on Noisy RDF Data, Paulheim, H., Bizer, C., Semantic Web – ISWC 2013, Lecture Notes in Computer Science Volume 8218, 2013, pp. 510-525.
7. Too many/diverse datasets, too little information
Which datasets are useful & trustworthy for case XY (e.g. „learning about the solar system“)?
Which topics (e.g. „Astronomy“) are covered by dataset X?
Which datasets describe/offer videos (slides, publications, statistics etc.)?
8. Data curation and dataset profiling
Which datasets are useful & trustworthy for case XY (e.g. „learning about the solar system“)?
Which topics (e.g. „Astronomy“) are covered by dataset X?
Which datasets describe/offer videos (slides, publications, statistics etc.)?
Catalog of data (LinkedUp Catalog): classification of datasets according to resource types, disciplines/topics, data quality, accessibility, etc.
Infrastructure for distributed/federated querying
Figure: LinkedUp Dataset Catalog; edge label: describes
9. Dataset profiling: what’s all the data about
Schema mappings
Figure: types po:Programme (BBC Programme) and yov:Video (Yovisto Video) shown with bibo:Film; vocabularies shown: AAISO, BIBO, FOAF; edge label: contains
<po:Programme …>
<po:Series>Wonders of the Solar System</.>
<po:Actor>Brian Cox</…>
</po:Programme…>
<yo:Video …>
<dc:title>Pluto & the Dwarf Planets</dc:title>
…
</yo:Video…>
Entity disambiguation
Figure: entities linked to db:Astro. Objects and db:Astronomy
Topic profile extraction
Figure: Dataset Metadata stored in the LinkedUp Dataset Catalog
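The last step of this pipeline, topic profile extraction, can be illustrated with a small sketch. It assumes entity disambiguation has already linked each resource to reference entities and that each entity has known topic categories; both lookup tables below are hypothetical illustrations built from names on the slides, and the plain frequency weighting is a stand-in, not necessarily the measure used in the cited profiling work.

```python
from collections import Counter

# Hypothetical output of entity disambiguation: resource title -> entities.
RESOURCE_ENTITIES = {
    "Pluto & the Dwarf Planets": ["dbp:Pluto"],
    "Wonders of the Solar System": ["dbp:SolarSystem"],
}

# Hypothetical lookup from reference entity to topic categories.
ENTITY_TOPICS = {
    "dbp:Pluto": ["db:Astro. Objects", "db:Astronomy"],
    "dbp:SolarSystem": ["db:Astronomy"],
}


def topic_profile(resource_entities, entity_topics):
    """Return each topic's share of all topic assignments in the dataset."""
    counts = Counter(
        topic
        for entities in resource_entities.values()
        for entity in entities
        for topic in entity_topics.get(entity, [])
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {topic: n / total for topic, n in counts.items()}


profile = topic_profile(RESOURCE_ENTITIES, ENTITY_TOPICS)
print(profile)
```

A profile of this kind is what would let a catalog answer "which topics are covered by dataset X?" without inspecting every resource.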
10. LinkedUp Data Catalog in a nutshell
Explore & query for datasets/types & topics
http://data.linkededucation.org/linkedup/categories-explorer
http://data.linkededucation.org/linkedup/catalog/
Federated queries using type mappings
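One way to picture federated querying with type mappings: a query phrased against a shared type is rewritten into one query per dataset, each using that dataset's own type. The sketch below assumes that po:Programme and yov:Video are both mapped to bibo:Film, which is an interpretation of the schema-mapping slide; the `TYPE_MAPPINGS` table and `rewrite_for_datasets` are hypothetical names, and sending the queries to endpoints and merging results is left out.

```python
# Assumed type mappings: shared type -> {dataset name: dataset-specific type}.
TYPE_MAPPINGS = {
    "bibo:Film": {
        "BBC Programme": "po:Programme",
        "Yovisto Video": "yov:Video",
    },
}


def rewrite_for_datasets(common_type, mappings):
    """Build one SPARQL query string per dataset using its local type."""
    queries = {}
    for dataset, local_type in mappings.get(common_type, {}).items():
        # PREFIX declarations are omitted here; a real query would need them.
        queries[dataset] = (
            f"SELECT ?resource WHERE {{ ?resource a {local_type} }}"
        )
    return queries


for dataset, query in rewrite_for_datasets("bibo:Film", TYPE_MAPPINGS).items():
    print(dataset, "->", query)
```

Keeping the mappings in the catalog rather than in each client means a user only needs to know the shared type.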
11. LinkedUp Challenge: using open data for learning
http://linkedup-challenge.org
Open Data Competition to promote tools and applications that analyse / integrate (Linked) Web data
Organised by LinkedUp project over 2 years (“Veni”, “Vidi”, “Vici”) with 40.000 EUR awards
Veni Competition - 22 submissions, 8 shortlisted for presentation at Open Knowledge Conference (17 September, Geneva, Switzerland)
12. 1st Place: PoliMedia
Exploring political debates & events
http://www.polimedia.nl/
Cross-media exploration & analysis of political events (parliament debates and media coverage)
Automatically generated links between debate transcripts, newspaper articles, and radio bulletins.
(Linked) Data available at http://data.polimedia.nl
Data sources: 1) newspapers of the historical newspaper archive, 2) radio bulletins of the Dutch National Press Agency (ANP)
9000+ debates (1945 – 1995)
Over 3000 media links
Martijn Kleppe, Max Kemman, Henri Beunders (Erasmus Universiteit Rotterdam), Laura Hollink, Damir Juric (Vrije Universiteit Amsterdam), Johan Oomen, Jaap Blom (Nederlands Instituut voor Beeld en Geluid)
13. Outlook: more “focused” data reuse challenges
http://linkedup-challenge.org/
Open Track
Focused Track
Scalable tools and applications using (Linked) open data for educational purposes
LinkedUp data catalog
Promotion of selected Veni submissions
Simplifying complex information to make it accessible (example: publications from Elsevier)
Recommender system for educational resources (courses, MOOCs) relevant to user interests
Approx. 20.000 EUR awards budget
Final events at 11th Extended Semantic Web Conference (ESWC2014)
Submission: 14 February 2014
14. Thank you!
REFERENCES
Assessing the Educational Linked Data Landscape, D’Aquin, M., Adamou, A., Dietze, S., ACM Web Science 2013 (WebSci2013), Paris, France, May 2013.
Generating structured Profiles of Linked Data Graphs, Fetahu, B., Dietze, S., d’Aquin, M., Nunes, B.P., ISWC2013 – 12th International Semantic Web Conference.
Combining a co-occurrence-based and a semantic measure for entity linking, Nunes, B.P., Dietze, S., Casanova, M.A., Kawase, R., Fetahu, B., Nejdl, W., ESWC 2013 – 10th Extended Semantic Web Conference, May 2013.
See also (general)
Type Inference on Noisy RDF Data, Paulheim, H., Bizer, C., Semantic Web – ISWC 2013, Lecture Notes in Computer Science Volume 8218, 2013, pp. 510-525.
An empirical survey of Linked Data conformance, Hogan, A., Umbrich, J., Harth, A., Cyganiak, R., Polleres, A., Decker, S., Journal of Web Semantics 14: pp. 14–44, 2012.
SPARQL Web-Querying Infrastructure: Ready for Action?, Carlos Buil-Aranda, Aidan Hogan, Jürgen Umbrich, Pierre-Yves Vandenbussche, International Semantic Web Conference 2013 (ISWC2013).
WWW
http://linkedup-project.eu
http://linkedup-challenge.org
http://linkededucation.org
http://linkeduniversities.org
See also (data)
http://datahub.io/group/linked-education
http://data.linkededucation.org
http://data.linkededucation.org/linkedup/catalog/
http://lak.linkededucation.org