1. Motivation
Data on the Web
Some eye-catching opener illustrating growth and/or diversity of web data
Linked Data and Education – Opportunities, Challenges & the case of LinkedUp
Stefan Dietze (L3S Research Center, DE, @stefandietze, http://purl.org/dietze)
Stefan Dietze
18/11/13
2. Once upon a time (just a short while ago in fact)
[Figure: the same place mentioned in different, unconnected ways across sources]
HTML pages: “blurb… Berlin ...main station…”, “blurb… Tiergarten … Bahnhof…”, “blurb… Berlin central…”
Social Data: “…waiting @ #berlinhbf”
PDFs: “…Lehrter Bahnhof…”
3. “A little semantics goes a long way” (J. Hendler [1])
Semantic Web
Adding meaning through shared vocabularies and schemas (e.g. DBpedia)
W3C standards RDF & SPARQL for data & knowledge representation and querying
Persistent URIs to reference & interlink data on the Web
[Figure: the HTML pages, social data and PDF mentions from slide 2, linked to an RDF graph. Nodes: dbp:Berlin, dbp:populatedPlace, city, dbp:Tiergarten, dbp:Berlin_Hauptbahnhof, dbp:Berlin_Central_Station, dbp:Lehrter_Bahnhof. Edge labels: typeOf, location, redirectOf.]
[1] Hendler, J., The Dark Side of the Semantic Web, IEEE Intelligent Systems, Jan/Feb 2007
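The idea of the graph on this slide can be illustrated with a minimal sketch in plain Python. The triples reuse the labels from the slide, but the edge directions, the mention-to-URI mapping and the helper names are assumptions made for illustration only; a real setup would query DBpedia rather than a hard-coded set.

```python
# Toy triple set reusing the slide's labels. Edge directions are ASSUMED,
# the slide text does not state them unambiguously.
TRIPLES = {
    ("dbp:Berlin", "typeOf", "dbp:populatedPlace"),
    ("dbp:Berlin_Hauptbahnhof", "location", "dbp:Tiergarten"),
    ("dbp:Berlin_Central_Station", "redirectOf", "dbp:Berlin_Hauptbahnhof"),
    ("dbp:Lehrter_Bahnhof", "redirectOf", "dbp:Berlin_Hauptbahnhof"),
}

# Hypothetical mapping from surface mentions to URIs (not given in the slides).
MENTIONS = {
    "Berlin main station": "dbp:Berlin_Hauptbahnhof",
    "Berlin central": "dbp:Berlin_Central_Station",
    "Lehrter Bahnhof": "dbp:Lehrter_Bahnhof",
}


def canonical(uri, triples=TRIPLES):
    """Follow redirectOf edges until a URI without a redirect is reached."""
    seen = set()
    while uri not in seen:
        seen.add(uri)
        targets = [o for s, p, o in triples if s == uri and p == "redirectOf"]
        if not targets:
            return uri
        uri = targets[0]
    return uri  # a redirect cycle was hit; stop instead of looping forever


def same_entity(mention_a, mention_b):
    """True if both mentions resolve to the same canonical URI."""
    return canonical(MENTIONS[mention_a]) == canonical(MENTIONS[mention_b])


if __name__ == "__main__":
    print(same_entity("Berlin central", "Lehrter Bahnhof"))
```

Once every mention is mapped to a persistent URI, deciding whether two documents talk about the same station reduces to comparing canonical identifiers.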
4. Semantic Web / Linked Data
Use of URIs, RDF and SPARQL for exposing data
De-facto standard for sharing data on the Web
Vision: well-connected graph of open Web data
350+ datasets and 32 billion triples in LOD Cloud alone
Other “incarnations”: Google Knowledge Graph, Facebook Open Graph
[Figure: vocabularies and ontologies in use: rNews, Media Ontology, Geo Ontology, Dublin Core, DBpedia Ontology, http://schema.org, FOAF, FMA Ontology, BIBO, Gene Ontology]
Source: http://lod-cloud.net/state, September 2011
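As a rough illustration of "URIs, RDF and SPARQL for exposing data", the sketch below builds a SPARQL protocol GET URL asking for all properties of one resource. The endpoint http://dbpedia.org/sparql is DBpedia's public endpoint; the helper names and the example query are assumptions, and the URL is only constructed here, not sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

DBPEDIA_ENDPOINT = "http://dbpedia.org/sparql"  # public DBpedia SPARQL endpoint


def build_query_url(resource_uri, endpoint=DBPEDIA_ENDPOINT, limit=10):
    """Build a GET URL carrying a SPARQL SELECT in the 'query' parameter."""
    query = f"SELECT ?p ?o WHERE {{ <{resource_uri}> ?p ?o }} LIMIT {int(limit)}"
    return endpoint + "?" + urlencode({"query": query})


def decode_query(url):
    """Recover the SPARQL text from a URL built by build_query_url."""
    return parse_qs(urlsplit(url).query)["query"][0]


if __name__ == "__main__":
    url = build_query_url("http://dbpedia.org/resource/Berlin", limit=5)
    print(url)
    print(decode_query(url))
```

Passing the resulting URL to an HTTP client would execute the query; the result format depends on the endpoint and on the Accept header sent.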
5. Linked Data for Education – How is it useful?
1. Linked Data as body of knowledge for education
Vast amount of publicly available resources and data (300+ datasets, 32 billion statements in the LOD Cloud alone)
Dedicated OER and university data + “knowledge resources” (from DBpedia to Slideshare)
2. Linked Data as set of principles and W3C standards for data sharing
RDF, SPARQL & shared vocabularies to improve interoperability of educational data
Supports Open Education Resources (OER) vision: reuse across isolated platforms
“HTTP-accessibility” (SPARQL, URI-dereferencing)
“Structure” & “Semantics” (=> shared/linked vocabularies)
“Interlinked”
“Persistent”
http://linkeduniversities.org
http://linkededucation.org
Interlinking educational Resources and the Web of Data – a Survey of Challenges and Approaches, Stefan Dietze, Salvador Sanchez-Alonso, Hannes Ebner, Hong Qing Yu, Daniela Giordano, Ivana Marenzi, Bernardo Pereira Nunes, Emerald Program: electronic Library and Information Systems, Volume 47, Issue 1 (2013).
Linked Data for Open and Distance Learning, Mathieu d’Aquin, report for the Commonwealth of Learning.
6. How LD principles can be useful for data sharing
LD as background knowledge
http://dbpedia.org/resource/Berlin
“HTTP-accessibility” (SPARQL, URI-dereferencing)
“Structure” & “Semantics” (=> shared/linked vocabularies)
“Interlinked”
“Persistent”
Trusted knowledge, exposed via established standards
Shared semantics (enrichment, disambiguation)
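URI dereferencing can be sketched with the standard library alone: the client asks for the resource URI and states via the Accept header which RDF serialisation it prefers. DBpedia resource URIs have the form http://dbpedia.org/resource/<Name>; the helper name and the choice of Turtle are assumptions, and the request is only prepared here, not sent.

```python
from urllib.request import Request


def dereference_request(uri, mime="text/turtle"):
    """Prepare an HTTP request that asks for an RDF representation of `uri`."""
    return Request(uri, headers={"Accept": mime})


if __name__ == "__main__":
    req = dereference_request("http://dbpedia.org/resource/Berlin")
    print(req.full_url, req.get_header("Accept"))
    # urllib.request.urlopen(req) would perform the actual lookup.
```

Whether a server honours the requested format is up to the publisher, which is one source of the heterogeneity discussed later in the deck.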
7. How LD principles can be useful for data sharing
LD as background knowledge
Semantics of terms?
Topics/categories addressed?
Relatedness of resources/entities?
(types, semantics)
Slideset
<sioc:Item 2139393292>
<title>Planetary motion & gravity</title>
…
</sioc:Item 2139393292>
Programme
<po:Programme519215>
<po:Series>Wonders of the Solar System</po:Series>
<po:Episode>Emp. of the Sun</po:Episode>
<po:Actor>Brian Cox</po:Actor>
</po:Programme519215>
Video
<yo:Video 8748720>
<dc:title>Pluto & the Dwarf Planets</dc:title>
…
</yo:Video 8748720>
Combining a co-occurrence-based and a semantic measure for entity linking, B. P. Nunes, S. Dietze, M.A. Casanova, R. Kawase, B. Fetahu, and W. Nejdl, ESWC 2013 - 10th Extended Semantic Web Conference (May 2013).
8. How LD principles can be useful for data sharing
LD as background knowledge
Pluto? Brian Cox? Sun?
Programme
<po:Programme519215>
<po:Series>Wonders of the Solar System</po:Series>
<po:Episode>Emp. of the Sun</po:Episode>
<po:Actor>Brian Cox</po:Actor>
</po:Programme519215>
Video
<yo:Video 8748720>
<dc:title>Pluto & the Dwarf Planets</dc:title>
…
</yo:Video 8748720>
9. How LD principles can be useful for data sharing
LD as background knowledge
Linked DBpedia entities: db:Astronomy, db:Astronomical Objects, db:Pluto (Dwarf Planet), db:Sun
Slideset
<sioc:Item 2139393292>
<title>Planetary motion & gravity</title>
…
</sioc:Item 2139393292>
Programme
<po:Programme519215>
<po:Series>Wonders of the Solar System</po:Series>
<po:Episode>Emp. of the Sun</po:Episode>
<po:Actor>Brian Cox</po:Actor>
</po:Programme519215>
Video
<yo:Video 8748720>
<dc:title>Pluto & the Dwarf Planets</dc:title>
…
</yo:Video 8748720>
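Once resources from different platforms are annotated with shared DBpedia entities, their relatedness becomes computable. The sketch below uses a plain Jaccard overlap of entity sets; this is a deliberately simple stand-in, not the combined co-occurrence and semantic measure of Nunes et al. cited on slide 7. The resource keys and the annotation sets are assumptions loosely based on the slide's example.

```python
# ASSUMED annotations: which DBpedia entities each resource was linked to.
ANNOTATIONS = {
    "slideset:2139393292": {"db:Astronomy", "db:Astronomical_Objects"},
    "programme:519215": {"db:Astronomy", "db:Astronomical_Objects", "db:Sun"},
    "video:8748720": {"db:Astronomical_Objects", "db:Pluto"},
}


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def relatedness(resource_a, resource_b, annotations=ANNOTATIONS):
    """Entity-overlap relatedness of two annotated resources."""
    return jaccard(annotations[resource_a], annotations[resource_b])


if __name__ == "__main__":
    for a, b in [("slideset:2139393292", "programme:519215"),
                 ("slideset:2139393292", "video:8748720")]:
        print(a, b, round(relatedness(a, b), 3))
```

The point is only that a slideset, a TV programme and a video with incompatible native schemas become comparable through the shared reference graph.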
10. That’s awesome, but...
…why are there so few datasets actually used?
Hm, really?
LD reuse and links very much focused on trusted “reference graphs” such as DBpedia
Long tail of LD datasets which are neither reused nor linked to (LOD Cloud alone consists of 300+ datasets)
Explanations?
“HTTP-accessibility” (SPARQL, URI-dereferencing)
“Structure” & “Semantics” (=> shared/linked vocabularies)
“Interlinked”
“Persistent”
11. LD is more heterogeneous than we think
“Availability” & “Standards”?
Less than 50% of all SPARQL endpoints actually responsive at a given point of time (“high reliability”)
“THE” SPARQL protocol? No, but many subsets/variants
Huge differences in response times
[Figure: SPARQL endpoint availability over time [Buil-Aranda et al 2013]]
Shared vocabularies & schemas, but:
…still very heterogeneous [d’Aquin, WebSci13]
…data partially messy and not conformant (RDFS, schemas) [HoganJWS2012]
…even widely used reference datasets such as DBpedia are noisy [Fürber2010]
[Figure: co-occurrence graph of data types in 146 datasets: 144 vocabularies, 588 highly overlapping types, 719 properties]
SPARQL Web-Querying Infrastructure: Ready for Action?, Carlos Buil-Aranda, Aidan Hogan, Jürgen Umbrich, Pierre-Yves Vandenbussche, International Semantic Web Conference 2013 (ISWC2013).
Assessing the Educational Linked Data Landscape, d’Aquin, M., Adamou, A., Dietze, S., ACM Web Science 2013 (WebSci2013), Paris, France, May 2013.
Using semantic web resources for data quality management, Fürber, C., Hepp, M., 2010. In Proceedings of the 17th international conference on Knowledge engineering and management by the masses (EKAW'10), Springer-Verlag, Berlin, Heidelberg, 211-225.
An empirical survey of Linked Data conformance, Hogan, A., Umbrich, J., Harth, A., Cyganiak, R., Polleres, A., Decker, S., Journal of Web Semantics 14: pp. 14–44, 2012.
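A co-occurrence graph of data types like the one mentioned on this slide can be approximated by counting how often two types appear in the same dataset. The sketch below is a generic illustration under that assumption, not the procedure of d’Aquin et al.; the dataset names and type lists are hypothetical toy data.

```python
from collections import Counter
from itertools import combinations


def type_cooccurrence(datasets):
    """Count, for every pair of types, in how many datasets both occur."""
    counts = Counter()
    for types in datasets.values():
        # sorted() gives each unordered pair one canonical (a, b) key
        for a, b in combinations(sorted(set(types)), 2):
            counts[(a, b)] += 1
    return counts


# HYPOTHETICAL toy input: dataset name -> types used in that dataset.
DATASETS = {
    "dataset_a": ["foaf:Person", "bibo:Document"],
    "dataset_b": ["foaf:Person", "bibo:Document", "aiiso:Course"],
    "dataset_c": ["aiiso:Course", "foaf:Person"],
}

if __name__ == "__main__":
    for pair, n in type_cooccurrence(DATASETS).most_common():
        print(pair, n)
```

Each counted pair is an edge in the graph, weighted by the number of datasets in which both types occur; heavily weighted edges hint at types that are candidates for mapping.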
12. (Linked) Open Data for Education in a nutshell
Using/exploiting Linked Data in Education?
Lack of reliable dataset metadata about: resource types; topics & disciplines; quality, currentness & availability; provenance
Lack of links and cross-dataset references
Lack of federated query approaches
…
[Figure labels: World Wide Web, Linked Open Data, (Open) Educational Resources, Distance Universities, MOOCs]
http://linkededucation.org & http://linkeduniversities.org
13. “LinkedUp” – Linking Web Data for Education
European project aimed at advancing take-up of open data and related technologies
http://linkedup-project.eu
Success models: data & applications
http://data.linkededucation.org
Data curation
Collecting & exposing open data of educational relevance => LinkedUp Data Catalog
Profiling and linking of Web Data for education => educational data graph
LinkedUp Challenge to identify innovative tools & applications
Evaluation methods and approaches
http://www.linkedup-challenge.org/
Technology transfer & community-building
Disseminating knowledge & building communities (educators, computer scientists, data engineers)
Gathering stakeholder feedback: use cases and requirements
http://linkedup-project.eu/events
http://linkedup-challenge.org/#usecases
14. Who we are
LinkedUp Advisory Board
LinkedUp Network
LinkedUp Consortium
17/09/2013
16. Data curation and dataset profiling
LinkedUp approach
Goal: helping data consumers to discover and use suitable datasets
Dataset selection: “LinkedUp/Linked Education cloud” (http://datahub.io/group/linked-education)
RDF (VoID) catalog of datasets (LinkedUp Catalog): classification of datasets according to, e.g., represented types, disciplines/topics, data quality, accessibility
Links and coreferences => unified view on data => Linked Education Graph
Infrastructure, unified (SPARQL) endpoint & APIs for federated querying
Automated processing to generate:
Descriptive VoID/RDF Dataset Catalog
Data links
[Figure labels: Educational Datasets, LinkedUp Catalog]
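To give an idea of what a descriptive VoID entry might look like, the sketch below renders one dataset description as Turtle text. The VoID terms used (void:Dataset, void:sparqlEndpoint, void:triples, void:classPartition, void:class) and dcterms:title come from the published vocabularies; the function name, the example URIs and the figures are hypothetical, and the actual LinkedUp catalog may describe datasets differently.

```python
PREFIXES = (
    "@prefix void: <http://rdfs.org/ns/void#> .\n"
    "@prefix dcterms: <http://purl.org/dc/terms/> .\n"
)


def void_description(dataset_uri, title, endpoint, triples, classes):
    """Render a minimal VoID description of one dataset as Turtle text."""
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    props = [
        "a void:Dataset",
        f'dcterms:title "{safe_title}"',
        f"void:sparqlEndpoint <{endpoint}>",
        f"void:triples {int(triples)}",
    ]
    # one class partition per type represented in the dataset
    props += [f"void:classPartition [ void:class <{c}> ]" for c in classes]
    body = " ;\n    ".join(props)
    return f"{PREFIXES}\n<{dataset_uri}>\n    {body} .\n"


if __name__ == "__main__":
    # All URIs and numbers below are HYPOTHETICAL placeholders.
    print(void_description(
        "http://example.org/dataset/demo",
        "Demo course catalogue",
        "http://example.org/sparql",
        12345,
        ["http://example.org/vocab/Course", "http://example.org/vocab/Lecture"],
    ))
```

Listing the represented types per dataset is what later allows consumers to find datasets by type and to map equivalent types across datasets.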
17. LinkedUp Data Catalog in a nutshell
VoID dataset catalog: browse, explore and query for datasets/types
http://data.linkededucation.org/linkedup/catalog/
http://datahub.io/group/linked-education
Federated queries using type mappings
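"Federated queries using type mappings" can be pictured as follows: a consumer asks for one common type, and the query is rewritten per dataset using that dataset's own type before the results are merged. The sketch below only shows the rewriting and merging steps; the mapping table, endpoint URLs and type URIs are hypothetical, and no query is actually sent.

```python
# HYPOTHETICAL mapping: common type -> {endpoint URL: dataset-specific type}.
TYPE_MAPPINGS = {
    "http://example.org/common/Course": {
        "http://example.org/uni-a/sparql": "http://example.org/uni-a/vocab/Module",
        "http://example.org/uni-b/sparql": "http://example.org/uni-b/vocab/Course",
    },
}


def rewrite(common_type, mappings=TYPE_MAPPINGS, limit=100):
    """Return {endpoint: SPARQL query}, one query per dataset that maps the type."""
    queries = {}
    for endpoint, local_type in mappings.get(common_type, {}).items():
        queries[endpoint] = (
            f"SELECT ?s WHERE {{ ?s a <{local_type}> }} LIMIT {int(limit)}"
        )
    return queries


def merge(results_by_endpoint):
    """Flatten per-endpoint result lists, keeping track of where each row came from."""
    merged = []
    for endpoint, rows in results_by_endpoint.items():
        merged.extend((endpoint, row) for row in rows)
    return merged


if __name__ == "__main__":
    for endpoint, query in rewrite("http://example.org/common/Course").items():
        print(endpoint, "->", query)
```

Keeping the endpoint next to each merged row preserves provenance, which matters when datasets differ in quality and availability.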
18. What’s all the data about: dataset profiling
Issue:
Considering LOD as knowledge graph, most nodes are connected
Relevance of topics (DBpedia entities & categories) for particular resources and datasets?
“Topic profile” of a given dataset?
Linked DBpedia entities: db:Astronomy, db:Astronomical Objects, db:Sun, db:Pluto (Dwarf Planet)
Slideset
<sioc:Item 2139393292>
<title>Planetary motion & gravity</title>
…
</sioc:Item 2139393292>
Programme
<po:Programme519215>
<po:Series>Wonders of the Solar System</po:Series>
<po:Episode>Emp. of the Sun</po:Episode>
<po:Actor>Brian Cox</po:Actor>
</po:Programme519215>
Video
<yo:Video 8748720>
<dc:title>Pluto & the Dwarf Planets</dc:title>
…
</yo:Video 8748720>
19. What’s all the data about: dataset profiling
Goal: extracting representative “topic profile” for datasets
How: computing normalised (DBpedia) category relevance scores from sample resource sets (scalability vs representativeness)
Applied to entire LOD cloud
DBpedia category graph: db:Astronomy, db:Astronomical Objects, db:Sun
Programme
<po:Programme519215>
<po:Series>Wonders of the Solar System</po:Series>
<po:Episode>Emp. of the Sun</po:Episode>
<po:Actor>Brian Cox</po:Actor>
</po:Programme519215>
Generating structured Profiles of Linked Data Graphs, Fetahu, B., Adamou, A., Dietze, S., d’Aquin, M., Nunes, B.P., ISWC2013 – 12th International Semantic Web Conference.
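The "How" step above can be sketched as a simple frequency-based score: sample some resources, count the DBpedia categories their entities fall under, and normalise the counts so they sum to 1. This is an illustrative simplification and not the ranking method of Fetahu et al.; the function name, the sampling strategy and the toy data are assumptions.

```python
import random
from collections import Counter


def topic_profile(resource_categories, sample_size, seed=0):
    """Normalised category relevance scores computed from a resource sample.

    resource_categories: {resource id: iterable of category labels}
    sample_size: number of resources to sample (smaller = faster, less representative)
    """
    rng = random.Random(seed)
    resources = sorted(resource_categories)
    sample = rng.sample(resources, min(sample_size, len(resources)))
    counts = Counter()
    for resource in sample:
        counts.update(set(resource_categories[resource]))
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()} if total else {}


# HYPOTHETICAL toy data: categories already extracted for three resources.
TOY = {
    "r1": ["db:Astronomy", "db:Sun"],
    "r2": ["db:Astronomy"],
    "r3": ["db:Astronomy", "db:Pluto"],
}

if __name__ == "__main__":
    print(topic_profile(TOY, sample_size=3))
```

The sample_size parameter is where the slide's trade-off shows up: a smaller sample scales to large datasets but may miss less frequent topics.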
21. “LinkedUp” – Linking Web Data for Education
European project aimed at advancing take-up of open data and related technologies
http://linkedup-project.eu
Success models: data & applications
Data curation
Collecting & exposing open data of educational relevance => LinkedUp Data Catalog
Profiling and linking of Web Data for education => educational data graph
Technology transfer & community-building
Disseminating knowledge & building communities (educators, computer scientists, data engineers)
Gathering stakeholder feedback: use cases and requirements
LinkedUp Challenge to identify innovative tools & applications
Evaluation methods and approaches
Series of 3 competitions (“Veni”, “Vidi”, “Vici”) running until end of 2014
Open & focused tracks
Total prize budget of almost 40,000 EUR
LinkedUp support activities
http://www.linkedup-challenge.org/
22. Veni Competition
Tools and demos that analyse or integrate open web data (deadline: 27 June, 1 Open Track, 10,000 EUR awards)
22 submissions, shortlist of 8, from which:
3 winners
People's Choice Award
Final ceremony on 17 September 2013 at OKCon, Geneva
24. 1st Place: PoliMedia
Exploring political debates & events
Cross-media analysis of political events.
Browsing parliament debates & related media coverage
http://www.polimedia.nl/
Automatically generated links between transcripts of debates, newspaper articles (including their original layout on the page) and radio bulletins.
Generated data available as Linked Data (http://data.polimedia.nl)
Data sources: 1) newspapers in their original layout from the historical newspaper archive, and 2) radio bulletins of the Dutch National Press Agency (ANP)
9000+ debates (1945 – 1995)
Over 3000 media links
Martijn Kleppe, Max Kemman, Henri Beunders (Erasmus Universiteit Rotterdam), Laura Hollink, Damir Juric (Vrije Universiteit Amsterdam), Johan Oomen, Jaap Blom (Nederlands Instituut voor Beeld en Geluid)
25. Outlook
LinkedUp Vidi Competition
Wanted: tools and demos that analyse or integrate open web data (for education)
Anyone can participate: researchers, students, developers, industry
“Open track” & “focused tracks”
20,000+ EUR worth of awards
Final awards ceremony at 11th Extended Semantic Web Conference (ESWC2014)
http://linkedup-challenge.org/
Submission: 14 February 2014
Learning Analytics & Knowledge (LAK) Data Challenge
Analyse, apply, use, exploit the “LAK Dataset”
Finals at Learning Analytics & Knowledge Conference 2014, Indianapolis, US
http://lak.linkededucation.org/
Submission: 20 January
26. Thank you!
Contact
http://purl.org/dietze | @stefandietze
See also (data)
http://datahub.io/group/linked-education
http://data.linkededucation.org
http://data.linkededucation.org/linkedup/catalog/
http://lak.linkededucation.org
See also (general)
http://linkedup-project.eu
http://linkedup-challenge.org
http://linkededucation.org
http://linkeduniversities.org