Presentation from the mentoring event of the Open Education Europa Challenge (http://www.openeducationchallenge.eu/) on using Linked Data in educational applications.
1. Exploiting (Linked) Web Data in Educational Applications
Stefan Dietze, L3S Research Center, http://purl.org/dietze, @stefandietze. Open Education Challenge, Berlin, 2014
28/10/14
2. Linked Data for education
Data sharing: TED, Open Courseware, mEducator, LinkedUp, LAK, …
Tutorials & workshops (e.g. “Linked Learning” series)
LinkedUniversities.org and LinkedEducation.org
W3C Linked Open Education community group
Research areas
Web & data science, information retrieval, semantic web & Linked Data, data & knowledge integration
Application domains: education/TEL, Web archiving, …
Some projects
Introduction
http://www.l3s.de/
See also: http://purl.org/dietze
3. Exploiting Open Data for Education? In a nutshell
Social Media
(Open) Educational Resources
World Wide Web
Distance Universities
MOOCs
Linked Open Data
4. How Open is Open Data?
Open Data (as in “open licensing”)
Open licensing (ODL, CC etc)
Yet: variety of approaches
APIs/feeds: SOAP, REST, etc
Diverse schemas & vocabularies
(lack of) controlled vocabularies
Reuse & interoperability?
Linked Data (technology) (as in “interoperability”)
De facto standard for Open Data on the Web
W3C standards:
Common HTTP interface: SPARQL
Common representation: RDF
Dereferenceable URIs
Shared/linked vocabularies
Linked Open Data
5-star scheme by Sir Tim Berners-Lee
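The "common HTTP interface" point can be illustrated with a minimal sketch of how a SPARQL query travels over HTTP as a URL parameter. The endpoint URL and the query are assumptions chosen for illustration, not taken from the slides, and no request is actually sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: any SPARQL endpoint URL can be used here; DBpedia's is only an example.
ENDPOINT = "https://dbpedia.org/sparql"

# Illustrative query: fetch a handful of arbitrary triples.
QUERY = """
SELECT ?s ?p ?o WHERE {
  ?s ?p ?o .
} LIMIT 5
"""

def sparql_get_url(endpoint: str, query: str) -> str:
    """Build a GET URL that carries the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query.strip()})

url = sparql_get_url(ENDPOINT, QUERY)

# Round trip: decoding the URL gives back the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(url)
print(decoded == QUERY.strip())
```

The same URL pattern works against any endpoint that follows the SPARQL protocol, which is what makes the interface "common".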
5. Semantic Web
Example: Google Knowledge Graph (DBpedia, Freebase, Yago etc)
W3C standards (RDF & SPARQL) for knowledge representation and querying
URIs to identify/link data
“A little semantics goes a long way” (J. Hendler [1])
[Figure: example RDF graph. Nodes: dbp:United_States, http://dbpedia.org/resource/Cambridge_MA, dbp:W3C, dbp:MIT, schema:City, ru.dbp:Кембридж_(Массачусетс). Edge labels: country, cityOf, typeOf, sameAs, headquarterOf.]
[1] Hendler, J., The Dark Side of the Semantic Web, IEEE Intelligent Systems, Jan/Feb 2007
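The graph on this slide can be sketched as a list of subject/predicate/object triples. The transcript preserves only the node names and edge labels, so the pairing of nodes to edges below is an assumption, and the `http://example.org/vocab/` namespace for the edge labels is a placeholder.

```python
DBP = "http://dbpedia.org/resource/"
EX = "http://example.org/vocab/"  # placeholder namespace for edge labels (assumption)

# Assumed wiring of the slide's nodes and edge labels.
TRIPLES = [
    (DBP + "Cambridge_MA", EX + "country", DBP + "United_States"),
    (DBP + "Cambridge_MA", EX + "typeOf", "http://schema.org/City"),
    (DBP + "Cambridge_MA", EX + "sameAs",
     "http://ru.dbpedia.org/resource/Кембридж_(Массачусетс)"),
    (DBP + "Cambridge_MA", EX + "headquarterOf", DBP + "W3C"),
]

def to_ntriples(triples):
    """Serialize IRI-only triples in N-Triples style: one '<s> <p> <o> .' per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)

def objects_of(triples, subject, predicate):
    """Follow one edge label from a node and return the nodes it points to."""
    return [o for s, p, o in triples if s == subject and p == predicate]

print(to_ntriples(TRIPLES))
print(objects_of(TRIPLES, DBP + "Cambridge_MA", EX + "country"))
```

Because every node is a URI, a `sameAs` edge can point into a different dataset (here the Russian DBpedia) without any shared database.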
6. HTTP accessibility: persistent URIs, SPARQL
FOAF
Gene Ontology
BIBO
Geo Ontology
DBpedia Ontology
Dublin Core
BBC Programmes
Connected graph of open Web data (500+ datasets and 100 billion triples)
Persistent, dereferenceable URIs & content negotiation, shared/linked vocabularies
SPARQL to query via HTTP
Other “incarnations”:
Google Knowledge Graph
Facebook Open Graph
http://schema.org
http://dbpedia.org/resource/Cambridge_MA
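Dereferenceable URIs with content negotiation can be sketched as follows: the same URI is requested, but the `Accept` header asks for RDF rather than an HTML page. This sketch only prepares the request and does not send it. The media type `text/turtle` is one common RDF serialization, and whether a given server honours it is an assumption.

```python
from urllib.request import Request

# URI as shown on the slide.
URI = "http://dbpedia.org/resource/Cambridge_MA"

def rdf_request(uri: str, media_type: str = "text/turtle") -> Request:
    """Prepare (but do not send) a GET request that asks for an RDF representation."""
    return Request(uri, headers={"Accept": media_type})

req = rdf_request(URI)
print(req.get_method(), req.full_url, req.get_header("Accept"))

# Sending it would look like this (requires network access):
#   from urllib.request import urlopen
#   body = urlopen(req).read()
```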
8. Other learning-relevant data & resources
Publications & literature
(Social) media resource metadata
Domain-specific knowledge: Bioportal, Europeana, Geonames, …
Cross-domain factual knowledge: DBpedia, Freebase, …
LD as body of knowledge for education
http://linkededucation.org
http://linkeduniversities.org
Educational datasets and vocabularies
University Linked Data: The Open University UK, http://data.open.ac.uk, Southampton University, http://education.data.gov.uk, …
Open Educational Resources metadata: mEducator, Open Learn, Open Courseware, …
Schemas: Learning Resource Metadata Initiative (LRMI), mEducator Educational Resources schema, BIBO, AIISO, …
9. LD as background knowledge for educational apps?
http://metamorphosis.med.duth.gr/
Title: ECG Patient case 1001 chest and limb leads
10. Title: ECG Patient case 1001 chest and limb leads
“ECG” disambiguation on Wikipedia: 9 meanings
LD as background knowledge for educational apps?
11. dbpedia.org/resource/Electrocardiography
1. Understanding data: contextual disambiguation through NLP tools
2. Enrichment with factual knowledge
dbpedia:Электрокардиография
category:Cardiac_procedures
dbpedia:Willem_Einthoven
3. Interlinking with related resources
bbc:ProgrammeXY
slideshare:SlidesetXY
yovisto:VideolectureXY
Title: ECG Patient case 1001 chest and limb leads
Understanding, enriching, linking data
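The three steps on this slide (understand, enrich, interlink) can be sketched as a toy pipeline. The keyword sets and the second candidate sense are hypothetical stand-ins for what an NLP or entity-linking tool would provide. The facts and related resources are the identifiers shown on the slide, used here as plain strings.

```python
# Hypothetical candidate senses for "ECG" with context keywords (assumption).
CANDIDATES = {
    "http://dbpedia.org/resource/Electrocardiography":
        {"heart", "cardiac", "leads", "chest", "limb", "patient"},
    "http://example.org/sense/ECG_other":
        {"company", "group", "business"},
}

# Identifiers taken from the slide, keyed by the disambiguated entity.
FACTS = {
    "http://dbpedia.org/resource/Electrocardiography":
        ["category:Cardiac_procedures", "dbpedia:Willem_Einthoven"],
}
RELATED = {
    "http://dbpedia.org/resource/Electrocardiography":
        ["bbc:ProgrammeXY", "slideshare:SlidesetXY", "yovisto:VideolectureXY"],
}

def disambiguate(context: str, candidates) -> str:
    """Step 1: pick the sense whose keywords overlap most with the context words."""
    words = set(context.lower().split())
    scores = {uri: len(words & keywords) for uri, keywords in candidates.items()}
    return max(scores, key=scores.get)

def annotate(title: str) -> dict:
    """Steps 2 and 3: attach factual knowledge and links to related resources."""
    entity = disambiguate(title, CANDIDATES)
    return {
        "title": title,
        "entity": entity,
        "facts": FACTS.get(entity, []),
        "related": RELATED.get(entity, []),
    }

record = annotate("ECG Patient case 1001 chest and limb leads")
print(record["entity"])
print(record["facts"], record["related"])
```

Real systems replace the keyword overlap with a proper entity-linking service. The point of the sketch is only that the surrounding words ("chest", "limb", "leads") resolve the ambiguity of "ECG".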
12. “Success models”: data & applications
Supporting innovative tools & applications
Evaluation methods
LinkedUp – Linking Web Data for Education
Technology transfer & community-building
Involving educators, developers, computer scientists, data engineers…
http://www.linkedup-challenge.org/
Data curation & profiling
Collecting & exposing open data for education
Profiling of Web Data
http://data.linkededucation.org
EC-funded project aimed at advancing take-up of open data and related technologies
http://www.linkedup-project.eu/events
http://www.linkedup-project.eu/
13. Community-building and collaboration: joint work on tangible outcomes (datasets, applications, …)
Associated Partners
Initiatives
EC Projects
14. Collected & curated datasets of educational relevance
Beyond collecting: published over 50 datasets as LD together with the most important content providers, e.g. TED, OCW, SoLAR
LinkedUp catalog: most comprehensive collection of LD/Open Data for education
RDF dataset metadata
Federated queries across datasets using type mappings
Publishing & curating educational data
http://data.linkededucation.org/linkedup/catalog/
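One way to read "federated queries across datasets using type mappings" is query expansion: a shared type is rewritten into the dataset-specific classes that were mapped to it, and the resulting query is sent to each dataset. The sketch below only builds such a query. All class URIs and the mapping table are hypothetical, and this is not a description of how the LinkedUp catalog implements it.

```python
# Hypothetical mapping: shared type -> dataset-specific classes treated as equivalent.
TYPE_MAPPINGS = {
    "http://example.org/types/Course": [
        "http://example.org/datasetA/Course",
        "http://example.org/datasetB/Module",
    ],
}

def expand_type_query(shared_type: str, mappings: dict, limit: int = 20) -> str:
    """Build a SPARQL query matching resources of the shared type or any mapped class."""
    classes = [shared_type] + mappings.get(shared_type, [])
    values = " ".join(f"<{c}>" for c in classes)
    return (
        "SELECT ?resource ?type WHERE {\n"
        f"  VALUES ?type {{ {values} }}\n"
        "  ?resource a ?type .\n"
        f"}} LIMIT {limit}"
    )

query = expand_type_query("http://example.org/types/Course", TYPE_MAPPINGS)
print(query)
```

A client could send the same expanded query to every endpoint listed in the catalog and merge the results, so that consumers only need to know the shared type.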
15. http://data-observatory.org/lod-explorer
Supporting developers and data consumers
Devtalk blog: developer resource & community to aid developers
Webinars and tutorials
http://data.linkededucation.org/linkedup/devtalk/
Topic-based annotation and discovery of data
Data exploration & visualisation features
16. LinkedUp events, training & technology transfer: bringing stakeholders together
Data Providers & Data Scientists
Developers
Community-building through events & communication channels/social media (cross-disciplinary, industry & academia)
Exploitation of project outcomes across communities: technology transfer
(Co-)organised approx. 20 events (tutorials, workshops, booths etc)
More than 30 invited talks/lectures
….
Users (Learners, Tutors, Teachers)
17. May – September 2013
October 2013 – May 2014
May 2014 – October 2014
Series of Open Data Competitions to promote applications which exploit Linked Open Data
http://www.linkedup-challenge.org/
LinkedUp Challenge
18. LinkedUp Challenge results
50 submissions, of which 27 were shortlisted and supported (through travel grants, participation in events and rewards)
13 Veni, Vidi, Vici winners (grants: 1000 – 3000 €)
Authors from 23 distinct, mostly European countries
LinkedUp submissions & shortlist
Competition; submissions; shortlist
Veni; 23; 8
Vidi; 14; 9
Vici; 13; 10
Top-10 authors' origins
Country; authors
Croatia; 4
Greece; 4
Belgium; 5
Italy; 7
Germany; 11
Spain; 13
France; 14
Netherlands; 15
United States; 15
United Kingdom; 21
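The chart values on this slide can be cross-checked against the stated totals. The sketch assumes that the chart's first three values (23, 14, 13) are submissions and the next three (8, 9, 10) are shortlisted entries for Veni, Vidi and Vici respectively. That assignment is inferred from the chart legend, not stated explicitly.

```python
# Assumed reading of the bar chart: per-competition submissions and shortlisted entries.
submissions = {"Veni": 23, "Vidi": 14, "Vici": 13}
shortlist = {"Veni": 8, "Vidi": 9, "Vici": 10}

total_submissions = sum(submissions.values())
total_shortlisted = sum(shortlist.values())

# Share of submissions that reached the shortlist, per competition.
shortlist_rate = {name: shortlist[name] / submissions[name] for name in submissions}

print(total_submissions, total_shortlisted)
for name, rate in shortlist_rate.items():
    print(f"{name}: {rate:.0%} shortlisted")
```

Under this reading the sums match the "50 submissions of which 27 were shortlisted" stated on the slide.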
19. Issues (1/3) - open data is messier than we think
SPARQL endpoint availability over time [Buil-Aranda et al 2013]
Accessibility of datasets?
Fewer than 50% of all SPARQL endpoints are actually responsive at any given point in time [Buil-Aranda2013]
“THE” SPARQL protocol? No, but many variants & subsets
Data “quality”?
…data accuracy (eg DBpedia)? [Paulheim2013]
…vocabulary reuse/links? [D’AquinWebSci13]
…schema compliance (RDFS, schemas) [HoganJWS2012]
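The accessibility issue can be made concrete with a small availability calculation over repeated probes. The endpoint URLs and probe results below are made-up illustration data, not figures from [Buil-Aranda2013]. A real monitor would fill the lists by periodically sending a trivial query to each endpoint.

```python
from typing import Dict, List

# Hypothetical probe log: endpoint -> one boolean per probing round (True = answered).
PROBES: Dict[str, List[bool]] = {
    "http://example.org/a/sparql": [True, True, False, True],
    "http://example.org/b/sparql": [False, False, False, True],
    "http://example.org/c/sparql": [True, False, False, False],
}

def availability(probes: Dict[str, List[bool]]) -> Dict[str, float]:
    """Fraction of probing rounds in which each endpoint answered."""
    return {ep: sum(results) / len(results) for ep, results in probes.items() if results}

def responsive_share(probes: Dict[str, List[bool]], round_index: int) -> float:
    """Share of endpoints that answered in one particular probing round."""
    answered = [r[round_index] for r in probes.values() if round_index < len(r)]
    return sum(answered) / len(answered) if answered else 0.0

print(availability(PROBES))
print(responsive_share(PROBES, 0), responsive_share(PROBES, 2))
```

The two measures answer different questions: the first is about how reliable one endpoint is over time, the second about how much of the data web is reachable at a single moment.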
[Buil-Aranda2013] Buil-Aranda, C., Hogan, A., Umbrich, J., Vandenbussche, P.-Y., SPARQL Web-Querying Infrastructure: Ready for Action?, International Semantic Web Conference 2013 (ISWC2013).
[D’AquinWebSci13] D’Aquin, M., Adamou, A., Dietze, S., Assessing the Educational Linked Data Landscape, ACM Web Science 2013 (WebSci2013), Paris, France, May 2013.
[Paulheim2013] Paulheim, H., Bizer, C., Type Inference on Noisy RDF Data, Semantic Web – ISWC 2013, Lecture Notes in Computer Science Volume 8218, 2013, pp. 510-525.
[HoganJWS2012] Hogan, A., Umbrich, J., Harth, A., Cyganiak, R., Polleres, A., Decker, S., An empirical survey of Linked Data conformance, Journal of Web Semantics 14, 2012.
20. Issues (2/3) – accepting inconsistency
Analyzing Relative Incompleteness of Movie Descriptions in the Web of Data: A Case Study, Yuan, W., Demidova, E., Dietze, S., Zhu, X., International Semantic Web Conference 2014 (ISWC2014)
21. Issues (3/3) – licensing/legal aspects
T&C of established datasets
Dataset; Words; Pages
DBpedia; 7163; 16
Flickr; 10367; 23
ConceptNet; 7163; 16
World Bank; 7056; 16
Nature; 7024; 16
LinkedIn; 6104; 14
Google+; 5740; 13
Tumblr; 5362; 12
Twitter; 4247; 9
Facebook; 4179; 9
Mashing up data: legal and licensing-related issues are underestimated
What license do you get when mashing up: Nature (CC0) + DBpedia (CC-ShareAlike) + FAO (Proprietary non-commercial) => ?
Attribution: copyright violations from missing (86%) or incorrect (14%) attribution information
Terms & conditions: complexity and conflicts when merging data from different sources
Potential non-compliance from evolution of (a) LOD applications and (b) underlying datasets (and their licenses)
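The mash-up question on this slide can be explored with a deliberately simplified model in which each source's terms are reduced to three flags. The flag values are assumptions (for example, that the share-alike license also requires attribution), and real terms and conditions are far richer than this sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Terms:
    label: str
    attribution: bool = False
    share_alike: bool = False
    non_commercial: bool = False

# Sources from the slide; the flags are a simplified, assumed reading of their terms.
SOURCES = {
    "Nature": Terms("CC0"),
    "DBpedia": Terms("CC-ShareAlike", attribution=True, share_alike=True),
    "FAO": Terms("Proprietary non-commercial", non_commercial=True),
}

def mashup_report(sources):
    """Collect the combined obligations and flag share-alike vs. non-commercial clashes."""
    obligations = []
    if any(t.attribution for t in sources.values()):
        obligations.append("attribute sources")
    if any(t.share_alike for t in sources.values()):
        obligations.append("license the result under the share-alike terms")
    if any(t.non_commercial for t in sources.values()):
        obligations.append("no commercial use")
    # Toy rule: share-alike output must stay as open as its source,
    # which a non-commercial restriction from another source prevents.
    conflicts = [
        (a, b)
        for a, ta in sources.items() if ta.share_alike
        for b, tb in sources.items() if tb.non_commercial and a != b
    ]
    return obligations, conflicts

obligations, conflicts = mashup_report(SOURCES)
print(obligations)
print(conflicts)
```

In this toy model the combination has no consistent license: the obligations accumulate, and the DBpedia and FAO terms pull in opposite directions, which is why the slide ends in a question mark.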