Linked Open Data for Digital Humanities
1. Linked Open Data for Digital Humanities
What is Linked Open Data and why is it relevant for you?
Christophe Guéret (@cgueret)
2. Open Data
“A piece of data or content is open if anyone
is free to use, reuse, and redistribute it —
subject only, at most, to the requirement to
attribute and/or share-alike.”
http://opendefinition.org/
3. Linked Data
"a term used to describe a recommended
best practice for exposing, sharing, and
connecting pieces of data, information, and
knowledge on the Semantic Web using URIs
and RDF."
http://linkeddata.org/
4. Linked Open Data
● Linked Open Data = Open Data + Linked
Data
● Interconnected data sets that are on the
Web and free to use
● 5-star scheme http://5stardata.info/
5. Why does it matter for DH?
● Digital Humanities use a lot of data and
study relations between things
● Data acquisition & curation represent a
LOT of effort for data consumers
● Linked Open Data is a good way to
○ Facilitate your own work (as a data consumer)
○ Facilitate others' work (as a data publisher)
6. Data found on the Web
● You get the following table as a CSV file
Kennis      Stad
Christophe  Amsterdam
David       Parijs
● And that Excel table from somewhere else
Ville      Pays
Paris      France
Amsterdam  Pays-Bas
7. And you want to integrate it
Kennis      Stad          Ville      Pays
Christophe  Amsterdam  +  Paris      France    = ?
David       Parijs        Amsterdam  Pays-Bas
● Data integration issues
○ Kennis, Stad, Ville, Pays?
○ Parijs = Paris?
○ Amsterdam = Amsterdam?
● A lot of work for the (uninformed) consumer!
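The integration issues above can be reproduced in a few lines. This is only a sketch using the two tables from the slides; the `naive_join` helper is a hypothetical name, and it already assumes the consumer has worked out that "Stad" and "Ville" both mean "city".

```python
# Two tables as they might arrive from different sources (values from the slides).
people = [  # CSV file, Dutch headers: Kennis = acquaintance, Stad = city
    {"Kennis": "Christophe", "Stad": "Amsterdam"},
    {"Kennis": "David", "Stad": "Parijs"},
]
cities = [  # Excel table, French headers: Ville = city, Pays = country
    {"Ville": "Paris", "Pays": "France"},
    {"Ville": "Amsterdam", "Pays": "Pays-Bas"},
]


def naive_join(left, right, left_key, right_key):
    """Join two lists of rows on exact string equality of the key columns."""
    index = {row[right_key]: row for row in right}
    return [
        {**row, **index[row[left_key]]}
        for row in left
        if row[left_key] in index
    ]


joined = naive_join(people, cities, "Stad", "Ville")
for row in joined:
    print(row)
```

Only the Christophe row survives the join: "Parijs" and "Paris" are different strings, so David is silently dropped unless the consumer adds a translation step by hand.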
8. Linked Data approach
● Assign unique identifiers (URIs) to concepts
and things
● Create a "triple": connect the identifiers with
labelled, directed edges
dbpedia:Amsterdam --dbo:country--> dbpedia:Netherlands
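As a rough illustration, the triple above can be written as a plain Python tuple once the prefixes are expanded to full URIs. The `Triple` type and the `expand` helper are hypothetical names for this sketch; `dbpedia:` and `dbo:` are assumed to stand for the usual DBpedia resource and ontology namespaces.

```python
from typing import NamedTuple

# Namespace prefixes used on the slide, expanded to full URIs (assumed mapping).
PREFIXES = {
    "dbpedia": "http://dbpedia.org/resource/",
    "dbo": "http://dbpedia.org/ontology/",
}


class Triple(NamedTuple):
    """One labelled, directed edge: subject --predicate--> object."""
    subject: str
    predicate: str
    object: str


def expand(name: str) -> str:
    """Turn a prefixed name such as 'dbpedia:Amsterdam' into a full URI."""
    prefix, local = name.split(":", 1)
    return PREFIXES[prefix] + local


triple = Triple(
    expand("dbpedia:Amsterdam"),
    expand("dbo:country"),
    expand("dbpedia:Netherlands"),
)
print(triple.subject)
print(triple.predicate)
print(triple.object)
```

All three parts are identifiers rather than free-text names, which is what lets two independent data sets agree on what "Amsterdam" refers to.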
9. Why does it solve the issue?
● Shift some of the data integration load to the
provider side
○ Clarify the semantics of the data
○ Refer to identifiers rather than names
● There is only one "dbpedia:Amsterdam" at
http://dbpedia.org/resource/Amsterdam
● Labels used for the edges are published by
an external authority
12. From triples to the Web of Data
● Every triple is a bit of factual information
● Because nodes are re-used across triples,
the union of all the triples is a graph
● The "Web of Data" is a pre-integrated,
semantically clear, data set ready to be
used!
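A small sketch of why the union of triples forms a graph: two sets of triples from different publishers connect as soon as they reuse the same identifier. The `ex:` names below are hypothetical and only serve the illustration; the traversal helper `reachable` is likewise an assumption of this sketch.

```python
from collections import defaultdict

# Triples from two different (imaginary) publishers.
set_a = {("dbpedia:Amsterdam", "dbo:country", "dbpedia:Netherlands")}
set_b = {("ex:Christophe", "ex:livesIn", "dbpedia:Amsterdam")}  # hypothetical ex: names

# The union needs no alignment step: the shared node does the integration.
graph = set_a | set_b

# Index outgoing edges per subject node.
outgoing = defaultdict(list)
for s, p, o in sorted(graph):
    outgoing[s].append((p, o))


def reachable(start):
    """Return every node that can be reached by following edges from start."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for _, target in outgoing.get(node, []):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


print(sorted(reachable("ex:Christophe")))
```

Starting from `ex:Christophe`, the traversal reaches `dbpedia:Netherlands` even though neither publisher stated that link directly.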
14. Let's make a social network!
● The network
○ A node per European country
○ An edge means a shared official language
○ Label the edges with the languages
○ Label the nodes with the country names
● Data source
○ DBpedia SPARQL http://dbpedia.org/sparql
● Visualisation tool
○ Gephi https://gephi.org/
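The network described above can be sketched in Python before touching any tool. The sample data is hand-written for illustration and is not a DBpedia result; the "Source", "Target" and "Label" column names are an assumption about what Gephi's spreadsheet import expects for an edge table.

```python
import csv
import io
from itertools import combinations

# Hand-written sample: official languages for a few European countries.
official_languages = {
    "Belgium": {"Dutch", "French", "German"},
    "France": {"French"},
    "Luxembourg": {"French", "German", "Luxembourgish"},
    "Netherlands": {"Dutch"},
}


def shared_language_edges(languages_by_country):
    """One edge per pair of countries and per official language they share."""
    edges = []
    for a, b in combinations(sorted(languages_by_country), 2):
        shared = languages_by_country[a] & languages_by_country[b]
        for language in sorted(shared):
            edges.append((a, b, language))
    return edges


edges = shared_language_edges(official_languages)

# Write the edge list as CSV text (a file could be used instead of a buffer).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Source", "Target", "Label"])
writer.writerows(edges)
print(buffer.getvalue())
```

Countries are the nodes, and each row is one edge labelled with the shared language, so two countries that share two languages get two edges.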
15. SPARQL?
● Query language for Linked Open Data
● Describe part of the graph and use variables
dbpedia:Amsterdam --dbo:country--> ?Country
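To give a feel for what such a pattern does, here is a toy matcher over an in-memory list of triples. It is not how a SPARQL engine is implemented, only a sketch of the idea that variables (terms starting with "?") bind to whatever fits; the `match` function and the sample triples are assumptions of this example.

```python
# Hand-written sample triples, using the prefixed names from the slides.
triples = [
    ("dbpedia:Amsterdam", "dbo:country", "dbpedia:Netherlands"),
    ("dbpedia:Paris", "dbo:country", "dbpedia:France"),
]


def match(pattern, data):
    """Return one dict of variable bindings per triple that fits the pattern."""
    results = []
    for triple in data:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable binds to the value; a repeated variable must agree.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                # A fixed term must be identical to the value in the triple.
                break
        else:
            results.append(bindings)
    return results


print(match(("dbpedia:Amsterdam", "dbo:country", "?Country"), triples))
print(match(("?City", "dbo:country", "?Country"), triples))
```

The first pattern yields a single binding for `?Country`; the second, with two variables, yields one binding set per triple.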
[Figure: suggested book to read]
17. Making the network
● Get the query from
○ https://gist.github.com/cgueret/5098706
● Copy & paste it into
○ http://dbpedia.org/sparql
● Change the result format to "CSV"
● Press "Run Query" and save the result
● Open Gephi
● Start a new project
● Import the CSV file in the "Data Laboratory"
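The copy-and-paste steps above could also be scripted. The sketch below prepares a request to the endpoint and asks for CSV through the `Accept` header, following the standard SPARQL protocol. The query is a stand-in built from the earlier `dbo:country` example, not the query from the gist, and `build_request` and `fetch_csv` are hypothetical helper names.

```python
import urllib.parse
import urllib.request

ENDPOINT = "http://dbpedia.org/sparql"

# Stand-in query (assumption): replace it with the text of the gist.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbpedia: <http://dbpedia.org/resource/>
SELECT ?Country WHERE { dbpedia:Amsterdam dbo:country ?Country }
"""


def build_request(endpoint, query):
    """Prepare a GET request that asks the endpoint for CSV results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(url, headers={"Accept": "text/csv"})


def fetch_csv(request):
    """Send the request and return the response body as text (needs network)."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


request = build_request(ENDPOINT, QUERY)
print(request.full_url)
# csv_text = fetch_csv(request)  # uncomment to run the query against DBpedia
```

The returned text can then be saved to a file and imported in Gephi's "Data Laboratory" as in the manual steps.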
20. Last words
● Look for data sources published as Linked
Open Data (RDF); this can save you time
● Consider publishing your own data as Linked
Open Data
● There is much more to say...
○ Using SPARQL within R (very easily)
■ http://linkedscience.org/tools/sparql-package-for-r/
○ Reasoning capabilities of triple stores
○ Creating and extending vocabularies