Introduction to the semantic web. The first results in publishing library data on the semantic web at the National Széchényi Library (National Library of Hungary)
1. Semantic web: where are we now?
ADLUG Users Group Meeting
Bilbao, 16-18 September, 2009
Ádám Horváth
NSZL
2. Contents
Tutorial
Where are we now?
3. Linked data on the web
Use the RDF data model
Use RDF links
Web of Data or Semantic Web
Linked Data browsers
Linked Data provides a single, standardised access mechanism (compared to APIs)
4. Advantages of Linked Data
Linked Data provides a single, standardised access mechanism
– Easily crawlable
– Accessible using generic data browsers
– Enables links between data from different data sources
5. Web architecture: resources
We have to identify the items of interest in our domain
All items of interest are called resources
– Information resources (everything in the traditional web)
– Non-information resources (real-world objects, things)
6. Web architecture: resource identifiers
Uniform Resource Identifiers (URIs)
HTTP URIs
– Not URN, DOI and so on
Why HTTP URIs?
– A simple way of creating globally unique identifiers without centralised management
– Can also be used for accessing human-readable information (information on the Web)
7. Web architecture: representation
Information resources can have representations
Representations are streams of bytes
A resource can have many representations
8. Web architecture: dereferencing HTTP URIs
Dereferencing is the process of looking up a URI on the Web to get information about the referenced resource
Information resources
– HTTP 200 "OK" response code
Non-information resources
– HTTP 303 "See Other" response code
– The client dereferences the new URI and gets a representation describing the original non-information object
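The 303 flow above can be tried out locally. The following is a minimal sketch, assuming a toy server on localhost that stands in for a real Linked Data server; the handler class, the /resource/ and /data/ path layout, and the empty RDF body are illustrative assumptions.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Placeholder RDF/XML document (assumption: a real server would describe the thing here).
RDF_BODY = (b'<?xml version="1.0"?>'
            b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>')


class ToyLinkedDataHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: /resource/... names a thing, /data/... is its RDF document."""

    def do_GET(self):
        if self.path.startswith("/resource/"):
            # Non-information resource: answer 303 See Other and point at the document.
            self.send_response(303)
            self.send_header("Location", self.path.replace("/resource/", "/data/", 1))
            self.send_header("Content-Length", "0")
            self.end_headers()
        elif self.path.startswith("/data/"):
            # Information resource: answer 200 OK with a representation.
            self.send_response(200)
            self.send_header("Content-Type", "application/rdf+xml")
            self.send_header("Content-Length", str(len(RDF_BODY)))
            self.end_headers()
            self.wfile.write(RDF_BODY)
        else:
            self.send_error(404)

    def log_message(self, *args):
        pass  # keep the demo quiet


def dereference(uri):
    """Look up a URI, following redirects, and report what was finally served."""
    # An empty ProxyHandler keeps the request on localhost even if proxies are configured.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(uri, headers={"Accept": "application/rdf+xml"})
    with opener.open(request) as response:
        return response.status, response.geturl(), response.headers.get_content_type()


def demo():
    server = HTTPServer(("127.0.0.1", 0), ToyLinkedDataHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        base = f"http://127.0.0.1:{server.server_port}"
        return dereference(base + "/resource/manifestation/2645471")
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` returns the final status code, the URI the client ended up at after the redirect, and the media type of the representation it received.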
9. Web architecture: content negotiation
For non-information resources it is good practice to create an HTML representation describing the thing for human beings and an RDF representation for machines
The different representations can be reached via content negotiation
12. Web architecture: URI aliases
Two data providers can assign different URIs to the same resource
– http://dbpedia.org/resource/Berlin
– http://sws.geonames.org/2950159/
Information providers often set owl:sameAs links to the URI aliases they know about
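One way a client might use owl:sameAs statements is to group aliases into sets that name the same resource. This is a minimal sketch using a union-find structure; the function name `alias_groups` is hypothetical and the input is just a list of URI pairs, not output from any particular RDF library.

```python
def alias_groups(same_as_pairs):
    """Group URIs that are connected, directly or transitively, by owl:sameAs pairs."""
    parent = {}

    def find(uri):
        # Return the representative of the group that uri belongs to.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving keeps lookups short
            uri = parent[uri]
        return uri

    for left, right in same_as_pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    groups = {}
    for uri in list(parent):
        groups.setdefault(find(uri), set()).add(uri)
    return list(groups.values())


berlin = alias_groups([
    ("http://dbpedia.org/resource/Berlin", "http://sws.geonames.org/2950159/"),
])
```

Here `berlin` holds a single group containing both URIs from the slide; adding further pairs that mention either URI would extend the same group.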
13. Web architecture: associated descriptions
The RDF description of a non-information object that a client obtains by dereferencing the URI of the non-information object
14. The RDF data model
RDF (Resource Description Framework) is used to represent information about resources
A description of a resource is represented as a number of triples
Triples contain a subject, a predicate and an object
15. The RDF data model
Example
– The book (subject) has the title (predicate) Dekameron (object)
16. The RDF data model
Subject
– URI identifying the described resource
Object
– Literal value (string, number, date, etc.)
– URI of a resource that is related somehow to the subject
Predicate
– Indicates the relationship between the subject and the object
– A URI that comes from a vocabulary
17. The RDF data model
Two types of RDF triples
– Literal triples
• Used to describe the properties of resources
• For example, the title of a book
– RDF links
• Consist of three URIs
• Subject and object identify the interlinked resources
• Predicate states the relationship
18. The RDF data model
Example 1
– The book (subject) has the title (predicate) Dekameron (object)
Example 2
– Subject
• http://nektar.oszk.hu/resource/manifestation/2645471
– Predicate
• <dc:title>
– Object
• Dekameron
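Example 2 can be written down as data. The sketch below assumes small hand-rolled classes (`URIRef`, `Literal`, `Triple` are illustrative names, not a specific library's API) and expands dc:title to its full Dublin Core URI. The N-Triples output is simplified: it escapes only backslashes and double quotes.

```python
from dataclasses import dataclass
from typing import Union

DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace


@dataclass(frozen=True)
class URIRef:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


@dataclass(frozen=True)
class Triple:
    subject: URIRef
    predicate: URIRef
    object: Union[URIRef, Literal]


def is_rdf_link(triple):
    """An RDF link has a URI as its object; a literal triple has a literal."""
    return isinstance(triple.object, URIRef)


def to_ntriples(triple):
    """Serialise one triple as a line of N-Triples (simplified escaping)."""
    def term(node):
        if isinstance(node, URIRef):
            return f"<{node.value}>"
        escaped = node.value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'

    return f"{term(triple.subject)} {term(triple.predicate)} {term(triple.object)} ."


example = Triple(
    URIRef("http://nektar.oszk.hu/resource/manifestation/2645471"),
    URIRef(DC + "title"),
    Literal("Dekameron"),
)
```

`to_ntriples(example)` yields the subject and predicate in angle brackets followed by the quoted title, and `is_rdf_link(example)` is false because the object is a literal.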
23. The RDF data model
The most valuable RDF links are those that connect the resource to external data published by other data sources
24. The RDF data model
Benefits of using the RDF data model
– Clients can look up every URI in the RDF graph over the Web to retrieve additional information
– Information from different sources merges naturally
– RDF links can be set between data from different sources
– Information expressed in different schemas can be represented in a single model
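The claim that information from different sources merges naturally can be illustrated by treating a graph as a set of triples, so that merging is a set union. This is a sketch under that assumption; the second source and its example.org URI are made up for illustration.

```python
BOOK = "http://nektar.oszk.hu/resource/manifestation/2645471"
DC = "http://purl.org/dc/elements/1.1/"

# Triples as plain (subject, predicate, object) tuples.
library_data = {
    (BOOK, DC + "title", "Dekameron"),
}

# Hypothetical second source; the example.org URI is invented for this sketch.
other_source = {
    (BOOK, DC + "creator", "http://example.org/person/boccaccio"),
    (BOOK, DC + "title", "Dekameron"),  # a duplicate statement collapses in the union
}


def describe(graph, subject):
    """Return every (predicate, object) pair stated about the subject."""
    return {(p, o) for s, p, o in graph if s == subject}


merged = library_data | other_source
about_book = describe(merged, BOOK)
```

Because both sources use the same URI for the book, `about_book` contains the title and the creator without any schema mapping step.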
25. Choosing URIs
Good names
– Expressive
– Others can use them confidently
Technical infrastructure to make them dereferenceable
26. Choosing URIs
Good practice (Cool URIs)
– Use HTTP URIs
• Not URN, DOI
– Define URIs in a namespace under your control
27. Choosing URIs
Good practice (Cool URIs)
– Implementation-specific details should not appear in the URI
– Compare:
http://link.oszk.hu/libriurl.php?LN=hu&DB=OSZK&SRY=an&SRE=000002645471
http://nektar.oszk.hu/hu/manifestation/2645471
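To make the comparison concrete, the sketch below derives the clean URI from the script-style one. The mapping rule is an assumption inferred only from the two URLs on the slide (LN taken as the interface language, SRE as a zero-padded record number); it is not a documented NSZL rule, and `cool_uri` is a hypothetical helper.

```python
from urllib.parse import parse_qs, urlsplit


def cool_uri(legacy_url, base="http://nektar.oszk.hu"):
    """Build an implementation-neutral URI from a script-style URL (assumed mapping)."""
    query = parse_qs(urlsplit(legacy_url).query)
    language = query["LN"][0]                # assumption: LN is the interface language
    record_id = query["SRE"][0].lstrip("0")  # assumption: SRE is a zero-padded record number
    return f"{base}/{language}/manifestation/{record_id}"


legacy = "http://link.oszk.hu/libriurl.php?LN=hu&DB=OSZK&SRY=an&SRE=000002645471"
clean = cool_uri(legacy)
```

The script name, the database code and the search-type parameter all drop out of the result, which is the point of the guideline: they could change without the identified record changing.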
28. Choosing URIs
Good practice (Cool URIs)
– Try to keep URIs stable and persistent
29. Vocabularies
Well-known vocabularies
– Friend-of-a-Friend (FOAF), vocabulary for describing people.
– Dublin Core (DC) defines general metadata attributes. See also their new domains and ranges draft.
– Semantically-Interlinked Online Communities (SIOC), vocabulary for representing online communities.
– Description of a Project (DOAP), vocabulary for describing projects.
– Simple Knowledge Organization System (SKOS), vocabulary for representing taxonomies and loosely structured knowledge.
– Music Ontology provides terms for describing artists, albums and tracks.
– Review Vocabulary, vocabulary for representing reviews.
– Creative Commons (CC), vocabulary for describing license terms.
30. Serving information as Linked Data
Requirements
– Things must be identified with dereferenceable HTTP URIs
– The data source must return an RDF/XML description of the identified resource
– Links to external resources
– Links from external resources (ensure that there are external RDF links pointing at URIs from your dataset)
31. Serving information as Linked Data
How to create external links pointing to your data?
– In your FOAF file, point to the central resources of your dataset
• If one of your friends has a FOAF file and points to you, your dataset is now part of the Web of Data
– Convince the owners of related datasets to auto-generate links to your dataset
32. Serving information as Linked Data
Serving the dataset as static RDF files
– In which cases?
• RDF files are created manually
• RDF files are created by programs that produce file output
– How
• Give the files the .rdf filename extension
• In httpd.conf add this line:
  AddType application/rdf+xml .rdf
– Problem: 303 redirects do not work
33. Serving information as Linked Data
Serving relational databases
– D2R Server
• Provides a SPARQL endpoint to the relational database by means of a mapping file
34. Serving information as Linked Data
Wrappers around existing applications and Web APIs
– RDF Book Mashup
• The RDF Book Mashup assigns an HTTP URI to each book that has an ISBN number
• Whenever the HTTP URI is dereferenced, the RDF Book Mashup requests data from the Amazon API and the Google Base API
35. Serving information as Linked Data
RDF database
– Jena
• The SPARQL endpoint is Joseki
36. Testing and debugging
RDF browsers
– Tabulator
– Marbles
– OpenLink RDF Browser
– Disco
Testing dereferencing
– curl
W3C RDF Validation Service
37. Discovering linked data on the web
Ping the Semantic Web
– Registry service for RDF documents
HTML link auto-discovery
– Set links from existing web pages to RDF data
– HTML <link> element in the <head> of your HTML page
– <link rel="alternate" type="application/rdf+xml" href="link_to_the_RDF_version"/>
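From the consumer side, auto-discovery means scanning a page for such <link> elements. A minimal sketch using Python's standard `html.parser` follows; the class name, the helper function and the sample page are assumptions made for illustration.

```python
from html.parser import HTMLParser


class RDFLinkFinder(HTMLParser):
    """Collect href values of <link rel="alternate" type="application/rdf+xml">."""

    def __init__(self):
        super().__init__()
        self.rdf_links = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> are also routed here by HTMLParser.
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        media_type = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rels and media_type == "application/rdf+xml" and href:
            self.rdf_links.append(href)


def find_rdf_links(html):
    finder = RDFLinkFinder()
    finder.feed(html)
    return finder.rdf_links


# Sample page made up for this sketch.
SAMPLE_PAGE = """
<html><head>
  <title>Dekameron</title>
  <link rel="stylesheet" type="text/css" href="style.css"/>
  <link rel="alternate" type="application/rdf+xml"
        href="http://nektar.oszk.hu/data/manifestation/2645471"/>
</head><body></body></html>
"""

found = find_rdf_links(SAMPLE_PAGE)
```

Only the RDF alternate is reported in `found`; the stylesheet link is ignored because neither its rel nor its type matches.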
38. Discovering linked data on the web
Semantic Web Crawling: a Sitemap Extension
– Data publishers can state where RDF is located
– robots.txt
Dataset List on the ESW Wiki
39. Web of data search engines
Falcons, developed by IWS, China
Sindice, developed by DERI, Ireland
Watson, developed by KMi, UK
Semantic Web Search Engine (SWSE), developed by DERI, Ireland
Swoogle, developed by the ebiquity group at UMBC, USA
40. Semantic web: where are we now?
42. The model in detail
The thing URI
http://nektar.oszk.hu/resource/manifestation/2645471
– The 303 redirection code indicates that this URI identifies a thing
The RDF document URI
http://nektar.oszk.hu/data/manifestation/2645471
The Web (LibriVision) document URIs
http://nektar.oszk.hu/hu/manifestation/2645471
http://nektar.oszk.hu/en/manifestation/2645471
43. The model in detail
Content negotiation rules
– If application/rdf+xml is accepted, the RDF/XML is served from this address via content negotiation and a 303 redirect:
http://nektar.oszk.hu/data/manifestation/2645471
– If text/html is accepted
• Depending on the language of the browser, either the Hungarian or the English interface of LibriVision is served. The default is Hungarian (again via content negotiation):
http://nektar.oszk.hu/hu/manifestation/2645471
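The rules above can be sketched as a function that picks the redirect target from the request headers. This is an illustration, not the NSZL server's actual code: the function names are hypothetical, the Accept check is a simple substring test, and the language choice uses a reduced reading of Accept-Language q-values.

```python
BASE = "http://nektar.oszk.hu"


def preferred_language(accept_language, supported=("hu", "en"), default="hu"):
    """Pick the supported language with the highest q-value; fall back to the default."""
    best, best_q = default, 0.0
    for item in accept_language.split(","):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower().split("-")[0]  # "en-GB" counts as "en"
        q = 1.0
        for parameter in parts[1:]:
            parameter = parameter.strip()
            if parameter.startswith("q="):
                try:
                    q = float(parameter[2:])
                except ValueError:
                    q = 0.0
        if tag in supported and q > best_q:
            best, best_q = tag, q
    return best


def redirect_target(record_id, accept, accept_language=""):
    """Return the URI that the thing URI should redirect to with a 303."""
    if "application/rdf+xml" in accept:
        return f"{BASE}/data/manifestation/{record_id}"
    language = preferred_language(accept_language)
    return f"{BASE}/{language}/manifestation/{record_id}"
```

A request that accepts application/rdf+xml is sent to the /data/ document, a browser preferring English is sent to the /en/ page, and anything else falls back to the Hungarian page.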
44. The working model
45-53. The record in LibriVision
54. SKOS
SKOS is now a W3C Recommendation
– Last year it was only a Proposed Recommendation
NSZL is among the first implementers
55. Useful links
How to Publish Linked Data on the Web
– http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/
Cool URIs for the Semantic Web
– http://www.w3.org/TR/2008/NOTE-cooluris-20080331/
SKOS Simple Knowledge Organization System Reference
– http://www.w3.org/TR/skos-reference/
SKOS Simple Knowledge Organization System Primer
– http://www.w3.org/TR/skos-primer/
SKOS implementation report
– http://www.w3.org/2006/07/SWD/SKOS/reference/20090315/implementation.html