Linked data HHS 2015

Linked Data,
the Semantic Web, and You
CHIPS Hermon High School 2015

Linked Data
Structured Data on the Web

What is Linked Data
Linked Data refers to a set of best practices for publishing
and connecting structured data on the Web.
URIs - Uniform Resource Identifiers
HTTP - Hypertext Transfer Protocol
RDF - Resource Description Framework
Source: http://linkeddata.org/faq

URIs
A Uniform Resource Identifier (URI) is a string of characters used to identify a resource.
Such identification enables interaction with representations of the resource over a network,
typically the World Wide Web, using specific protocols.
A Uniform Resource Locator (URL), informally called a web address (although the two terms are
not defined identically), is a URI that includes a network location.
A Uniform Resource Name (URN) is a URI that names a resource within a given namespace but
gives no location.
Image Source: http://www.slideshare.net/rodsenra/rest-representational-state-transfer-emc-brdc-internal-tech-talk
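The URL/URN distinction above can be sketched in a few lines of Python using the standard library's urlparse. The URL is the staff page used later in these slides; the URN (an ISBN) is an illustrative value that does not come from the slides, and the describe helper is a hypothetical name.

```python
from urllib.parse import urlparse


def describe(uri: str) -> dict:
    """Split a URI into parts and report whether it carries a network location."""
    parts = urlparse(uri)
    return {
        "scheme": parts.scheme,
        "network_location": parts.netloc,
        "path": parts.path,
        "has_location": bool(parts.netloc),
    }


# A URL: names a resource and says where to find it.
url = describe("http://www.library.umaine.edu/staff/snow.htm")

# A URN (illustrative value): names a resource but gives no location.
urn = describe("urn:isbn:0451450523")

print(url)
print(urn)
```

Both strings are URIs; only the first includes a host that a client could contact.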

HTTP
The Hypertext Transfer Protocol (HTTP) is the underlying
protocol used by the World Wide Web. HTTP defines
how messages are formatted and transmitted, and what
actions Web servers and browsers should take in
response to various commands.
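To make "how messages are formatted" concrete, here is a minimal sketch that builds the text of an HTTP/1.1 GET request without sending it. The build_get_request helper is a hypothetical name, and whether a given server returns RDF for the Accept header shown is an assumption, not something the slides state.

```python
from urllib.parse import urlparse


def build_get_request(url: str, accept: str = "text/html") -> str:
    """Return the raw text of a simple HTTP/1.1 GET request for the URL."""
    parts = urlparse(url)
    path = parts.path or "/"
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, path, protocol version
        f"Host: {parts.netloc}",  # required header in HTTP/1.1
        f"Accept: {accept}",      # the representation the client would like
        "Connection: close",
        "",                       # a blank line ends the header section
        "",
    ]
    return "\r\n".join(lines)


# Ask for an RDF/XML representation of a vocabulary term (assumed media type).
request = build_get_request(
    "http://xmlns.com/foaf/0.1/knows", accept="application/rdf+xml"
)
print(request)
```

The Accept header is what lets the same URI serve HTML to a browser and RDF to a Linked Data client.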

RDF
The Resource Description Framework (RDF) data
model represents information as node-and-arc-labeled
directed graphs. The data model is designed for the
integrated representation of information that originates
from multiple sources, is heterogeneously structured,
and is represented using different schemata. RDF aims
at being employed as a lingua franca, capable of
moderating between other data models that are used on
the Web.
Source: http://linkeddatabook.com/editions/1.0/#htoc16

Graph Database
[Figure: RDF graph with three nodes (Cason Snow, Steven Roman, and Dragons in the Stacks) joined by one foaf:knows arc and two foaf:publications arcs.]

RDF - Triples
RDF uses triples to describe resources.
Triples consist of a Subject, Predicate, and Object.
Cason Snow knows Steven Roman.
http://www.library.umaine.edu/staff/snow.htm http://xmlns.com/foaf/0.1/knows https://dkpl.org/teens-tweens/
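As a minimal sketch, the triple above can be held in Python as a three-field tuple. The Triple class and its field names are illustrative choices, not part of any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# "Cason Snow knows Steven Roman", using the URIs from the slide.
statement = Triple(
    subject="http://www.library.umaine.edu/staff/snow.htm",
    predicate="http://xmlns.com/foaf/0.1/knows",
    obj="https://dkpl.org/teens-tweens/",
)

print(statement.subject)
print(statement.predicate)
print(statement.obj)
```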

Serialization
Remember RDF is not a data format. It is a data model for
describing resources.
To publish RDF data on the Web it must be serialized
through some kind of syntax.
RDF/XML, RDFa, Turtle, and JSON-LD are all standardized
by the W3C, and other serializations exist as well.
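The difference between the data model and a syntax can be sketched as follows: the same in-memory triples are written out in N-Triples, one of the simplest RDF syntaxes. The to_ntriples helper is hypothetical, and treating any string that starts with http:// or https:// as a URI is a simplifying assumption; a real serializer tracks the type of each term.

```python
def to_ntriples(triples):
    """Serialize (subject, predicate, object) string tuples as N-Triples lines."""
    lines = []
    for s, p, o in triples:
        if o.startswith(("http://", "https://")):
            obj = f"<{o}>"  # assumption: treat as a URI reference
        else:
            # Otherwise write a plain literal, escaping backslashes and quotes.
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


SNOW = "http://www.library.umaine.edu/staff/snow.htm"
FOAF = "http://xmlns.com/foaf/0.1/"

triples = [
    (SNOW, FOAF + "knows", "https://dkpl.org/teens-tweens/"),
    (SNOW, FOAF + "name", "Cason Snow"),
]

output = to_ntriples(triples)
print(output)
```

The list of tuples is the model; the printed text is just one of several possible ways to write it down.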

RDF/XML example
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Cason Snow</foaf:name>
    <foaf:workplaceHomepage rdf:resource="http://www.library.umaine.edu/staff/snow.htm"/>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Steven Roman</foaf:name>
        <foaf:workplaceHomepage rdf:resource="https://dkpl.org/teens-tweens/"/>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>
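As a rough illustration of how a program might read this serialization, the sketch below parses a well-formed copy of the example with Python's standard XML parser and pulls out the names and workplace homepages. This is plain XML processing, not a full RDF/XML parser; the XML declaration is omitted from the embedded string for simplicity.

```python
import xml.etree.ElementTree as ET

FOAF = "http://xmlns.com/foaf/0.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# A well-formed copy of the slide's example.
RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Cason Snow</foaf:name>
    <foaf:workplaceHomepage rdf:resource="http://www.library.umaine.edu/staff/snow.htm"/>
    <foaf:knows>
      <foaf:Person>
        <foaf:name>Steven Roman</foaf:name>
        <foaf:workplaceHomepage rdf:resource="https://dkpl.org/teens-tweens/"/>
      </foaf:Person>
    </foaf:knows>
  </foaf:Person>
</rdf:RDF>"""

root = ET.fromstring(RDF_XML)

# ElementTree addresses namespaced elements and attributes as "{namespace}local".
names = [el.text for el in root.iter(f"{{{FOAF}}}name")]
homepages = [
    el.get(f"{{{RDF}}}resource")
    for el in root.iter(f"{{{FOAF}}}workplaceHomepage")
]

print(names)
print(homepages)
```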

Turtle (Terse RDF Triple Language) example
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
[] a foaf:Person ;
    foaf:name "Cason Snow" ;
    foaf:workplaceHomepage <http://www.library.umaine.edu/staff/snow.htm> ;
    foaf:knows [
        a foaf:Person ;
        foaf:name "Steven Roman"
    ] .

From HTML to Linked Data

<div>

The Lord of the Rings is an English-language fictional trilogy by J. R. R. Tolkien (1892-1973).


The books in the trilogy are:

<ul>
<li>Vol. 1: The Fellowship of the Ring</li>
<li>Vol. 2: The Two Towers</li>
<li>Vol. 3: The Return of the King</li>
</ul>
</div>

From HTML to Linked Data

<div vocab="http://schema.org/">

<link property="about" href="http://id.worldcat.org/fast/1020337">
The Lord of the Rings is an
English-language
fictional trilogy by

<link property="sameAs" href="http://viaf.org/viaf/95218067">
J. R. R. Tolkien
(1892-1973).

<link property="hasPart" href="#book1">


The books in the trilogy are:

<ul>
<li typeof="Book PublicationVolume" resource="#book1">
Vol. 1:
<link property="about" href="http://id.worldcat.org/fast/1020337">
<link property="isPartOf" href="#trilogy">
<link property="author" href="#author">
<meta property="inLanguage" content="en">
The Fellowship of the Ring
</li>

OSDS
The OpenLink Structured Data Sniffer (OSDS) is a browser extension for
Google Chrome, Mozilla Firefox, and Opera (with builds planned for Apple
Safari and Microsoft Edge) that reveals metadata-oriented structured data
embedded within HTML documents.
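The idea of surfacing structured data embedded in HTML can be sketched with Python's built-in HTML parser. This is not how OSDS itself is implemented and it is far from a full RDFa processor: it only collects each property attribute together with its href or content value. The PropertyCollector class is a hypothetical name, and the snippet is taken from the marked-up example earlier in these slides.

```python
from html.parser import HTMLParser


class PropertyCollector(HTMLParser):
    """Collect (property, value) pairs from elements that carry a property attribute."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        prop = attributes.get("property")
        if prop is not None:
            # Links carry their value in href, meta elements in content.
            value = attributes.get("href") or attributes.get("content")
            self.found.append((prop, value))


SNIPPET = """
<li typeof="Book PublicationVolume" resource="#book1">
  <link property="about" href="http://id.worldcat.org/fast/1020337">
  <link property="isPartOf" href="#trilogy">
  <link property="author" href="#author">
  <meta property="inLanguage" content="en">
  The Fellowship of the Ring
</li>
"""

collector = PropertyCollector()
collector.feed(SNIPPET)
collector.close()

for prop, value in collector.found:
    print(prop, "->", value)
```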

Benefits of RDF in Linked Data
1. By using HTTP URIs as globally unique identifiers for data items as well as for vocabulary terms, the RDF data model is
inherently designed for being used at global scale and enables anybody to refer to anything.
2. Clients can look up any URI in an RDF graph over the Web to retrieve additional information. Thus each RDF triple is part
of the global Web of Data and each RDF triple can be used as a starting point to explore this data space.
3. The data model enables you to set RDF links between data from different sources.
4. Information from different sources can easily be combined by merging the two sets of triples into a single graph.
5. RDF allows you to represent information that is expressed using different schemata in a single graph, meaning that you
can mix terms from different vocabularies to represent data.
6. Combined with schema languages such as RDF-Schema and OWL, the data model allows the use of as much or as little
structure as desired, meaning that tightly structured data as well as semi-structured data can be represented.
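Benefit 4 can be sketched with ordinary Python sets: if each source is a set of triples, merging is a set union, and a statement that appears in both sources is kept only once. The two source sets below are illustrative. The sketch uses no blank nodes, which a real merge would need to keep distinct between sources.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
SNOW = "http://www.library.umaine.edu/staff/snow.htm"
ROMAN = "https://dkpl.org/teens-tweens/"

# Triples published by one (illustrative) source.
library_data = {
    (SNOW, FOAF + "name", "Cason Snow"),
    (SNOW, FOAF + "knows", ROMAN),
}

# Triples published by a second (illustrative) source.
other_data = {
    (ROMAN, FOAF + "name", "Steven Roman"),
    (SNOW, FOAF + "name", "Cason Snow"),  # also stated by the first source
}

# Merging two graphs is a set union; the shared statement is not duplicated.
merged = library_data | other_data

for triple in sorted(merged):
    print(triple)
```

Because both sources use the same URIs for the same things, no record matching or schema mapping is needed before combining them.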

Why Linked Data
Linked Data provides a more generic, more flexible publishing paradigm which makes it easier for
data consumers to discover and integrate data from large numbers of data sources. In particular,
Linked Data provides:
• A unifying data model
• A standardized data access mechanism
• Hyperlink-based data discovery
• Self-descriptive data

Semantic Web
Bringing Data to the Web

What is the Semantic Web
The ultimate goal of the Web of data is to enable computers to do more useful
work and to develop systems that can support trusted interactions over the
network.
Semantic Web technologies enable people to create data stores on the Web,
build vocabularies, and write rules for handling data.
Linked data are empowered by technologies such as RDF, SPARQL, OWL,
and SKOS.
Source: http://www.w3.org/standards/semanticweb/

Linked Data
The Semantic Web is a Web of data — of dates and titles and part numbers
and chemical properties and any other data one might conceive of. RDF
provides the foundation for publishing and linking your data.

Vocabularies
At times it may be important or valuable to organize data. Using OWL (to build
vocabularies, or “ontologies”) and SKOS (for designing knowledge
organization systems) it is possible to enrich data with additional meaning,
which allows more people (and more machines) to do more with the data.
Vocabularies are lists of terms, given without context.
Ontologies provide the relationships and grammar for expressing relations
between vocabulary terms.

Ontology Examples
Dublin Core - general metadata
Schema.org - general metadata
Friend of a Friend (FOAF) - people related terms
Open Geospatial Consortium - mapping data
vCard - describing people and organizations

Query
Query languages go hand-in-hand with databases. If the Semantic Web is
viewed as a global database, then it is easy to understand why one would
need a query language for that data. SPARQL is the query language for the
Semantic Web.
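The core idea behind a SPARQL query, matching a triple pattern with variables against a graph, can be sketched in plain Python. This is not a SPARQL engine. The match helper is a hypothetical name, None stands in for a query variable, and the SPARQL text in the comment is shown only for comparison.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
SNOW = "http://www.library.umaine.edu/staff/snow.htm"
ROMAN = "https://dkpl.org/teens-tweens/"

graph = [
    (SNOW, FOAF + "name", "Cason Snow"),
    (SNOW, FOAF + "knows", ROMAN),
    (ROMAN, FOAF + "name", "Steven Roman"),
]


def match(graph, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a variable."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Roughly comparable to the SPARQL query:
#   PREFIX foaf: <http://xmlns.com/foaf/0.1/>
#   SELECT ?who WHERE {
#     <http://www.library.umaine.edu/staff/snow.htm> foaf:knows ?who
#   }
known = [t[2] for t in match(graph, s=SNOW, p=FOAF + "knows")]
print(known)
```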

Inference
Near the top of the Semantic Web stack one finds inference — reasoning
over data through rules. W3C work on rules, primarily through Rule
Interchange Format (RIF) and Web Ontology Language (OWL), is focused
on translating between rule languages and exchanging rules among
different systems.
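Reasoning over data through rules can be illustrated with one of the simplest rules, taken from RDF Schema rather than from RIF or OWL: if a resource has type C and C is a subclass of D, then the resource also has type D. The sketch below applies that rule until nothing new can be derived. The infer_types helper and the sample graph are illustrative.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
FOAF = "http://xmlns.com/foaf/0.1/"
SNOW = "http://www.library.umaine.edu/staff/snow.htm"


def infer_types(graph):
    """Apply the subclass rule repeatedly until no new triples appear."""
    inferred = set(graph)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(inferred):
            if p != RDF_TYPE:
                continue
            for c, q, d in list(inferred):
                if q == SUBCLASS_OF and c == o:
                    new_triple = (s, RDF_TYPE, d)
                    if new_triple not in inferred:
                        inferred.add(new_triple)
                        changed = True
    return inferred


# Illustrative data: one typed resource and one subclass statement.
graph = {
    (SNOW, RDF_TYPE, FOAF + "Person"),
    (FOAF + "Person", SUBCLASS_OF, FOAF + "Agent"),
}

result = infer_types(graph)
new_facts = result - graph
print(new_facts)
```

The derived triple was never stated by anyone; it follows from the data plus the rule.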

And You
BIBFRAME and Linked Data Library Projects

BIBFRAME
BIBFRAME (Bibliographic Framework Initiative) is a project by the Library of
Congress to provide a foundation of bibliographic description for integration
in the Web of Data.
● Differentiate clearly between conceptual content and its physical/digital manifestation(s)
● Unambiguously identify information entities (e.g., authorities)
● Leverage and expose relationships between and among entities
Uses RDF serialization.

BIBFRAME
The BIBFRAME Model consists of the following main classes:
Creative Work - a resource reflecting a conceptual essence of the cataloging
item.
Instance - a resource reflecting an individual, material embodiment of the
Work.
Authority - a resource reflecting key authority concepts that have defined
relationships reflected in the Work and Instance. Examples of Authority
Resources include People, Places, Topics, Organizations, etc.
Annotation - a resource that decorates other BIBFRAME resources with
additional information. Examples of such annotations include Library
Holdings information, cover art and reviews.

Creative Work
The Work exists as a Web based control point that reflects both commonality
of content between and among the various Instances associated with the
Work as well as a reference point for other Works. Common properties of
Works include contextual relationships to BIBFRAME Authorities related to
the “subjectness” (Topic, Person, Place, Geographical, etc.) of the resource
as well as the entities (Person, Organization, Meeting, etc.) associated with
its creation. Works can relate to other Works reflecting, for example, part /
whole relationships.

Instance
BIBFRAME Instances reflect an individual, material embodiment of a
BIBFRAME Work that can be physical or digital in nature. A BIBFRAME
Instance exists as a Web based control point that includes properties
specific to the materialization as well as contextual relationships to
appropriate BIBFRAME Authorities related to the publication, production,
distribution of the material resource. Each BIBFRAME Instance is an
instance of one and only one BIBFRAME Work.
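The Work and Instance relationship can be sketched as a small data structure in which every Instance must name exactly one Work. The class and field names below are illustrative Python names, not BIBFRAME vocabulary terms, and the two instance descriptions are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Work:
    """The conceptual content, independent of any physical or digital form."""
    title: str
    instances: list = field(default_factory=list)


@dataclass(eq=False)
class Instance:
    """One material embodiment; it belongs to exactly one Work."""
    description: str
    instance_of: Work

    def __post_init__(self):
        # Register this embodiment with its single Work.
        self.instance_of.instances.append(self)


work = Work(title="The Fellowship of the Ring")
paperback = Instance(description="print paperback", instance_of=work)
ebook = Instance(description="e-book", instance_of=work)

print(work.title, "has", len(work.instances), "instances")
for instance in work.instances:
    print("-", instance.description)
```

Requiring instance_of in the constructor is one way to model "one and only one Work": an Instance cannot be created without it, and it holds a single value.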

Authorities
BIBFRAME Authorities are key authority concepts that are the target of
defined relationships reflected in the Work and Instance. Examples of
BIBFRAME Authority Resources include People, Places, Topics,
Organizations, etc. From a cataloging perspective Authorities provide a
means for supporting disambiguation and synchronization around
authoritative information. From a user's perspective, BIBFRAME Authorities
provide effective and efficient control points that can be used to help
navigate and contextualize related BIBFRAME Works and Instances.
BIBFRAME Authorities are not designed to compete with or replace existing
authority efforts but rather provide a common, lightweight abstraction layer
over various different Web based authority efforts to make them even more
effective.

Annotation
Libraries generate, maintain and improve an enormous amount of high-quality
data that is valuable well beyond traditional library boundaries. The
Bibliographic Framework Initiative recognizes this by including as a goal the
ability to “accommodate and distinguish expert-, automated-, and self-
generated metadata, including annotations (reviews, comments) and usage
data.” Rather than pre-define and limit our potential uses of this data, the
BIBFRAME model provides the necessary scaffolding to allow this data to
easily be annotated by libraries as well as third party users of this
information.

BIBFRAME and MARC
BIBFRAME is not yet ready for production use, so libraries will work in a
mixed MARC and BIBFRAME environment for the near future.
A tool for converting MARC to BIBFRAME is currently available. It changes
fairly often as BIBFRAME develops.
BIBFRAME is currently designed to be agnostic about cataloging rules, but RDA
is a major component.

FAST
Faceted Application of Subject Terminology (FAST) is an enumerative,
faceted subject heading schema derived from the Library of Congress
Subject Headings (LCSH). The purpose of adapting the LCSH with a
simplified syntax to create FAST is to retain the very rich vocabulary of
LCSH while making the schema easier to understand, control, apply, and
use. The schema maintains upward compatibility with LCSH, and any valid
set of LC subject headings can be converted to FAST headings.
Source: http://0-experimental.worldcat.org.library.metmuseum.org/fast/

FAST Linked Data
Linked Data is one of the underpinnings of the Semantic Web, the effort to
make the meaning of information on the Web more understandable to
computers. These Linked Data authorities are formatted using schema.org and
SKOS (Simple Knowledge Organization System).
The FAST Authority file contains links to LCSH Authorities as well as other
authoritative sources such as VIAF, GeoNames, and Wikipedia. We will
continue to add other links where possible.
Source: http://0-experimental.worldcat.org.library.metmuseum.org/fast/

European Library
The European Library is an independent not-for-profit library services
organisation supported by CENL, LIBER and CERL. The European Library
importantly works to strengthen and support libraries across the continent.
Member libraries benefit from a powerful, low-cost aggregation structure
enabling a greater exposure of digital resources and bibliographic records. We
collect, enrich and innovate with libraries' data and content for the widest
possible dissemination.
The European Library's mission is to be THE open data hub for library data in
Europe.
Additionally, The European Library partakes in projects to create useful tools
and a pan-European infrastructure for librarians and researchers.

British National Bibliography
The BNB Linked Data Platform provides access to the British National
Bibliography published as linked open data and made available through
SPARQL services. Two different interfaces are provided: a SPARQL
editor and a /sparql service endpoint for remote queries. Alternatively, the
site's search box accepts plain-text terms.
The Linked Open BNB is a subset of the full British National Bibliography. It
currently includes published books (including monographs published over
time) and serial publications, representing approximately 2.8 million records.

Deutsche Nationalbibliothek
The German National Library is building a linked data service that in the long run will permit the
semantic web community to use the entire stock of national bibliographic data, including all authority
data. It is endeavouring to make a contribution to the global information infrastructure with this new
data service and thus laying the foundations for modern commercial and non-commercial web
services.
The German National Library is committed to making a significant contribution to ensuring the stability
and reliability of the "linked data cloud" by providing data of high quality, most of which has been
intellectually generated. The German National Library has been supplying its data in the RDF standard
via the Linked Data Service since 2010. By offering RDF as an equal status export format, it allows
users and user groups to re-use its data in a way which requires no knowledge of bibliographic
formats.

Linked Data for Libraries
LD4L
The goal of the project is to create a Scholarly Resource Semantic
Information Store (SRSIS) model that works both within individual
institutions and through a coordinated, extensible network of Linked Open
Data to capture the intellectual value that librarians and other domain
experts and scholars add to information resources when they describe,
annotate, organize, select, and use those resources, together with the
social value evident from patterns of usage.

Further resources
Reading
BIBFRAME
Primer
XML
RDF
Education
W3Schools XML Tutorial
Structured data with schema.org codelab
Semantic Web Primer

Contact information
Cason Snow
Metadata Librarian/Cataloger
University of Maine
207-581-1670
