SlideShare una empresa de Scribd logo
1 de 72
ResourceSync for Semantic Web
Data Copying and Synchronization
Simeon Warner (Cornell University)
http://orcid.org/0000-0002-7970-7855
SWIB13, Hamburg, Germany
2013-11-27
Menu
1. A personal spin
2. ResourceSync
a. ResourceSync: Problem Perspective &
Conceptual Approach
b. Motivation & Use Cases
c. Framework Walkthrough
d. Framework Technical Details
e. Implementation

3. ResourceSync and the Semantic Web
Typical morning, summer 1996
Linked world – but no data

1. Names for
articles,
people
2. HTTP to get
data
3. (no machine
data)
4. Have links to
other things
Code for RDF/XML and Turtle
support contributed to ORCID
by Stian Soiland-Reyes
Discovery at Cornell
 linked data
ResourceSync:
A Web-Based
Resource Synchronization
Framework

These following slides are excerpted from the ResourceSync tutorial.
The most recent version of the full tutorial slides is available at
http://www.slideshare.net/OpenArchivesInitiative/resourcesync-tutorial

#resourcesync

ResourceSync is funded by
The Sloan Foundation & JISC
ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

14
OAI
Herbert Van de Sompel
Martin Klein
Robert Sanderson
(Los Alamos National Laboratory)
Simeon Warner
(Cornell University)

NISO
Todd Carpenter
Nettie Lagace
University of Oxford
Graham Klyne

Bernhard Haslhofer
(University of Vienna)
Michael L. Nelson
(Old Dominion University)

Lyrasis
Peter Murray

Carl Lagoze
(University of Michigan)

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

15
ResourceSync Technical Group
LOCKSS
Ex Libris Inc.
Shlomo Sanders

David Rosenthal

JISC
Paul Walk
Richard Jones
Stuart Lewis

RedHat
OCLC
Christian Sadilek

Library of Congress

Jeff Young

Kevin Ford

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

16
Timeline, Status of Specification(s)
• August 2013
o

o

Release of ResourceSync framework Core specification
- Version 0.9.1
Public draft of ResourceSync Archives specification released

• September 2013
o

Core specification on its way to become an ANSI standard

• November 2013
o

Internal draft of ResourceSync Notification specification

• January 2014
o

Public draft of ResourceSync Notification specification

• Mid 2014
o

Core specification becomes ANSI/NISO standard

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

17
Menu
1. A personal spin
2. ResourceSync
a. ResourceSync: Problem Perspective &
Conceptual Approach
b. Motivation & Use Cases
c. Framework Walkthrough
d. Framework Technical Details
e. Implementation

3. ResourceSync and the Semantic Web
Synchronize What?
• Web resources
o things with a URI that can be dereferenced
• Focus on needs of research communication and cultural heritage
organizations but aim for generality
• Small websites/repositories (a few resources) to large
repositories/datasets/linked data collections (many millions of
resources)
• Low change frequency (weeks/months) to high change
frequency (seconds)
• Synchronization latency and accuracy needs may vary

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

19
ResourceSync Problem
• Consider:
• Source (server) A has resources that change over time: they
get created, modified, deleted
• Destination (servers) X, Y, and Z leverage (some)
resources of Source A.
• Problem:
• Destinations want to keep in step with the resource changes
at Source A
• Goal:
• Design an approach for resource synchronization aligned
with the Web Architecture that has a fair chance of adoption
by different communities.
• The approach must scale better than recurrent HTTP
HEAD/GET on resources.

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

20
Destination: Synchronization Needs
1. Baseline synchronization – A destination must be able to
perform an initial load or catch-up with a source
- avoid out-of-band setup
2. Incremental synchronization – A destination must have some
way to keep up-to-date with changes at a source
- subject to some latency; minimal: create/update/delete
- allow to catch-up after destination has been offline
3. Audit – A destination should be able to determine whether it is
synchronized with a source
- regarding coverage and accuracy

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

21
Didn’t you sell us OAI-PMH?
Or... will ResourceSync replace OAI-PMH?
 Proven XML metadata transfer protocol
 Libraries in a number of programming languages
 Widely adopted in our community
X
X
X

Predates REST, not “of the web”
Not adopted for content transfer
Technical issues with sets

• Devise a shared solution for data, metadata, linked data?
ResourceSync may replace, will likely coexistence

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

22
Menu
1. A personal spin
2. ResourceSync
a. ResourceSync: Problem Perspective &
Conceptual Approach
b. Motivation & Use Cases
c. Framework Walkthrough
d. Framework Technical Details
e. Implementation

3. ResourceSync and the Semantic Web
Use Cases – The Basics

a)

b)

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

24
Use Cases – The Basics
c)

d)
ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

25
Use Cases – The not-so-Basics

e)

f)
ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

26
Use Case 1: arXiv Mirroring and Data Sharing
• Repository of scholarly articles in physics,
mathematics, computer science, etc.
• > 880k articles, ~1.5 revisions per article
• ~75k new articles per year
• metadata, source, PDF
• ~3.8M resources
• ~2700 updates/day
• Support
o
o

Mirroring
Sharing

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

27
Use Case 2: DBpedia Live Duplication
• Average of 2 updates per second
• Low latency desirable => need for a push technology

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

28
Use Case 2: DBpedia Live Duplication
• Daily traffic:
o 99% updates
o 0.6% deletions
o 0.03% creations

• LANL
experiments with
push-based sync

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

29
Menu
1. A personal spin
2. ResourceSync
a. ResourceSync: Problem Perspective &
Conceptual Approach
b. Motivation & Use Cases
c. Framework Walkthrough
d. Framework Technical Details
e. Implementation

3. ResourceSync and the Semantic Web
Source: Core Synchronization Capabilities

P
U
L
L

1. Describing content – publish a list of resources available for
synchronization to enable Destinations to perform an initial load
or catch-up with a Source
2. Packaging content – bundle resources to enable bulk download
by destinations
3. Describing changes – publish a list of resource changes to
enable destinations to stay synchronized and decrease latency
4. Packaging changes – bundle resource changes for bulk
download by destinations

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

31
Source Capability 1: Describing Content
In order to advertise the resources that a source wants destinations
to know about, it may describe them:
o

o

Publish a Resource List, a list of resource URIs and possibly
associated metadata
- Destination GETs the Resource List
- Destination GETs listed resources by their URI
A Resource List describes the state of a set of resources at
one point in time (snapshot)

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

32
33
34
Source Capability 2: Packaging Content
By default, content is transferred in response to a GET issued by a
destination against a URI of a source’s resource. But a source may
support additional mechanisms:
o

o

Publish a Resource Dump, a document that points to
packages of resource representations and necessary
metadata
- Destination GETs the package
- Destination unpacks the package
- ZIP format supported
A Resource Dump and the packages it points to reflect the
state of a set of resources at one point in time (snapshot)

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

35
36
Source:
Modular Capabilities

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

37
Source Capability 3: Describing Changes
In order to achieve lower latency and/or greater efficiency, a source
may communicate about changes to its resources:
o

o

Publish a Change List, a list of recent change events
(created, updated, deleted resource)
- Destination acts upon change events, e.g. GETs
created/updated resources, removes deleted resources.
A Change List pertains to resources that changed in a
temporal interval with a start- and an end-date
- If a resource changed more than once, it will be listed
more than once

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

38
39
Destination: Key Processes

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

40
Menu
1. A personal spin
2. ResourceSync
a. ResourceSync: Problem Perspective &
Conceptual Approach
b. Motivation & Use Cases
c. Framework Walkthrough
d. Framework Technical Details
e. Implementation

3. ResourceSync and the Semantic Web
Many technology options
Push

DSNotify
OAI-PMH
rsync

Crawl

Pull
OAI-ORE

RDFsync

WebDAV Col. Syn.

XMPP
Atom

SWORD
Sitemap

SPARQLpush

SDShare

AtomPub

RSS
PubSubHubbub

XMPP
ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

42
Many technology options
Push

DSNotify
OAI-PMH
rsync

Crawl

Pull
OAI-ORE

RDFsync

WebDAV Col. Syn.

XMPP
Atom

SWORD
Sitemap

SPARQLpush

SDShare

AtomPub

RSS
PubSubHubbub

XMPP
ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

43
• Modular framework allowing
selective deployment
• Sitemap is the core format
throughout the framework
o

o

o

Introduce extension elements and
attributes:
- In ResourceSync namespace (rs:) to
accommodate synchronization needs
Reuse Sitemap format for all capability
documents
Utilize Sitemap index format where
ResourceSync
needed/allowed
SWIB13, Hamburg, Germany, 2013-11-27

44
Sitemap Format

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9”>
<url>
<loc>http://example.com/res1</loc>
<lastmod>2013-01-02T13:00:00Z</lastmod>
</url>

<url>
<loc>http://example.com/res2</loc>
<lastmod>2013-01-02T14:00:00Z</lastmod>
</url>
…
Use <sitemapindex>
</urlset>

if >50k items

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

45
ResourceSync Sitemap Extensions
<urlset xmlns=http://www.sitemaps.org/schemas/sitemap/0.9
xmlns:rs="http://www.openarchives.org/rs/terms/”>
<rs:ln …/>
<rs:md …/>
<url>
<loc>http://example.com/res1</loc>
<lastmod>2013-01-02T13:00:00Z</lastmod>
<rs:ln …/>
<rs:md …/>
</url>
<url>
…
</url>
Same extensions in
</urlset>

<sitemapindex>

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

46
Resource Metadata Summary
Element/Attribute
<loc>
<lastmod>

Description
Resource URI (identity)
Timestamp of last change

Defined by
sitemaps
sitemaps

<changefreq>

Expected update frequency

sitemaps

<rs:md>
change
encoding

hash
length
path
type

ResourceSync
Change type (Change List & Change
Dump Manifest only)

ResourceSync

HTTP Content-Encoding header value

RFC2616

One or more content digests (md5, sha-1, Atom Link Ext.
sha-256)

HTTP Content-Length header value

RFC4287

Path in ZIP package (Dump Manifests
only)
HTTP Content-Type header value

ResourceSync

RFC4287

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27
Link Relation Summary
Relation

Use in ResourceSync

Defined in

rel="alternate"

Link from generic to specific URI

HTML 5

rel="canonical"

Link from specific to generic URI

RFC6596

rel="collection"

Resource is member of collection

RFC6573

rel="contents"

Link from dump to manifest

rel="describedby"

Has metadata

HTML4
Protocol for Web Description Resources
(POWDER): Description Resources

rel="describes"

Is metadata for

The 'describes' Link Relation Type

rel="duplicate"

RFC6249

rel=".../rs/terms/patch"

Mirror or alternative copy
A patch -- efficient change
information

rel="memento"

Link to time-specific URI

Memento Internet Draft

rel="timegate"

Link to timegate

Memento Internet Draft

rel="via"

Provenance chain, came from

RFC4287

This specification

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27
ResourceSync Sitemap Validation
• All ResourceSync capability documents are valid according to
the Sitemap XML Schema
o

http://www.sitemaps.org/schemas/sitemap/0.9

• For a more thorough validation use the ResourceSync XML
Schema
o

http://www.openarchives.org/rs/0.9.1/resourcesync.xsd

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27
Example document: Resource List
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability="resourcelist"
at="2013-01-03T09:00:00Z”
completed="2013-01-03T09:01:00Z” />
<url>
<loc>http://example.com/res1</loc>
<lastmod>2013-01-02T13:00:00Z</lastmod>
<rs:md hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6"
length="8876"
type="text/html"/>
</url>
• Describe Source’s resources subject to synchronization
<url>
• At one point in time (snapshot)
…
• Creation can take some time – duration can be
</url>
conveyed
</urlset>

• HTTP GET resources
ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

50
Framework Structure
(without possible index documents)

http://www.openarchives.org/rs/resourcesync#Structure
ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

51
Supported Linking Use Cases
Provide links to related resources to address specific needs:
1. Mirrored content with multiple download locations
2. Alternate representations of the same content
• Resources subject to HTTP content negotiation
• Format migration for preservation reasons
3. Patching content rather than replacing it
4. Resources and metadata about resources
5. Prior versions of resources
6. Collection membership of resources
7. Republishing synchronized resources
All cases use <rs:ln> element referring to the linked resource
ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

52
Linking – Alternate Representations – Case 1
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability=”changelist"
from="2013-01-02T09:00:00Z”
until="2013-01-03T09:00:00Z”/>
<url>
<loc>http://example.com/res1</loc>
<lastmod>2013-01-02T13:00:00Z</lastmod>
<rs:md change=”updated”/>
<rs:ln rel="alternate"
type="text/html"
href="http://example.com/res1.html"/>
<rs:ln rel="alternate"
type=“application/pdf"
href=”http://example.com/res1.pdf"/>
</url>
</urlset>
Canonical URI links to specific URIs

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

53
Linking – Alternate Representations – Case 2
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:rs="http://www.openarchives.org/rs/terms/">
<rs:md capability=”changelist"
from="2013-01-02T09:00:00Z”
until="2013-01-03T09:00:00Z”/>
<url>
<loc>http://example.com/res1.html</loc>
<lastmod>2013-01-02T13:00:00Z</lastmod>
<rs:md change=”updated”/>
<rs:ln rel=”canonical”
href="http://example.com/res1"/>
</url>
</urlset>

Specific URI links to canonical URI

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

54
Source: Notification Capabilities
Motivations: reduce synchronization latency, avoid polling
•

P
U
•
S
H

1. Change Notification
• Notifies destination about changes to particular resources
• e.g., resource A has been updated | created | deleted
2. Framework Notification
• Notifies destination about changes to capabilities i.e., their
documents
• e.g., a Change List has been updated | created | deleted
• Also for Capability Lists and Source Description
Investigating Pubsubhubbub as transport first, may look at
WebSockets later

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

55
Polling sucks
Polling sucks.
A
R
C
H
I
V
E
S

Source: Archival Capabilities
The Source may hold on to historical data, for example, to allow
Destinations to catch up with events they missed or revisit prior
resource states. To this end, the Source can publish archives, i.e.
documents that enumerate historical capability documents
1.
2.
3.
4.

Resource List Archive
Resource Dump Archive
Change List Archive
Change Dump Archive

Re-use same document formats to list archived sets of
corresponding documents, discovery entries tie together

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

57
Menu
1. A personal spin
2. ResourceSync
a. ResourceSync: Problem Perspective &
Conceptual Approach
b. Motivation & Use Cases
c. Framework Walkthrough
d. Framework Technical Details
e. Implementation

3. ResourceSync and the Semantic Web
DSpace support for
metadata harvesting use case

DSpace Module:
https://github.com/CottageLabs/DSpaceResourceSync
PHP client:
https://github.com/stuartlewis/resync-php
http://mydspace.edu/dspace-rs/resource/123456789/7/qdc
ResourceSync webapp

Item handle

Metadata Format

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

59
ResourceSync @ arXiv
• Use ResourceSync for both
mirroring and public data access
o efficient updates
o ability to do periodic audits
o public synchronization capability
o reduce admin burden
• Start with metadata + source for
mirroring use case (doing
experiments now)
• Open Access use cases require
processed PDF also
ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

60
Python Library and Client
• Aim to provide library code implementing all ResourceSync
facilities for use in both source and destination implementations
o
o

Designed for python 2.6 (RHEL6) and 2.7
Will not work with python <= 2.5

• Client (resync) supports many destination operations, inspired
by the common Unix rsync program
• Client also supports some operations that might be useful in a
source, such as generation of static Resource Lists, or periodic
Change Lists (used in arXiv experiments)
• Explorer (resync-explorer) intended to allow easy inspection
of a source’s resource sets and capabilities
• Developed since ResourceSync v0.5, updated for v0.9.1
http://github.org/resync/resync

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27
ResourceSync Source Simulator
• Python code using Tornado server
• Provides random set of resources of different sizes updated at a
particular rate
• Very useful for testing Destination code

http://github.com/resync/simulator

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27
Menu
1. A personal spin
2. ResourceSync
a. ResourceSync: Problem Perspective &
Conceptual Approach
b. Motivation & Use Cases
c. Framework Walkthrough
d. Framework Technical Details
e. Implementation

3. ResourceSync and the Semantic Web
Linked data
Fundamentally distributed but local copy often
required. Either:
MusicBrainz

1. cache

Last.FM

2. sync local copy...

BBC

DBpedia

GeoNames
others...

• Many ad-hoc solutions
for local copy
64
How do you get your semantic data?
• @edsu -> @LibSkrat: “not at http://id.loc.gov no; someone
could download the triples and create their own though”
• Philipp Zumstein – “copy the RDF/XML files”
• Valeria Pesce – “harvest XML and CSV, then map to extended
VIVO ontology”
• ...
• Poll on semantic data storage: triple store? files? RDB? other?
– results: ~6 triple, ~6 files, ~1 RDB, ~1 MongoDB
Semantic data synchronization
• Is your data nice linked data? URIs that resolve to
other documents, etc.
– everything is a web resource so good match for
ResourceSync
– maybe the web already provides adequate access?

• Are you able to tell what has changed?
– in most triple stores there is no timestamp so
providing subsets of changed data might be hard

• Look at four scenarios...
Linked data
Sematic data on the web – great match for ResourceSync
Consider a linked data system has some convenient way to
generate and/or keep track of fixity information (datestamps,
hashes, etc.) for all of its resource representations, then this
may be an effective way to synchronize with ResourceSync.
• Usual ResourceSync mechanisms including Resource Lists,
Change Lists, Dumps, Notifications and Archives all
applicable.
• Complications with triple store
– How to generate fixity information?
– Cost of generating set of self-contained representations (e.g.
concise bounded description) may be high
Service level notification
Image a set of RDF data updated periodically,
goal is just to let consumers know that changes
have been made
• Perhaps many sets or subsets provided
• Need to give service a URI which is then listed
in Resource List etc.
Dumps as resources to sync
• Might have dump that is in any format, it is just a resource
on the web
• Almost trivial but fits very cleanly in framework
• Would work well with framework notification
• Use link relation might also be used to indicate sequence if
there are a set of dumps from different times
Note: Somewhat different from a ResourceSync Resource
Dump (or Change Dump) which is something where the data
is represented as resources to be synchronized
Diffs or patches for RDF data
Open question: it might be possible to use RDF patching
mechanisms (perhaps JSON-PATCH with JSON-LD) to
provide efficient updates of RDF datasets
• Trivial for a dataset with no blank nodes,
• Diffs progressively more difficult and less efficient if
there are many blank nodes
• Particularly useful/efficient for large datasets with
small changes
ResourceSync provides mechanism to link any patch
format and file, and relate to the resource patched
71
Pointers
• Specification
http://www.openarchives.org/rs/
http://www.openarchives.org/rs/resourcesync
http://www.openarchives.org/rs/archives

• List for public comment
https://groups.google.com/d/forum/resourcesync
• Client and simulator code
http://github.org/resync/resync
http://github.org/resync/simulator

ResourceSync
SWIB13, Hamburg, Germany, 2013-11-27

72

Más contenido relacionado

La actualidad más candente

Data infrastructure at Facebook
Data infrastructure at Facebook Data infrastructure at Facebook
Data infrastructure at Facebook AhmedDoukh
 
Cloud Services for Big Data Analytics
Cloud Services for Big Data AnalyticsCloud Services for Big Data Analytics
Cloud Services for Big Data AnalyticsGeoffrey Fox
 
Matching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesMatching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesGeoffrey Fox
 
High Performance Processing of Streaming Data
High Performance Processing of Streaming DataHigh Performance Processing of Streaming Data
High Performance Processing of Streaming DataGeoffrey Fox
 
GlobusWorld 2021: Managing Genomics Data at the DOE Joint Genomics Institute
GlobusWorld 2021: Managing Genomics Data at the DOE Joint Genomics InstituteGlobusWorld 2021: Managing Genomics Data at the DOE Joint Genomics Institute
GlobusWorld 2021: Managing Genomics Data at the DOE Joint Genomics InstituteGlobus
 
ResourceSync: Conceptual and Technical Problem Perspective
ResourceSync: Conceptual and Technical Problem PerspectiveResourceSync: Conceptual and Technical Problem Perspective
ResourceSync: Conceptual and Technical Problem PerspectiveHerbert Van de Sompel
 
Introduction to Big Data and Science Clouds (Chapter 1, SC 11 Tutorial)
Introduction to Big Data and Science Clouds (Chapter 1, SC 11 Tutorial)Introduction to Big Data and Science Clouds (Chapter 1, SC 11 Tutorial)
Introduction to Big Data and Science Clouds (Chapter 1, SC 11 Tutorial)Robert Grossman
 
Hadoop MapReduce Framework
Hadoop MapReduce FrameworkHadoop MapReduce Framework
Hadoop MapReduce FrameworkEdureka!
 
Archiving data from Durham to RAL using the File Transfer Service (FTS)
Archiving data from Durham to RAL using the File Transfer Service (FTS)Archiving data from Durham to RAL using the File Transfer Service (FTS)
Archiving data from Durham to RAL using the File Transfer Service (FTS)Jisc
 
A Data Ecosystem to Support Machine Learning in Materials Science
A Data Ecosystem to Support Machine Learning in Materials ScienceA Data Ecosystem to Support Machine Learning in Materials Science
A Data Ecosystem to Support Machine Learning in Materials ScienceGlobus
 
BioCASE web services for germplasm data sets, at FAO, Rome (2006)
BioCASE web services for germplasm data sets, at FAO, Rome (2006)BioCASE web services for germplasm data sets, at FAO, Rome (2006)
BioCASE web services for germplasm data sets, at FAO, Rome (2006)Dag Endresen
 
Improving access to geospatial Big Data in the hydrology domain
Improving access to geospatial Big Data in the hydrology domainImproving access to geospatial Big Data in the hydrology domain
Improving access to geospatial Big Data in the hydrology domainClaudia Vitolo
 
Introduction to Apache Hadoop Ecosystem
Introduction to Apache Hadoop EcosystemIntroduction to Apache Hadoop Ecosystem
Introduction to Apache Hadoop EcosystemMahabubur Rahaman
 
Enabling efficient movement of data into & out of a high-performance analysis...
Enabling efficient movement of data into & out of a high-performance analysis...Enabling efficient movement of data into & out of a high-performance analysis...
Enabling efficient movement of data into & out of a high-performance analysis...Jisc
 
Mansi chowkkar programming_in_data_analytics
Mansi chowkkar programming_in_data_analyticsMansi chowkkar programming_in_data_analytics
Mansi chowkkar programming_in_data_analyticsMansiChowkkar
 

La actualidad más candente (19)

Big Data Processing
Big Data ProcessingBig Data Processing
Big Data Processing
 
Data infrastructure at Facebook
Data infrastructure at Facebook Data infrastructure at Facebook
Data infrastructure at Facebook
 
Cloud Services for Big Data Analytics
Cloud Services for Big Data AnalyticsCloud Services for Big Data Analytics
Cloud Services for Big Data Analytics
 
Matching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software ArchitecturesMatching Data Intensive Applications and Hardware/Software Architectures
Matching Data Intensive Applications and Hardware/Software Architectures
 
High Performance Processing of Streaming Data
High Performance Processing of Streaming DataHigh Performance Processing of Streaming Data
High Performance Processing of Streaming Data
 
ResourceSync
ResourceSyncResourceSync
ResourceSync
 
GlobusWorld 2021: Managing Genomics Data at the DOE Joint Genomics Institute
GlobusWorld 2021: Managing Genomics Data at the DOE Joint Genomics InstituteGlobusWorld 2021: Managing Genomics Data at the DOE Joint Genomics Institute
GlobusWorld 2021: Managing Genomics Data at the DOE Joint Genomics Institute
 
ResourceSync: Conceptual and Technical Problem Perspective
ResourceSync: Conceptual and Technical Problem PerspectiveResourceSync: Conceptual and Technical Problem Perspective
ResourceSync: Conceptual and Technical Problem Perspective
 
hadoop
hadoophadoop
hadoop
 
Introduction to Big Data and Science Clouds (Chapter 1, SC 11 Tutorial)
Introduction to Big Data and Science Clouds (Chapter 1, SC 11 Tutorial)Introduction to Big Data and Science Clouds (Chapter 1, SC 11 Tutorial)
Introduction to Big Data and Science Clouds (Chapter 1, SC 11 Tutorial)
 
Hadoop MapReduce Framework
Hadoop MapReduce FrameworkHadoop MapReduce Framework
Hadoop MapReduce Framework
 
poster draft 5
poster draft 5poster draft 5
poster draft 5
 
Archiving data from Durham to RAL using the File Transfer Service (FTS)
Archiving data from Durham to RAL using the File Transfer Service (FTS)Archiving data from Durham to RAL using the File Transfer Service (FTS)
Archiving data from Durham to RAL using the File Transfer Service (FTS)
 
A Data Ecosystem to Support Machine Learning in Materials Science
A Data Ecosystem to Support Machine Learning in Materials ScienceA Data Ecosystem to Support Machine Learning in Materials Science
A Data Ecosystem to Support Machine Learning in Materials Science
 
BioCASE web services for germplasm data sets, at FAO, Rome (2006)
BioCASE web services for germplasm data sets, at FAO, Rome (2006)BioCASE web services for germplasm data sets, at FAO, Rome (2006)
BioCASE web services for germplasm data sets, at FAO, Rome (2006)
 
Improving access to geospatial Big Data in the hydrology domain
Improving access to geospatial Big Data in the hydrology domainImproving access to geospatial Big Data in the hydrology domain
Improving access to geospatial Big Data in the hydrology domain
 
Introduction to Apache Hadoop Ecosystem
Introduction to Apache Hadoop EcosystemIntroduction to Apache Hadoop Ecosystem
Introduction to Apache Hadoop Ecosystem
 
Enabling efficient movement of data into & out of a high-performance analysis...
Enabling efficient movement of data into & out of a high-performance analysis...Enabling efficient movement of data into & out of a high-performance analysis...
Enabling efficient movement of data into & out of a high-performance analysis...
 
Mansi chowkkar programming_in_data_analytics
Mansi chowkkar programming_in_data_analyticsMansi chowkkar programming_in_data_analytics
Mansi chowkkar programming_in_data_analytics
 

Similar a ResourceSync Semantic Web Data

ResourceSync Tutorial from Open Repositories 2013
ResourceSync Tutorial from Open Repositories 2013ResourceSync Tutorial from Open Repositories 2013
ResourceSync Tutorial from Open Repositories 2013Simeon Warner
 
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...Martin Klein
 
Resource sync overview and real-world use cases for discovery, harvesting, an...
Resource sync overview and real-world use cases for discovery, harvesting, an...Resource sync overview and real-world use cases for discovery, harvesting, an...
Resource sync overview and real-world use cases for discovery, harvesting, an...openminted_eu
 
RDF Stream Processing Tutorial: RSP implementations
RDF Stream Processing Tutorial: RSP implementationsRDF Stream Processing Tutorial: RSP implementations
RDF Stream Processing Tutorial: RSP implementationsJean-Paul Calbimonte
 
Social Computing Research with Apache Spark
Social Computing Research with Apache SparkSocial Computing Research with Apache Spark
Social Computing Research with Apache SparkMatthew Rowe
 
Repository Deposit Service Description
Repository Deposit Service DescriptionRepository Deposit Service Description
Repository Deposit Service DescriptionJulie Allinson
 
Nyc web perf-final-july-23
Nyc web perf-final-july-23Nyc web perf-final-july-23
Nyc web perf-final-july-23Dan Boutin
 
Wed garcia hands_on_d_bpedia preservation
Wed garcia hands_on_d_bpedia preservationWed garcia hands_on_d_bpedia preservation
Wed garcia hands_on_d_bpedia preservationeswcsummerschool
 
Open Archives Initiative Object Reuse and Exchange
Open Archives Initiative Object Reuse and ExchangeOpen Archives Initiative Object Reuse and Exchange
Open Archives Initiative Object Reuse and Exchangelagoze
 
Open Science Days 2014 - Becker - Repositories and Linked Data
Open Science Days 2014 - Becker - Repositories and Linked DataOpen Science Days 2014 - Becker - Repositories and Linked Data
Open Science Days 2014 - Becker - Repositories and Linked DataPascal-Nicolas Becker
 
Adoption of the Linked Data Best Practices in Different Topical Domains
Adoption of the Linked Data Best Practices in Different Topical DomainsAdoption of the Linked Data Best Practices in Different Topical Domains
Adoption of the Linked Data Best Practices in Different Topical DomainsChris Bizer
 
Improving library services with semantic web technology in the realm of repos...
Improving library services with semantic web technology in the realm of repos...Improving library services with semantic web technology in the realm of repos...
Improving library services with semantic web technology in the realm of repos...redsys
 
Rpi talk foster september 2011
Rpi talk foster september 2011Rpi talk foster september 2011
Rpi talk foster september 2011Ian Foster
 
Duraspace Hot Topics Series 6: Metadata and Repository Services
Duraspace Hot Topics Series 6: Metadata and Repository ServicesDuraspace Hot Topics Series 6: Metadata and Repository Services
Duraspace Hot Topics Series 6: Metadata and Repository ServicesMatthew Critchlow
 
COLLECTION METHODS
COLLECTION METHODSCOLLECTION METHODS
COLLECTION METHODSEssam Obaid
 

Similar a ResourceSync Semantic Web Data (20)

ResourceSync Tutorial from Open Repositories 2013
ResourceSync Tutorial from Open Repositories 2013ResourceSync Tutorial from Open Repositories 2013
ResourceSync Tutorial from Open Repositories 2013
 
ResourceSync tutorial OAI8
ResourceSync tutorial OAI8ResourceSync tutorial OAI8
ResourceSync tutorial OAI8
 
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
ResourceSync - Overview and Real-World Use Cases for Discovery, Harvesting, a...
 
Resource sync overview and real-world use cases for discovery, harvesting, an...
Resource sync overview and real-world use cases for discovery, harvesting, an...Resource sync overview and real-world use cases for discovery, harvesting, an...
Resource sync overview and real-world use cases for discovery, harvesting, an...
 
03 data mining : data warehouse
03 data mining : data warehouse03 data mining : data warehouse
03 data mining : data warehouse
 
NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Sync...
NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Sync...NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Sync...
NISO Forum, Denver, September 24, 2012: ResourceSync: Web-Based Resource Sync...
 
RDF Stream Processing Tutorial: RSP implementations
RDF Stream Processing Tutorial: RSP implementationsRDF Stream Processing Tutorial: RSP implementations
RDF Stream Processing Tutorial: RSP implementations
 
Social Computing Research with Apache Spark
Social Computing Research with Apache SparkSocial Computing Research with Apache Spark
Social Computing Research with Apache Spark
 
Repository Deposit Service Description
Repository Deposit Service DescriptionRepository Deposit Service Description
Repository Deposit Service Description
 
Nyc web perf-final-july-23
Nyc web perf-final-july-23Nyc web perf-final-july-23
Nyc web perf-final-july-23
 
Wed garcia hands_on_d_bpedia preservation
Wed garcia hands_on_d_bpedia preservationWed garcia hands_on_d_bpedia preservation
Wed garcia hands_on_d_bpedia preservation
 
Sword Bl 0903[1]
Sword Bl 0903[1]Sword Bl 0903[1]
Sword Bl 0903[1]
 
Open Archives Initiative Object Reuse and Exchange
Open Archives Initiative Object Reuse and ExchangeOpen Archives Initiative Object Reuse and Exchange
Open Archives Initiative Object Reuse and Exchange
 
Open Science Days 2014 - Becker - Repositories and Linked Data
Open Science Days 2014 - Becker - Repositories and Linked DataOpen Science Days 2014 - Becker - Repositories and Linked Data
Open Science Days 2014 - Becker - Repositories and Linked Data
 
Adoption of the Linked Data Best Practices in Different Topical Domains
Adoption of the Linked Data Best Practices in Different Topical DomainsAdoption of the Linked Data Best Practices in Different Topical Domains
Adoption of the Linked Data Best Practices in Different Topical Domains
 
Linked Data and Semantic Web Application Development by Peter Haase
Linked Data and Semantic Web Application Development by Peter HaaseLinked Data and Semantic Web Application Development by Peter Haase
Linked Data and Semantic Web Application Development by Peter Haase
 
Improving library services with semantic web technology in the realm of repos...
Improving library services with semantic web technology in the realm of repos...Improving library services with semantic web technology in the realm of repos...
Improving library services with semantic web technology in the realm of repos...
 
Rpi talk foster september 2011
Rpi talk foster september 2011Rpi talk foster september 2011
Rpi talk foster september 2011
 
Duraspace Hot Topics Series 6: Metadata and Repository Services
Duraspace Hot Topics Series 6: Metadata and Repository ServicesDuraspace Hot Topics Series 6: Metadata and Repository Services
Duraspace Hot Topics Series 6: Metadata and Repository Services
 
COLLECTION METHODS
COLLECTION METHODSCOLLECTION METHODS
COLLECTION METHODS
 

Más de Simeon Warner

Questioning Authority Lookup Service: Linking the Data
Questioning Authority Lookup Service: Linking the DataQuestioning Authority Lookup Service: Linking the Data
Questioning Authority Lookup Service: Linking the DataSimeon Warner
 
OCFL: A Shared Approach to Preservation Persistence
OCFL: A Shared Approach to Preservation PersistenceOCFL: A Shared Approach to Preservation Persistence
OCFL: A Shared Approach to Preservation PersistenceSimeon Warner
 
The Oxford Common File Layout: A common approach to digital preservation
The Oxford Common File Layout: A common approach to digital preservationThe Oxford Common File Layout: A common approach to digital preservation
The Oxford Common File Layout: A common approach to digital preservationSimeon Warner
 
Welcome to the FOLIO Community
Welcome to the FOLIO CommunityWelcome to the FOLIO Community
Welcome to the FOLIO CommunitySimeon Warner
 
Sinopia & FOLIO: Bridging the gap to linked data cataloging
Sinopia & FOLIO: Bridging the gap to linked data cataloging Sinopia & FOLIO: Bridging the gap to linked data cataloging
Sinopia & FOLIO: Bridging the gap to linked data cataloging Simeon Warner
 
FOLIO and Linked Data
FOLIO and Linked DataFOLIO and Linked Data
FOLIO and Linked DataSimeon Warner
 
IIIF Technical Specification Status Update
IIIF Technical Specification Status UpdateIIIF Technical Specification Status Update
IIIF Technical Specification Status UpdateSimeon Warner
 
Don't bold the field name!
Don't bold the field name!Don't bold the field name!
Don't bold the field name!Simeon Warner
 
Samvera and IIIF 2018
Samvera and IIIF 2018Samvera and IIIF 2018
Samvera and IIIF 2018Simeon Warner
 
Oxford Common File Layout (OCFL)
Oxford Common File Layout (OCFL)Oxford Common File Layout (OCFL)
Oxford Common File Layout (OCFL)Simeon Warner
 
From Open Annotations to W3C Web Annotations (and the impact on IIIF Present...
From Open Annotations to W3C Web Annotations (and the impact on IIIF Present...From Open Annotations to W3C Web Annotations (and the impact on IIIF Present...
From Open Annotations to W3C Web Annotations (and the impact on IIIF Present...Simeon Warner
 
Introduction to the IIIF Presentation API (@SWIB17)
Introduction to the IIIF Presentation API (@SWIB17)Introduction to the IIIF Presentation API (@SWIB17)
Introduction to the IIIF Presentation API (@SWIB17)Simeon Warner
 
Introduction to the International Image Interoperability Framework (IIIF)
Introduction to the International Image Interoperability Framework (IIIF)Introduction to the International Image Interoperability Framework (IIIF)
Introduction to the International Image Interoperability Framework (IIIF)Simeon Warner
 
From Open Access to Open Standards, (Linked) Data and Collaborations
From Open Access to Open Standards, (Linked) Data and CollaborationsFrom Open Access to Open Standards, (Linked) Data and Collaborations
From Open Access to Open Standards, (Linked) Data and CollaborationsSimeon Warner
 
Mind the gap! Reflections on the state of repository data harvesting
Mind the gap! Reflections on the state of repository data harvestingMind the gap! Reflections on the state of repository data harvesting
Mind the gap! Reflections on the state of repository data harvestingSimeon Warner
 
ORCID & other Person iDs
ORCID & other Person iDsORCID & other Person iDs
ORCID & other Person iDsSimeon Warner
 
Who's the Author? Identifier soup - ORCID, ISNI, LC NACO and VIAF
Who's the Author? Identifier soup - ORCID, ISNI, LC NACO and VIAFWho's the Author? Identifier soup - ORCID, ISNI, LC NACO and VIAF
Who's the Author? Identifier soup - ORCID, ISNI, LC NACO and VIAFSimeon Warner
 

Más de Simeon Warner (20)

Questioning Authority Lookup Service: Linking the Data
Questioning Authority Lookup Service: Linking the DataQuestioning Authority Lookup Service: Linking the Data
Questioning Authority Lookup Service: Linking the Data
 
OCFL: A Shared Approach to Preservation Persistence
OCFL: A Shared Approach to Preservation PersistenceOCFL: A Shared Approach to Preservation Persistence
OCFL: A Shared Approach to Preservation Persistence
 
The Oxford Common File Layout: A common approach to digital preservation
The Oxford Common File Layout: A common approach to digital preservationThe Oxford Common File Layout: A common approach to digital preservation
The Oxford Common File Layout: A common approach to digital preservation
 
Welcome to the FOLIO Community
Welcome to the FOLIO CommunityWelcome to the FOLIO Community
Welcome to the FOLIO Community
 
Sinopia & FOLIO: Bridging the gap to linked data cataloging
Sinopia & FOLIO: Bridging the gap to linked data cataloging Sinopia & FOLIO: Bridging the gap to linked data cataloging
Sinopia & FOLIO: Bridging the gap to linked data cataloging
 
FOLIO and Linked Data
FOLIO and Linked DataFOLIO and Linked Data
FOLIO and Linked Data
 
OCFL v1.0
OCFL v1.0OCFL v1.0
OCFL v1.0
 
IIIF Technical Specification Status Update
IIIF Technical Specification Status UpdateIIIF Technical Specification Status Update
IIIF Technical Specification Status Update
 
LKG Editor Dev
LKG Editor DevLKG Editor Dev
LKG Editor Dev
 
Don't bold the field name!
Don't bold the field name!Don't bold the field name!
Don't bold the field name!
 
Samvera and IIIF 2018
Samvera and IIIF 2018Samvera and IIIF 2018
Samvera and IIIF 2018
 
Oxford Common File Layout (OCFL)
Oxford Common File Layout (OCFL)Oxford Common File Layout (OCFL)
Oxford Common File Layout (OCFL)
 
ORCID @ Cornell
ORCID @ CornellORCID @ Cornell
ORCID @ Cornell
 
From Open Annotations to W3C Web Annotations (and the impact on IIIF Present...
From Open Annotations to W3C Web Annotations (and the impact on IIIF Present...From Open Annotations to W3C Web Annotations (and the impact on IIIF Present...
From Open Annotations to W3C Web Annotations (and the impact on IIIF Present...
 
Introduction to the IIIF Presentation API (@SWIB17)
Introduction to the IIIF Presentation API (@SWIB17)Introduction to the IIIF Presentation API (@SWIB17)
Introduction to the IIIF Presentation API (@SWIB17)
 
Introduction to the International Image Interoperability Framework (IIIF)
Introduction to the International Image Interoperability Framework (IIIF)Introduction to the International Image Interoperability Framework (IIIF)
Introduction to the International Image Interoperability Framework (IIIF)
 
From Open Access to Open Standards, (Linked) Data and Collaborations
From Open Access to Open Standards, (Linked) Data and CollaborationsFrom Open Access to Open Standards, (Linked) Data and Collaborations
From Open Access to Open Standards, (Linked) Data and Collaborations
 
Mind the gap! Reflections on the state of repository data harvesting
Mind the gap! Reflections on the state of repository data harvestingMind the gap! Reflections on the state of repository data harvesting
Mind the gap! Reflections on the state of repository data harvesting
 
ORCID & other Person iDs
ORCID & other Person iDsORCID & other Person iDs
ORCID & other Person iDs
 
Who's the Author? Identifier soup - ORCID, ISNI, LC NACO and VIAF
Who's the Author? Identifier soup - ORCID, ISNI, LC NACO and VIAFWho's the Author? Identifier soup - ORCID, ISNI, LC NACO and VIAF
Who's the Author? Identifier soup - ORCID, ISNI, LC NACO and VIAF
 

Último

MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxiammrhaywood
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxlancelewisportillo
 
Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfPatidar M
 
Food processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture honsFood processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture honsManeerUddin
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatYousafMalik24
 
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptxMusic 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptxleah joy valeriano
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfTechSoup
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptxmary850239
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Celine George
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONHumphrey A Beña
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSJoshuaGantuangco2
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptxmary850239
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 

Último (20)

MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptxECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
ECONOMIC CONTEXT - PAPER 1 Q3: NEWSPAPERS.pptx
 
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptxYOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
YOUVE_GOT_EMAIL_PRELIMS_EL_DORADO_2024.pptx
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
 
Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdf
 
Food processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture honsFood processing presentation for bsc agriculture hons
Food processing presentation for bsc agriculture hons
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for Beginners
 
Earth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice greatEarth Day Presentation wow hello nice great
Earth Day Presentation wow hello nice great
 
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptxMusic 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
Music 9 - 4th quarter - Vocal Music of the Romantic Period.pptx
 
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdfInclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
Inclusivity Essentials_ Creating Accessible Websites for Nonprofits .pdf
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx
 
Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17Field Attribute Index Feature in Odoo 17
Field Attribute Index Feature in Odoo 17
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
Raw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptxRaw materials used in Herbal Cosmetics.pptx
Raw materials used in Herbal Cosmetics.pptx
 
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATIONTHEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
THEORIES OF ORGANIZATION-PUBLIC ADMINISTRATION
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTSGRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
GRADE 4 - SUMMATIVE TEST QUARTER 4 ALL SUBJECTS
 
4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx4.16.24 21st Century Movements for Black Lives.pptx
4.16.24 21st Century Movements for Black Lives.pptx
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
 
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptxYOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
YOUVE GOT EMAIL_FINALS_EL_DORADO_2024.pptx
 

ResourceSync Semantic Web Data

  • 1. ResourceSync for Semantic Web Data Copying and Synchronization Simeon Warner (Cornell University) http://orcid.org/0000-0002-7970-7855 SWIB13, Hamburg, Germany 2013-11-27
  • 2. Menu 1. A personal spin 2. ResourceSync a. ResourceSync: Problem Perspective & Conceptual Approach b. Motivation & Use Cases c. Framework Walkthrough d. Framework Technical Details e. Implementation 3. ResourceSync and the Semantic Web
  • 4.
  • 5.
  • 6.
  • 7. Linked world – but no data 1. Names for articles, people 2. HTTP to get data 3. (no machine data) 4. Have links to other things
  • 8.
  • 9. Code for RDF/XML and Turtle support contributed to ORCID by Stian Soiland-Reyes
  • 11.
  • 12.
  • 14. ResourceSync: A Web-Based Resource Synchronization Framework These following slides are excerpted from the ResourceSync tutorial. The most recent version of the full tutorial slides is available at http://www.slideshare.net/OpenArchivesInitiative/resourcesync-tutorial #resourcesync ResourceSync is funded by The Sloan Foundation & JISC ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 14
  • 15. OAI Herbert Van de Sompel Martin Klein Robert Sanderson (Los Alamos National Laboratory) Simeon Warner (Cornell University) NISO Todd Carpenter Nettie Lagace University of Oxford Graham Klyne Bernhard Haslhofer (University of Vienna) Michael L. Nelson (Old Dominion University) Lyrasis Peter Murray Carl Lagoze (University of Michigan) ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 15
  • 16. ResourceSync Technical Group LOCKSS Ex Libris Inc. Shlomo Sanders David Rosenthal JISC Paul Walk Richard Jones Stuart Lewis RedHat OCLC Christian Sadilek Library of Congress Jeff Young Kevin Ford ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 16
  • 17. Timeline, Status of Specification(s) • August 2013 o o Release of ResourceSync framework Core specification - Version 0.9.1 Public draft of ResourceSync Archives specification released • September 2013 o Core specification on its way to become an ANSI standard • November 2013 o Internal draft of ResourceSync Notification specification • January 2014 o Public draft of ResourceSync Notification specification • Mid 2014 o Core specification becomes ANSI/NISO standard ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 17
  • 18. Menu 1. A personal spin 2. ResourceSync a. ResourceSync: Problem Perspective & Conceptual Approach b. Motivation & Use Cases c. Framework Walkthrough d. Framework Technical Details e. Implementation 3. ResourceSync and the Semantic Web
  • 19. Synchronize What? • Web resources o things with a URI that can be dereferenced • Focus on needs of research communication and cultural heritage organizations but aim for generality • Small websites/repositories (a few resources) to large repositories/datasets/linked data collections (many millions of resources) • Low change frequency (weeks/months) to high change frequency (seconds) • Synchronization latency and accuracy needs may vary ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 19
  • 20. ResourceSync Problem • Consider: • Source (server) A has resources that change over time: they get created, modified, deleted • Destination (servers) X, Y, and Z leverage (some) resources of Source A. • Problem: • Destinations want to keep in step with the resource changes at Source A • Goal: • Design an approach for resource synchronization aligned with the Web Architecture that has a fair chance of adoption by different communities. • The approach must scale better than recurrent HTTP HEAD/GET on resources. ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 20
  • 21. Destination: Synchronization Needs 1. Baseline synchronization – A destination must be able to perform an initial load or catch-up with a source - avoid out-of-band setup 2. Incremental synchronization – A destination must have some way to keep up-to-date with changes at a source - subject to some latency; minimal: create/update/delete - allow to catch-up after destination has been offline 3. Audit – A destination should be able to determine whether it is synchronized with a source - regarding coverage and accuracy ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 21
  • 22. Didn’t you sell us OAI-PMH? Or... will ResourceSync replace OAI-PMH?  Proven XML metadata transfer protocol  Libraries in a number of programming languages  Widely adopted in our community X X X Predates REST, not “of the web” Not adopted for content transfer Technical issues with sets • Devise a shared solution for data, metadata, linked data? ResourceSync may replace, will likely coexistence ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 22
  • 23. Menu 1. A personal spin 2. ResourceSync a. ResourceSync: Problem Perspective & Conceptual Approach b. Motivation & Use Cases c. Framework Walkthrough d. Framework Technical Details e. Implementation 3. ResourceSync and the Semantic Web
  • 24. Use Cases – The Basics a) b) ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 24
  • 25. Use Cases – The Basics c) d) ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 25
  • 26. Use Cases – The not-so-Basics e) f) ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 26
  • 27. Use Case 1: arXiv Mirroring and Data Sharing • Repository of scholarly articles in physics, mathematics, computer science, etc. • > 880k articles, ~1.5 revisions per article • ~75k new articles per year • metadata, source, PDF • ~3.8M resources • ~2700 updates/day • Support o o Mirroring Sharing ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 27
  • 28. Use Case 2: DBpedia Live Duplication • Average of 2 updates per second • Low latency desirable => need for a push technology ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 28
  • 29. Use Case 2: DBpedia Live Duplication • Daily traffic: o 99% updates o 0.6% deletions o 0.03% creations • LANL experiments with push-based sync ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 29
  • 30. Menu 1. A personal spin 2. ResourceSync a. ResourceSync: Problem Perspective & Conceptual Approach b. Motivation & Use Cases c. Framework Walkthrough d. Framework Technical Details e. Implementation 3. ResourceSync and the Semantic Web
  • 31. Source: Core Synchronization Capabilities P U L L 1. Describing content – publish a list of resources available for synchronization to enable Destinations to perform an initial load or catch-up with a Source 2. Packaging content – bundle resources to enable bulk download by destinations 3. Describing changes – publish a list of resource changes to enable destinations to stay synchronized and decrease latency 4. Packaging changes – bundle resource changes for bulk download by destinations ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 31
  • 32. Source Capability 1: Describing Content In order to advertise the resources that a source wants destinations to know about, it may describe them: o o Publish a Resource List, a list of resource URIs and possibly associated metadata - Destination GETs the Resource List - Destination GETs listed resources by their URI A Resource List describes the state of a set of resources at one point in time (snapshot) ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 32
  • 33. 33
  • 34. 34
  • 35. Source Capability 2: Packaging Content By default, content is transferred in response to a GET issued by a destination against a URI of a source’s resource. But a source may support additional mechanisms: o o Publish a Resource Dump, a document that points to packages of resource representations and necessary metadata - Destination GETs the package - Destination unpacks the package - ZIP format supported A Resource Dump and the packages it points to reflect the state of a set of resources at one point in time (snapshot) ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 35
  • 36. 36
  • 38. Source Capability 3: Describing Changes In order to achieve lower latency and/or greater efficiency, a source may communicate about changes to its resources: o o Publish a Change List, a list of recent change events (created, updated, deleted resource) - Destination acts upon change events, e.g. GETs created/updated resources, removes deleted resources. A Change List pertains to resources that changed in a temporal interval with a start- and an end-date - If a resource changed more than once, it will be listed more than once ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 38
  • 39. 39
  • 40. Destination: Key Processes ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 40
  • 41. Menu 1. A personal spin 2. ResourceSync a. ResourceSync: Problem Perspective & Conceptual Approach b. Motivation & Use Cases c. Framework Walkthrough d. Framework Technical Details e. Implementation 3. ResourceSync and the Semantic Web
  • 42. Many technology options Push DSNotify OAI-PMH rsync Crawl Pull OAI-ORE RDFsync WebDAV Col. Syn. XMPP Atom SWORD Sitemap SPARQLpush SDShare AtomPub RSS PubSubHubbub XMPP ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 42
  • 43. Many technology options Push DSNotify OAI-PMH rsync Crawl Pull OAI-ORE RDFsync WebDAV Col. Syn. XMPP Atom SWORD Sitemap SPARQLpush SDShare AtomPub RSS PubSubHubbub XMPP ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 43
  • 44. • Modular framework allowing selective deployment • Sitemap is the core format throughout the framework o o o Introduce extension elements and attributes: - In ResourceSync namespace (rs:) to accommodate synchronization needs Reuse Sitemap format for all capability documents Utilize Sitemap index format where ResourceSync needed/allowed SWIB13, Hamburg, Germany, 2013-11-27 44
  • 46. ResourceSync Sitemap Extensions <urlset xmlns=http://www.sitemaps.org/schemas/sitemap/0.9 xmlns:rs="http://www.openarchives.org/rs/terms/”> <rs:ln …/> <rs:md …/> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:ln …/> <rs:md …/> </url> <url> … </url> Same extensions in </urlset> <sitemapindex> ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 46
  • 47. Resource Metadata Summary Element/Attribute <loc> <lastmod> Description Resource URI (identity) Timestamp of last change Defined by sitemaps sitemaps <changefreq> Expected update frequency sitemaps <rs:md> change encoding hash length path type ResourceSync Change type (Change List & Change Dump Manifest only) ResourceSync HTTP Content-Encoding header value RFC2616 One or more content digests (md5, sha-1, Atom Link Ext. sha-256) HTTP Content-Length header value RFC4287 Path in ZIP package (Dump Manifests only) HTTP Content-Type header value ResourceSync RFC4287 ResourceSync SWIB13, Hamburg, Germany, 2013-11-27
  • 48. Link Relation Summary Relation Use in ResourceSync Defined in rel="alternate" Link from generic to specific URI HTML 5 rel="canonical" Link from specific to generic URI RFC6596 rel="collection" Resource is member of collection RFC6573 rel="contents" Link from dump to manifest rel="describedby" Has metadata HTML4 Protocol for Web Description Resources (POWDER): Description Resources rel="describes" Is metadata for The 'describes' Link Relation Type rel="duplicate" RFC6249 rel=".../rs/terms/patch" Mirror or alternative copy A patch -- efficient change information rel="memento" Link to time-specific URI Memento Internet Draft rel="timegate" Link to timegate Memento Internet Draft rel="via" Provenance chain, came from RFC4287 This specification ResourceSync SWIB13, Hamburg, Germany, 2013-11-27
  • 49. ResourceSync Sitemap Validation • All ResourceSync capability documents are valid according to the Sitemap XML Schema o http://www.sitemaps.org/schemas/sitemap/0.9 • For a more thorough validation use the ResourceSync XML Schema o http://www.openarchives.org/rs/0.9.1/resourcesync.xsd ResourceSync SWIB13, Hamburg, Germany, 2013-11-27
  • 50. Example document: Resource List <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability="resourcelist" at="2013-01-03T09:00:00Z” completed="2013-01-03T09:01:00Z” /> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md hash="md5:1584abdf8ebdc9802ac0c6a7402c03b6" length="8876" type="text/html"/> </url> • Describe Source’s resources subject to synchronization <url> • At one point in time (snapshot) … • Creation can take some time – duration can be </url> conveyed </urlset> • HTTP GET resources ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 50
  • 51. Framework Structure (without possible index documents) http://www.openarchives.org/rs/resourcesync#Structure ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 51
  • 52. Supported Linking Use Cases Provide links to related resources to address specific needs: 1. Mirrored content with multiple download locations 2. Alternate representations of the same content • Resources subject to HTTP content negotiation • Format migration for preservation reasons 3. Patching content rather than replacing it 4. Resources and metadata about resources 5. Prior versions of resources 6. Collection membership of resources 7. Republishing synchronized resources All cases use <rs:ln> element referring to the linked resource ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 52
  • 53. Linking – Alternate Representations – Case 1 <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”changelist" from="2013-01-02T09:00:00Z” until="2013-01-03T09:00:00Z”/> <url> <loc>http://example.com/res1</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change=”updated”/> <rs:ln rel="alternate" type="text/html" href="http://example.com/res1.html"/> <rs:ln rel="alternate" type=“application/pdf" href=”http://example.com/res1.pdf"/> </url> </urlset> Canonical URI links to specific URIs ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 53
  • 54. Linking – Alternate Representations – Case 2 <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:rs="http://www.openarchives.org/rs/terms/"> <rs:md capability=”changelist" from="2013-01-02T09:00:00Z” until="2013-01-03T09:00:00Z”/> <url> <loc>http://example.com/res1.html</loc> <lastmod>2013-01-02T13:00:00Z</lastmod> <rs:md change=”updated”/> <rs:ln rel=”canonical” href="http://example.com/res1"/> </url> </urlset> Specific URI links to canonical URI ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 54
  • 55. Source: Notification Capabilities Motivations: reduce synchronization latency, avoid polling • P U • S H 1. Change Notification • Notifies destination about changes to particular resources • e.g., resource A has been updated | created | deleted 2. Framework Notification • Notifies destination about changes to capabilities i.e., their documents • e.g., a Change List has been updated | created | deleted • Also for Capability Lists and Source Description Investigating Pubsubhubbub as transport first, may look at WebSockets later ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 55
  • 57. A R C H I V E S Source: Archival Capabilities The Source may hold on to historical data, for example, to allow Destinations to catch up with events they missed or revisit prior resource states. To this end, the Source can publish archives, i.e. documents that enumerate historical capability documents 1. 2. 3. 4. Resource List Archive Resource Dump Archive Change List Archive Change Dump Archive Re-use same document formats to list archived sets of corresponding documents, discovery entries tie together ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 57
  • 58. Menu 1. A personal spin 2. ResourceSync a. ResourceSync: Problem Perspective & Conceptual Approach b. Motivation & Use Cases c. Framework Walkthrough d. Framework Technical Details e. Implementation 3. ResourceSync and the Semantic Web
  • 59. DSpace support for metadata harvesting use case DSpace Module: https://github.com/CottageLabs/DSpaceResourceSync PHP client: https://github.com/stuartlewis/resync-php http://mydspace.edu/dspace-rs/resource/123456789/7/qdc ResourceSync webapp Item handle Metadata Format ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 59
  • 60. ResourceSync @ arXiv • Use ResourceSync for both mirroring and public data access o efficient updates o ability to do periodic audits o public synchronization capability o reduce admin burden • Start with metadata + source for mirroring use case (doing experiments now) • Open Access use cases require processed PDF also ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 60
  • 61. Python Library and Client • Aim to provide library code implementing all ResourceSync facilities for use in both source and destination implementations o o Designed for python 2.6 (RHEL6) and 2.7 Will not work with python <= 2.5 • Client (resync) supports many destination operations, inspired by the common Unix rsync program • Client also supports some operations that might be useful in a source, such as generation of static Resource Lists, or periodic Change Lists (used in arXiv experiments) • Explorer (resync-explorer) intended to allow easy inspection of a source’s resource sets and capabilities • Developed since ResourceSync v0.5, updated for v0.9.1 http://github.org/resync/resync ResourceSync SWIB13, Hamburg, Germany, 2013-11-27
  • 62. ResourceSync Source Simulator • Python code using Tornado server • Provides random set of resources of different sizes updated at a particular rate • Very useful for testing Destination code http://github.com/resync/simulator ResourceSync SWIB13, Hamburg, Germany, 2013-11-27
  • 63. Menu 1. A personal spin 2. ResourceSync a. ResourceSync: Problem Perspective & Conceptual Approach b. Motivation & Use Cases c. Framework Walkthrough d. Framework Technical Details e. Implementation 3. ResourceSync and the Semantic Web
  • 64. Linked data Fundamentally distributed but local copy often required. Either: MusicBrainz 1. cache Last.FM 2. sync local copy... BBC DBpedia GeoNames others... • Many ad-hoc solutions for local copy 64
  • 65. How do you get your semantic data? • @edsu -> @LibSkrat: “not at http://id.loc.gov no; someone could download the triples and create their own though” • Philipp Zumstein – “copy the RDF/XML files” • Valeria Pesce – “harvest XML and CSV, then map to extended VIVO ontology” • ... • Poll on semantic data storage: triple store? files? RDB? other? – results: ~6 triple, ~6 files, ~1 RDB, ~1 MongoDB
  • 66. Semantic data synchronization • Is your data nice linked data? URIs that resolve to other documents, etc. – everything is a web resource so good match for ResourceSync – maybe the web already provides adequate access? • Are you able to tell what has changed? – in most triple stores there is no timestamp so providing subsets of changed data might be hard • Look at four scenarios...
  • 67. Linked data Sematic data on the web – great match for ResourceSync Consider a linked data system has some convenient way to generate and/or keep track of fixity information (datestamps, hashes, etc.) for all of its resource representations, then this may be an effective way to synchronize with ResourceSync. • Usual ResourceSync mechanisms including Resource Lists, Change Lists, Dumps, Notifications and Archives all applicable. • Complications with triple store – How to generate fixity information? – Cost of generating set of self-contained representations (e.g. concise bounded description) may be high
  • 68. Service level notification Image a set of RDF data updated periodically, goal is just to let consumers know that changes have been made • Perhaps many sets or subsets provided • Need to give service a URI which is then listed in Resource List etc.
  • 69. Dumps as resources to sync • Might have dump that is in any format, it is just a resource on the web • Almost trivial but fits very cleanly in framework • Would work well with framework notification • Use link relation might also be used to indicate sequence if there are a set of dumps from different times Note: Somewhat different from a ResourceSync Resource Dump (or Change Dump) which is something where the data is represented as resources to be synchronized
  • 70. Diffs or patches for RDF data Open question: it might be possible to use RDF patching mechanisms (perhaps JSON-PATCH with JSON-LD) to provide efficient updates of RDF datasets • Trivial for a dataset with no blank nodes, • Diffs progressively more difficult and less efficient if there are many blank nodes • Particularly useful/efficient for large datasets with small changes ResourceSync provides mechanism to link any patch format and file, and relate to the resource patched
  • 71. 71
  • 72. Pointers • Specification http://www.openarchives.org/rs/ http://www.openarchives.org/rs/resourcesync http://www.openarchives.org/rs/archives • List for public comment https://groups.google.com/d/forum/resourcesync • Client and simulator code http://github.org/resync/resync http://github.org/resync/simulator ResourceSync SWIB13, Hamburg, Germany, 2013-11-27 72

Notas del editor

  1. Somewhere around 1996,1997 I worked in an area of physics which seemed to have a near perfect information environment. All new articles in the Lattice Models sub-field of High Energy Physics appear on arXiv.org, then xxx.lanl.gov, about 5-10 per day.
  2. Would like to see such an experience extended to the wider information world, for all subjects, richer objects included.
  3. 2:33OAI-PMH was introduced over 12 years ago (before the first JCDL, before OR was even imagined)
  4. XML &lt;-&gt; OAI-PMHlarge data begs diff question
  5. &lt;0.1% change per day.
  6. Semantic web version of wikipedia; want mirror to provide reliable basis for local services
  7. Semantic web version of wikipedia; want mirror to provide reliable basis for local services
  8. Top line – just metadata about resources, destination uses GET to get them (duh)Bottom line – packaged content =&gt; fewer round trips
  9. Rsyncetc just reference; push vs pull -&gt; both; many other parts
  10. Rsyncetc just reference; push vs pull -&gt; both; many other parts
  11. Add: rel=“contents”rel=“archives”
  12. Thanks to Sitemap folks for fixing their schema to allow the claimed extensibility
  13. All the blue boxes are sitemaps
  14. Test site, has subsets of arXiv and even complete source plus metadata (at present not up to date with 0.9)
  15. Ironic perhaps that while linked data is fundamentally distributed, many applications require local copies. Ad-hoc approaches to bulk copy