Data Archiving and Networked Services

Riding the Wave and the Scholarly
Archive of the Future
Thinking in Progress by:
Andrew Treloar
DANS Visiting Fellow
ANDS Director of Technology
Herbert van de Sompel
DANS Visiting Fellow
LANL Scientist
#rtwsaf
DANS is an institute of KNAW and NWO
Structure of the presentation
• Where we are today
• Pointers to the future
• Characterising that future
– Fundamental concepts
– Observations about archiving
– Diagramming the infrastructure

January 20, 2014

CC-BY-SA, @atreloar and @hvdsomp
Let’s go on a journey
• Republic of Letters
• System of Journals
• Web of Objects

Functions of Research Communication
Roosendaal and Geurts (1997)
• Registration: Allows claims of precedence for a scholarly finding
• Certification: Establishes validity of claim
• Awareness: Allows actors in the system to remain aware of new claims
• Archiving: Preserves the scholarly record
System of Journals
• Registration
– submission of manuscript

• Certification
– peer-review (pre-publication)
– commentary (post-publication)

• Awareness
– discovery services

• Archiving
– libraries (print)
– publishers (electronic)
– special purpose organisations (e.g. Portico)
Pointers to the future
“the future is already here – it’s just not very evenly distributed”
William Gibson, NPR interview

Registration: BioRxiv

Registration: ideacite

Registration: GitHub

Registration: WikiPathways

Registration: NeuroLex

Registration: Nanopublications

Registration: Observations
• Decoupling registration from certification
• Timestamping, versioning
• Registration of various types of objects
• Machines as creators and contributors

Certification: PubMed Commons

Certification: PubPeer

Certification: Zooniverse

Certification: Slideshare

Certification: Project FeederWatch

Certification: Observations
• Peer-review decoupled from publication process
• Certification of various types of objects
• Machines validating
• Social endorsement

Awareness: NARCIS

Awareness: myExperiment

Awareness: eLabNotebook RSS

Awareness: Twitter

Awareness: CrossRef Prospect

Awareness: Observations
• Awareness for various types of objects
• Real time awareness
• Awareness support targeted at machines
• Awareness through social media

Archiving: CLOCKSS

Archiving: DANS Easy

Archiving: Australian Antarctic Data Centre

Archiving: perma.cc

Archiving: EU Trusted Digital Repositories

Archiving: Observations
• Archiving for various types of objects
• Distributed archives
• Archival consortia
• Audit for trustworthiness

Characterising the future
• Research process: Hidden → Visible
• Nature of object: Fixed → Varying
• Atomicity of object: Atomic → Compound
• Process of making public: Discrete → Continuous
• Speed of communication: Delayed → Instant
• Communicated object: Publication + data proxies → Publication + linked data + linked models
• Nature of process: Formal → Informal

Fundamental changes
• The research process (objects, social dimension) is becoming more exposed
• Articles, books are no longer the only relevant objects for research communication
• Objects are no longer static
• Machines are joining humans as (co-)creators and consumers of research objects
Web of Objects
• Registration
– Recording of a wide variety of objects, versions of objects

• Certification
– Content/Form
– Human/Machine

• Awareness
– Real-time
– Social
– Variety of objects

• Archiving
– Archiving a wide variety of objects
– Trusted archives
Archiving: Observation 1

System of Journals: Publication

System of Journals: Archiving

Web of Objects: Registration

Web of Objects: Archiving?

Need to do better than this

Not just citation relationships

Archiving: Observation 2

Web platforms for scholarship
• Common web platforms are increasingly used for scholarship
– Wikis, GitHub, Twitter, WordPress, etc.

• Many of these have desirable characteristics:
– Versioning
– Timestamping
– Social embedding

• Still, they record rather than archive
Recording not Archiving
“GitHub reserves the right at any time and from time to time to modify or discontinue, temporarily or permanently, the Service (or any part thereof) with or without notice.”

“GitHub does not warrant that (i) the service will meet your specific requirements, (ii) the service will be uninterrupted, timely, secure, or error-free, (iii) the results that may be obtained from the use of the service will be accurate or reliable, (iv) the quality of any products, services, information, or other material purchased or obtained by you through the service will meet your expectations, and (v) any errors in the Service will be corrected.”
Recording isn’t Archiving
Recording            Archiving
Short-term           Longer-term
No guarantees        Try to provide guarantees
Read/Write           Read
Scholarly Process    Scholarly Record

Infrastructure implications
• This infrastructure needs to include
– use of common platforms to support recording
– availability of specialist platforms to support archiving

• We need an archiving infrastructure that underpins research activity and that is
– trusted
– sustainable
– distributed
– interoperable
– standards-based

Implications
• Need organizational, technical, curational interfaces between recording and archiving platforms
• Need organizational, technical interfaces across archiving platforms

Editor’s notes

  1. Content: Multiple sources checking the validity/classification of data
  2. Content: Multiple sources checking the validity/classification of data