SlideShare una empresa de Scribd logo
1 de 18
CASPAR - Preservation Methodology
 Validation (1) testbed demos (2) Audit
and Certification and (3) Provenance and
                Authenticity

     Stephen Rankin (STFC), Matt Dunckley
      (STFC), Brian Mcilwrath (STFC), Esther
      Conway (STFC), David Giaretta (STFC),
                 CASPAR & DCC
Outline

• RepInfo Network Example (Again).
• Validation with RepInfo.
RepInfo Network Example
Validation with RepInfo

• Can a future user use the RepInfo and
  RepInfo network to reuse the data?
• Documents, software etc as RepInfo
  difficult to validate against the data.
  Quality of RepInfo.
What do we mean by reuse the data?
What do we mean by reuse the data?

• Simplify the problem!
• Go from the data bits to a simple Information
  Object
• This means reading the data with the Structure
  RepInfo and applying the Semantics RepInfo
• Feed Information Object into NEW software.
RepInfo for a Hot Cup of Tea
Creation of new Formal Representation
                   Information
•   Curating existing RepInfo is OK and necessary, but we can do
    more…
•   Typically documents that describe the structure and semantics do
    not allow easy validation.
•   Information in documents typically not machine readable.
•   Semantics is usually poorly linked to the data and data values.
•   The types of simple data objects (data types) are not explicitly
    defined.
•   Data will have to be read into future software if it is going to be
    reused.
Formal Descriptions of Structure and
               Semantics
• Digital data is composed of bits which
  somehow have to be mapped to primitive
  software data types (numbers, strings etc)
• Numbers etc need to be assigned simple
  semantics, Name, Description, Unites etc.
Formal Descriptions of Structure
• CNES EAST tools (http://east.cnes.f), OASIS, EAST C
  Library (reference implementation).
• Also DEBAT (BEST Tools) http://debat.c-s.fr/
• Data Request Broker (DRB) - http://www.gael.fr/drb/site/
• JNI Wrapper for EAST C Library in our SVN repository
  (jnieast).
• Interfaces for a more general data description language
  structure and semantics API in our SVN (DSSIL).
  http://developers.casparpreserves.eu:8080/
Formal Descriptions of Structure (Tool
            Examples)
Formal Descriptions of Structure (Examples)
Formal Descriptions of Semantics
DEDSL Abstract, PVL, and XML(DTD) syntax for defining some simple
   data semantics.
• Only a small number of required attributes for a given data structure,
   NAME, DEFINITION, UNITS (conditional), ENTITY_TYPE
   (conditional), ENUMERATION_VALUES (conditional), TEXT_SIZE
   (conditional).
• You can define your own attributes.
• You can reuse definitions from other dictionaries.
Link the data structures to the semantics via the EAST access path
   or an XPATH, i.e. define a new attribute – EAST_PATH (OASIS
   tool does this).
Data Object Virtualisation
• Just talked about simple semantics for data values.
• But data is typically a set of objects.
• Table is composed of columns and columns are
  composed of values.
• Semantics associated with the columns and with the
  table.
• You may also have collections of tables that are all
  related, and the relationships also have semantics.
• Current working is on a collections interface.
Table Data Example
Image (GRID) Data Example
CASPAR/DCC RepInfo Tools Training
                Material
•   Video tutorials of using the RepInfo GUI tool
•   Documentation tutorial for using the RepInfo GUI tool.
•   Brain Training Game for RepInfo
•   Video tutorial of using RepInfo creation tools (creating structure and
    semantics).
•   Packaging tutorial.
•   More under construction – Programming guides etc.
•   All very much under development but a preview for the material is
    available at:
    http://twiki.dcc.rl.ac.uk/bin/view/Main/TheStoryPlanAndPresentation
•   Or we can come to you or you can come to us for a tutorial on our
    tools. You just need to ask!

Más contenido relacionado

Destacado

Destacado (11)

Eng Kamran zfp
Eng Kamran zfpEng Kamran zfp
Eng Kamran zfp
 
General slides
General slidesGeneral slides
General slides
 
OPTIMAL ECONOMIC LOAD DISPATCH USING FUZZY LOGIC & GENETIC ALGORITHMS
OPTIMAL ECONOMIC LOAD DISPATCH USING FUZZY LOGIC & GENETIC ALGORITHMSOPTIMAL ECONOMIC LOAD DISPATCH USING FUZZY LOGIC & GENETIC ALGORITHMS
OPTIMAL ECONOMIC LOAD DISPATCH USING FUZZY LOGIC & GENETIC ALGORITHMS
 
Mrsheep
MrsheepMrsheep
Mrsheep
 
Brochure scrapper trap - control plus
Brochure   scrapper trap - control plusBrochure   scrapper trap - control plus
Brochure scrapper trap - control plus
 
CV
CVCV
CV
 
Presentation Roy's Maritime (2)
Presentation Roy's Maritime (2)Presentation Roy's Maritime (2)
Presentation Roy's Maritime (2)
 
Cockett Marine Oil Platts Final_190513
Cockett Marine Oil Platts Final_190513Cockett Marine Oil Platts Final_190513
Cockett Marine Oil Platts Final_190513
 
Delta Company Profile
Delta Company ProfileDelta Company Profile
Delta Company Profile
 
Spinning of worsted yarn
Spinning of worsted yarnSpinning of worsted yarn
Spinning of worsted yarn
 
Erp presentation
Erp presentationErp presentation
Erp presentation
 

Similar a Caspar Preservation Methodology Steve Renkin

Similar a Caspar Preservation Methodology Steve Renkin (20)

Representation Information Steve Rankin
Representation Information Steve RankinRepresentation Information Steve Rankin
Representation Information Steve Rankin
 
Best Practices for Building and Deploying Data Pipelines in Apache Spark
Best Practices for Building and Deploying Data Pipelines in Apache SparkBest Practices for Building and Deploying Data Pipelines in Apache Spark
Best Practices for Building and Deploying Data Pipelines in Apache Spark
 
Tutorial On Database Management System
Tutorial On Database Management SystemTutorial On Database Management System
Tutorial On Database Management System
 
PROGRAMMING USING C#.NET SARASWATHI RAMALINGAM
PROGRAMMING USING C#.NET SARASWATHI RAMALINGAMPROGRAMMING USING C#.NET SARASWATHI RAMALINGAM
PROGRAMMING USING C#.NET SARASWATHI RAMALINGAM
 
Rapidly Building Data Driven Web Pages with Dynamic ADO.NET
Rapidly Building Data Driven Web Pages with Dynamic ADO.NETRapidly Building Data Driven Web Pages with Dynamic ADO.NET
Rapidly Building Data Driven Web Pages with Dynamic ADO.NET
 
Introduction To R
Introduction To RIntroduction To R
Introduction To R
 
Mr bi
Mr biMr bi
Mr bi
 
Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist Essential Data Engineering for Data Scientist
Essential Data Engineering for Data Scientist
 
Dev381.Pp
Dev381.PpDev381.Pp
Dev381.Pp
 
Data encoding and Metadata for Streams
Data encoding and Metadata for StreamsData encoding and Metadata for Streams
Data encoding and Metadata for Streams
 
Integration
IntegrationIntegration
Integration
 
Overview of running R in the Oracle Database
Overview of running R in the Oracle DatabaseOverview of running R in the Oracle Database
Overview of running R in the Oracle Database
 
Database Survival Guide: Exploratory Webcast
Database Survival Guide: Exploratory WebcastDatabase Survival Guide: Exploratory Webcast
Database Survival Guide: Exploratory Webcast
 
The Relevance of the Apache Solr Semantic Knowledge Graph
The Relevance of the Apache Solr Semantic Knowledge GraphThe Relevance of the Apache Solr Semantic Knowledge Graph
The Relevance of the Apache Solr Semantic Knowledge Graph
 
R tutorial
R tutorialR tutorial
R tutorial
 
NHibernate
NHibernateNHibernate
NHibernate
 
Defense Against the Dark Arts: Protecting Your Data from ORMs
Defense Against the Dark Arts: Protecting Your Data from ORMsDefense Against the Dark Arts: Protecting Your Data from ORMs
Defense Against the Dark Arts: Protecting Your Data from ORMs
 
The Dirty Work -- Why Data Must Be Reconciled
The Dirty Work -- Why Data Must Be ReconciledThe Dirty Work -- Why Data Must Be Reconciled
The Dirty Work -- Why Data Must Be Reconciled
 
DATA MINING USING R (1).pptx
DATA MINING USING R (1).pptxDATA MINING USING R (1).pptx
DATA MINING USING R (1).pptx
 
R Programming - part 1.pdf
R Programming - part 1.pdfR Programming - part 1.pdf
R Programming - part 1.pdf
 

Más de DigitalPreservationEurope

Más de DigitalPreservationEurope (20)

Infrastructure Training Session
Infrastructure Training SessionInfrastructure Training Session
Infrastructure Training Session
 
Drm Training Session
Drm Training SessionDrm Training Session
Drm Training Session
 
2009 Barcelona Wepreserve Nestor
2009 Barcelona Wepreserve Nestor2009 Barcelona Wepreserve Nestor
2009 Barcelona Wepreserve Nestor
 
Trusted Repositories
Trusted RepositoriesTrusted Repositories
Trusted Repositories
 
Preservation Metadata
Preservation MetadataPreservation Metadata
Preservation Metadata
 
An Introduction to Digital Preservation
An Introduction to Digital PreservationAn Introduction to Digital Preservation
An Introduction to Digital Preservation
 
Introduction to Planets
Introduction to PlanetsIntroduction to Planets
Introduction to Planets
 
Digital Preservation Process: Preparation and Requirements
Digital Preservation Process: Preparation and RequirementsDigital Preservation Process: Preparation and Requirements
Digital Preservation Process: Preparation and Requirements
 
The Planets Preservation Planning workflow
The Planets Preservation Planning workflowThe Planets Preservation Planning workflow
The Planets Preservation Planning workflow
 
Building A Sustainable Model for Digital Preservation Services, Clive Billenn...
Building A Sustainable Model for Digital Preservation Services, Clive Billenn...Building A Sustainable Model for Digital Preservation Services, Clive Billenn...
Building A Sustainable Model for Digital Preservation Services, Clive Billenn...
 
Preservation Metadata, Michael Day, DCC
Preservation Metadata, Michael Day, DCCPreservation Metadata, Michael Day, DCC
Preservation Metadata, Michael Day, DCC
 
PLATTER - Jan Hutar
PLATTER - Jan HutarPLATTER - Jan Hutar
PLATTER - Jan Hutar
 
Sustainability Clive Billenness
Sustainability Clive  BillennessSustainability Clive  Billenness
Sustainability Clive Billenness
 
Significant Characteristics In Planets Manfred Thaller
Significant Characteristics In Planets Manfred ThallerSignificant Characteristics In Planets Manfred Thaller
Significant Characteristics In Planets Manfred Thaller
 
Shaman Project Hemmje
Shaman Project  HemmjeShaman Project  Hemmje
Shaman Project Hemmje
 
Scalable Services For Digital Preservation Ross King
Scalable Services For Digital Preservation Ross KingScalable Services For Digital Preservation Ross King
Scalable Services For Digital Preservation Ross King
 
Risks Benefits And Motivations Seamus Ross
Risks Benefits And Motivations Seamus RossRisks Benefits And Motivations Seamus Ross
Risks Benefits And Motivations Seamus Ross
 
Preservation Challenge Radioactive Waste Ian Upshall
Preservation Challenge Radioactive Waste Ian UpshallPreservation Challenge Radioactive Waste Ian Upshall
Preservation Challenge Radioactive Waste Ian Upshall
 
Preservation And Reuse In High Energy Physics Salvatore Mele
Preservation And Reuse In High Energy Physics Salvatore MelePreservation And Reuse In High Energy Physics Salvatore Mele
Preservation And Reuse In High Energy Physics Salvatore Mele
 
Platter Colin Rosenthal
Platter Colin RosenthalPlatter Colin Rosenthal
Platter Colin Rosenthal
 

Último

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Último (20)

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 

Caspar Preservation Methodology Steve Renkin

  • 1. CASPAR - Preservation Methodology Validation (1) testbed demos (2) Audit and Certification and (3) Provenance and Authenticity Stephen Rankin (STFC), Matt Dunckley (STFC), Brian Mcilwrath (STFC), Esther Conway (STFC), David Giaretta (STFC), CASPAR & DCC
  • 2. Outline • RepInfo Network Example (Again). • Validation with RepInfo.
  • 4. Validation with RepInfo • Can a future user use the RepInfo and RepInfo network to reuse the data? • Documents, software etc as RepInfo difficult to validate against the data. Quality of RepInfo. What do we mean by reuse the data?
  • 5. What do we mean by reuse the data? • Simplify the problem! • Go from the data bits to a simple Information Object • This means reading the data with the Structure RepInfo and applying the Semantics RepInfo • Feed Information Object into NEW software.
  • 6. RepInfo for a Hot Cup of Tea
  • 7. Creation of new Formal Representation Information • Curating existing RepInfo is OK and necessary, but we can do more… • Typically documents that describe the structure and semantics do not allow easy validation. • Information in documents typically not machine readable. • Semantics is usually poorly linked to the data and data values. • The types of simple data objects (data types) are not explicitly defined. • Data will have to be read into future software if it is going to be reused.
  • 8. Formal Descriptions of Structure and Semantics • Digital data is composed of bits which somehow have to be mapped to primitive software data types (numbers, strings etc) • Numbers etc need to be assigned simple semantics, Name, Description, Unites etc.
  • 9. Formal Descriptions of Structure • CNES EAST tools (http://east.cnes.f), OASIS, EAST C Library (reference implementation). • Also DEBAT (BEST Tools) http://debat.c-s.fr/ • Data Request Broker (DRB) - http://www.gael.fr/drb/site/ • JNI Wrapper for EAST C Library in our SVN repository (jnieast). • Interfaces for a more general data description language structure and semantics API in our SVN (DSSIL). http://developers.casparpreserves.eu:8080/
  • 10.
  • 11.
  • 12. Formal Descriptions of Structure (Tool Examples)
  • 13. Formal Descriptions of Structure (Examples)
  • 14. Formal Descriptions of Semantics DEDSL Abstract, PVL, and XML(DTD) syntax for defining some simple data semantics. • Only a small number of required attributes for a given data structure, NAME, DEFINITION, UNITS (conditional), ENTITY_TYPE (conditional), ENUMERATION_VALUES (conditional), TEXT_SIZE (conditional). • You can define your own attributes. • You can reuse definitions from other dictionaries. Link the data structures to the semantics via the EAST access path or an XPATH, i.e. define a new attribute – EAST_PATH (OASIS tool does this).
  • 15. Data Object Virtualisation • Just talked about simple semantics for data values. • But data is typically a set of objects. • Table is composed of columns and columns are composed of values. • Semantics associated with the columns and with the table. • You may also have collections of tables that are all related, and the relationships also have semantics. • Current working is on a collections interface.
  • 17. Image (GRID) Data Example
  • 18. CASPAR/DCC RepInfo Tools Training Material • Video tutorials of using the RepInfo GUI tool • Documentation tutorial for using the RepInfo GUI tool. • Brain Training Game for RepInfo • Video tutorial of using RepInfo creation tools (creating structure and semantics). • Packaging tutorial. • More under construction – Programming guides etc. • All very much under development but a preview for the material is available at: http://twiki.dcc.rl.ac.uk/bin/view/Main/TheStoryPlanAndPresentation • Or we can come to you or you can come to us for a tutorial on our tools. You just need to ask!