FAIR Workflows and
Research Objects get a Workout
Carole Goble
The University of Manchester, UK
carole.goble@manchester.ac.uk
DataVerse Community Conference 2021, 15th June 2021
EOSC-Life pan-national data & method thematic commons for
bioscience data and methods
Using and sharing data, tools and workflows in the cloud
Infrastructure Zoo
Flows around a Federated & Diverse System
1466 data repositories / archives
916 data format and metadata
standards*
Not including the institutional or
national repositories like
DataVerse
https://fairsharing.org/ accessed May 2021
From compounds to clinical trials
Primary data - Secondary use
Community domain enclaves
fragmented resources
flow across platforms & sovereignties
Workflows as an entry point and
integration mechanism
Legacy
• data repositories & data platforms
• processing and workflow
platforms
CryoEM Image Analysis Metagenomic Pipelines Drug Discovery
Quality control
Replication
Scrutiny
Shared know-how
Repetition
SARS-CoV-2 pre-processing, monitoring, analysis
https://elixir-europe.org/news/covid-19-variants-galaxy
Beyond Data: Computational Workflows as method objects
to be shared, ported and reused & repurposed
Multi-step
Leverage third party codes
Scalable processing of data
Transparent research
Computational Workflows
Specification
description
Software
Execution
A special kind of software
Separation of the workflow specification from its execution
Precise description of a procedure: a multi-step
process coordinated by input/output
data relationships (data types).
Execution of computational
processes (run a code, invoke a
service…).
Data is consumed and produced by
each step.
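A minimal Python sketch of that separation, not taken from any particular workflow system: the specification is plain data naming each step and its input/output relationships, while the executable codes live in a separate table. The step names, data names and the `run` helper are all hypothetical.

```python
from typing import Callable, Dict, List

# Specification: a declarative description of the steps, wired together
# by named data items. Step and data names are hypothetical.
SPEC: List[dict] = [
    {"step": "trim", "inputs": ["raw_reads"], "outputs": ["clean_reads"]},
    {"step": "count", "inputs": ["clean_reads"], "outputs": ["read_count"]},
]

# Execution: the codes each step invokes, kept apart from the specification.
# Each callable returns a tuple with one value per declared output.
TOOLS: Dict[str, Callable[..., tuple]] = {
    "trim": lambda reads: ([r.strip("N") for r in reads],),
    "count": lambda reads: (len(reads),),
}


def run(spec: List[dict], tools: Dict[str, Callable[..., tuple]], data: dict) -> dict:
    """Run every step once its inputs exist; each step consumes and produces data."""
    data = dict(data)
    pending = list(spec)
    while pending:
        ready = [s for s in pending if all(i in data for i in s["inputs"])]
        if not ready:
            names = ", ".join(s["step"] for s in pending)
            raise ValueError("steps with unsatisfied inputs: " + names)
        for step in ready:
            results = tools[step["step"]](*(data[i] for i in step["inputs"]))
            data.update(zip(step["outputs"], results))
            pending.remove(step)
    return data


result = run(SPEC, TOOLS, {"raw_reads": ["NNACGT", "TTGAN"]})
```

Because `SPEC` never mentions the callables, the same description could in principle be handed to a different executor, which is the point of keeping the two apart.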
<my scripts>
A Zoo of Workflow Systems and “systems”*
Native repositories
*https://s.apache.org/existing-workflow-systems
EMBL-EBI MGnify
Metagenomics
pipelines
Command line tools
Sub-Workflows
Containers
Beyond Data: Multi-part Research Objects
dependencies and associates scattered across repositories and within repositories
made at different times by different people
Workflow itself Workflow associated Objects
Specification
descriptions
Parameters
Input
Datasets
Output
Datasets
Runtime details & Provenance
Documentation
Bind to Dependencies
- Containers
- Codes
- Sub-workflows
Bind to particular test engines
Publications
Image
Other workflows
Sub workflows
Software
Execution
Inputs and outputs
Author
Beyond Data: Computational Workflows as multi-part method objects
to be shared, ported and reused & repurposed
Services for FAIR Workflows
• Describe workflows with PIDs and metadata
• Flow: Move workflows between services and
platforms
• Parts: Package (scattered) objects linked
together by context (metadata files with their objects)
Honouring
• the legacy and diverse ecosystem
• buy-in from platforms
Be KISSy
• practical and developer friendly standards,
and webby mechanisms
• extensible openendedness – unknown
unknowns & diversity….
Workflow
Registry
Workflow
Systems
Repos Containers Deploys
Testing
Monitoring
Open Registry for Workflows
Perpetual Development in the open by an open community
https://workflowhub.eu
Towards FAIR workflows and FAIR registry
• Find and Access Workflows
– Workflows may remain in their native repositories in
their native form, or they can be deposited.
– Register (push) / Harvest (pull)
• Workflow interoperability and reusability
– Using a metadata standards framework
Makers are the custodians
• people organisation: spaces, teams, organisations …
• workflow organisation: collections, tagging, facets ...
• credit: for submitters and authors
Open to any platform,
any subject, any person
WorkflowHub Club
Access: TRS (Tool Registry Service) API
FAIR Workflows are FAIR Software
living and with dependencies … workflow history/provenance
Indicators of Status
Workflow
monitoring
Register versions
(Support GitHub Actions)
Incremental metadata and
supplementary materials
(Tracking & Lifting
out subworkflows)
Which Workflow Objects are FAIR?
• workflow specification with test or
exemplar data?
• implementation of that design in a
particular WfMS?
• instantiation of that implementation
ready to run with input data, parameters
set, computational services spun up?
• run result with intermediate/final data
products and provenance logs?
• In practice this is a bit blurry.
A metadata
framework
extensible
enough to cope
FAIR Workflows are FAIR Digital Objects
Descriptive, machine actionable metadata framework from the community
practical and developer friendly standards, extensible openendedness
Standardised
metadata about the
workflows
for registration,
discovery
Schema.org profile and types
ComputationalWorkflow
FormalParameter
ComputationalTool
Canonical workflow
description of the
workflow itself
Executable and
Abstract form
Type the input and
output data formats
of the steps
Ontology of types of data
and data identifiers, data
formats, operations in life
sciences
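A hedged sketch of what such standardised metadata might look like, written as Python dicts in a JSON-LD style. The type names `ComputationalWorkflow` and `FormalParameter` come from the slide; the properties chosen here (`input`, `output`, `encodingFormat`, `programmingLanguage`), the identifiers and the `example.org` format IRIs are illustrative assumptions, and the JSON-LD context is left out.

```python
def formal_parameter(param_id: str, name: str, format_iri: str) -> dict:
    """Describe one typed input or output of a workflow."""
    return {
        "@id": param_id,
        "@type": "FormalParameter",
        "name": name,
        # Placeholder IRI; in practice this would point at an ontology term.
        "encodingFormat": {"@id": format_iri},
    }


def describe_workflow(workflow_id, name, language, inputs, outputs) -> list:
    """Return the workflow entity followed by its parameter entities."""
    workflow = {
        "@id": workflow_id,
        "@type": "ComputationalWorkflow",
        "name": name,
        "programmingLanguage": language,
        "input": [{"@id": p["@id"]} for p in inputs],
        "output": [{"@id": p["@id"]} for p in outputs],
    }
    return [workflow, *inputs, *outputs]


def unresolved_references(entities: list) -> list:
    """List input/output references that no entity in the list describes."""
    known = {e["@id"] for e in entities}
    missing = []
    for entity in entities:
        for key in ("input", "output"):
            for ref in entity.get(key, []):
                if ref["@id"] not in known:
                    missing.append(ref["@id"])
    return missing


# Hypothetical workflow with one typed input and one typed output.
reads = formal_parameter("#reads", "reads", "https://example.org/formats/fastq")
table = formal_parameter("#table", "abundance_table", "https://example.org/formats/tsv")
entities = describe_workflow("workflow.cwl", "Example pipeline", "CWL", [reads], [table])
```

Typing each parameter with a format identifier is what lets a registry answer questions such as "which workflows accept this kind of file".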
Upload and Download the parts?
Exchange between services & platforms?
Sharing & archiving the components of science
Let's step back!
Beyond Data: Multi-part Research = Multi-part ROs
Each object has its own
metadata and repositories
Integrated view & context over
fragmented resources using
their PIDs and metadata
Need a way of packaging up,
describing the package and
parts, citing, shipping around,
storing, archiving, sharing.
Reference real things. Like
people, mice and equipment.
Beyond Data: Multi-part Research Objects
Describing a Dataset as a
Digital Object
A way of packaging up,
describing the package and
parts, citing, shipping around,
storing, archiving, sharing.
Even reference real things. Like
people, mice and equipment.
Image Courtesy of Peter Sefton: https://arkisto-platform.github.io/standards/ro-crate/
The dataset may contain any kind of
data resource, about anything, in any
format as a file or URL. They can be
scattered across repositories.
Each resource can have a machine
readable description in JSON-LD
format
A human-readable description and
preview can be in an HTML file
that lives alongside the metadata
Provenance and workflow information
can be included - to assist in data and
research-process re-use
RO-Crate Digital Objects may be
packaged for distribution, e.g. via Zip,
BagIt and OCFL objects
Courtesy Peter Sefton, https://arkisto-platform.github.io/standards/ro-crate/
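The description above can be sketched as a small Python helper that builds the JSON-LD metadata for a crate. This is a minimal illustration assuming the RO-Crate 1.1 layout (a metadata descriptor plus a root dataset with `hasPart`); the `make_crate` helper, the file names and the `example.org` URL are hypothetical.

```python
import json


def make_crate(name: str, description: str, parts: dict) -> dict:
    """Build a minimal RO-Crate metadata document as a Python dict.

    `parts` maps a relative file path or a URL to a human-readable name.
    """
    part_entities = [
        {"@id": part_id, "@type": "File", "name": label}
        for part_id, label in parts.items()
    ]
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [
            # Metadata descriptor: identifies which entity is the root dataset.
            {
                "@id": "ro-crate-metadata.json",
                "@type": "CreativeWork",
                "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
                "about": {"@id": "./"},
            },
            # Root dataset: the package itself, listing its parts by identifier.
            {
                "@id": "./",
                "@type": "Dataset",
                "name": name,
                "description": description,
                "hasPart": [{"@id": e["@id"]} for e in part_entities],
            },
            *part_entities,
        ],
    }


crate = make_crate(
    "Example crate",
    "One local file and one resource that stays in its own repository.",
    {
        "results/table.csv": "Summary table",
        "https://example.org/data/reads.fastq": "Remote reads",
    },
)
metadata_text = json.dumps(crate, indent=2)  # content for ro-crate-metadata.json
```

Writing `metadata_text` to a file named `ro-crate-metadata.json` next to the data would give the package its machine-readable description; the remote part is only referenced, not copied.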
A data
repository
perspective
Not just for workflows!
For any kind of object
data, publications, SOPs, software …
and data repositories!
especially data repositories!
Aggregate files, any URI-addressable content, another
RO-Crate, along with contextual information, into a citable
RO-Crate which has its own metadata.
Can use as a bag of references:
large/sensitive datasets
citation aggregator
FAIR
here
FAIR
here
Unbounded Research Objects
Anything referenceable that may be scattered
across different repositories and/or different
datasets in the same repository.
Self describing integrated view spanning over
fragmented resources using PIDs and metadata
Metadata held alongside heterogeneous data
Infrastructure independent
• Exchange between repositories, registries and
services.
• Avoid vendor lock-in
Practical, lightweight approach. Machine
and human readable, search engine friendly
and developer familiar, blah blah
FAIR Object middleware/underware
Standard Web Native PIDs + JSON-LD +
Schema.org, off the shelf archiving formats
Self-describing, Typed by profiles + add
more schema.org and domain ontologies
Extensible, descriptive and content
openendedness, honouring legacy, diversity,
and known and unknown unknowns - one size
does not fit all, blah blah
A Graph inside the RO-Crate
PIDs connect the Graph to the
outside world
http://www.researchobject.org/ro-crate/
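To illustrate the idea of a graph whose identifiers point outward, here is a small sketch that walks a crate's `@graph` and collects every `@id` that is a web address rather than a path inside the package. The `external_ids` helper and the sample graph (including its `example.org` identifiers) are assumptions made for the example.

```python
from urllib.parse import urlparse


def external_ids(crate: dict) -> list:
    """Return the sorted http(s) identifiers referenced anywhere in the graph."""
    found = set()

    def walk(value):
        if isinstance(value, dict):
            ref = value.get("@id")
            if isinstance(ref, str) and urlparse(ref).scheme in ("http", "https"):
                found.add(ref)
            for child in value.values():
                walk(child)
        elif isinstance(value, list):
            for child in value:
                walk(child)

    walk(crate.get("@graph", []))
    return sorted(found)


# Hypothetical graph: one local file, one remote dataset, one author identifier.
sample = {
    "@graph": [
        {
            "@id": "./",
            "@type": "Dataset",
            "author": {"@id": "https://example.org/people/alice"},
            "hasPart": [
                {"@id": "data.csv"},
                {"@id": "https://example.org/big-dataset"},
            ],
        },
        {"@id": "data.csv", "@type": "File"},
    ]
}

links = external_ids(sample)
```

Here `links` holds the two `example.org` identifiers, while `./` and `data.csv` are treated as internal to the crate.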
RO-Crate variants: Profiles are extensible typing
RO-Crates collect metadata
• Workflow-RO-Crate
• Workflow-Testing-RO-Crate
• Workflow-Run-RO-Crate
• BioComputeObject-RO-Crate
• Galaxy-Workflow-RO-Crate
• maDMP RO-Crate*
• DataRepo-RO-Crate
• DataRepo-DataCube-RO-Crate
• Aggregated DataCitation RO-Crate
• Secure Bags of PIDs to sensitive / large data
*https://repository.publisso.de/resource/frl:6423291
https://www.researchobject.org/ro-crate/profiles.html
A step towards FAIR Digital Objects*
“To be FAIR each digital object
type has its own metadata
requirements,
and may have its own repositories
and registries”
FAIR Digital Objects for Science: From Data Pieces to Actionable
Knowledge Units: https://doi.org/10.3390/publications8020021
https://fairdo.org
FAIR Digital Objects
Actionable knowledge unit
Digital butterfly – digital twins
Bags of references
courtesy Dimitris Koureas
Coordinator DiSSCo EU
Research Infrastructure
Specimen object image
courtesy of Alex Hardisty
Specimen Data Refinery
Workflows to Digitise Natural History Specimens
FAIR Digital Objects -> Packaged + Actionable
FAIR Digital Object
Framework
Open Digital Specimen
Workflow Infrastructure
courtesy of Alex Hardisty and Laurence Livermore
Real Use Cases Considered Essential!
• Building out in the open accelerated progress
RO-Crate is metadata middleware
• smart use of wheels already invented
• it takes a village: get tools, services on board
• developer friendly, firm best practice
A little bit of semantics goes a long way…
• Schema.org + JSON-LD
…prepare for more
Known and Unknown unknowns, One size does not fit all
• descriptive openendedness, multi-interpretation
Metadata sucks
• auto-curation is the way forward folks!
What about
the workout?
What about
FAIR?
FAIR at multiple levels & granularities
• Workflows & RO-Crates are composite and
nested, with dependencies
• FAIR all the way down
• Not always compatible – e.g. licenses
FAIR+
• Reusable and Usable workflows: testing &
parameter validation. Documentation.
FAIR software paradigm is pervasive
• Applies to RO-Crate Research Objects
FAIR takes a village, of course
C. Goble, S. Cohen-Boulakia, S. Soiland-Reyes,
D. Garijo, Y. Gil, M.R. Crusoe, K. Peters & D.
Schober. FAIR computational workflows. Data
Intelligence 2 (2020), 108–121.
doi: 10.1162/dint_a_00033
What about DataVerse?
Workflows have data and software
characteristics
RO-Crate preserves metadata and the objects
– workflow, data, datasets whatever…
• Archive/republish independent of
WorkflowHub
• Move content from one repository to
another, one service to another
• Point to content and don’t move it
• Sharing reproducible results & methods
Set data and
workflows and their
metadata free!
RO-Crate RepositoryCollection and RepositoryObject
represent records in a repository, to describe an export from a repository or
digital library
https://www.researchobject.org/ro-crate/community
https://about.workflowhub.eu/community/

Más contenido relacionado

La actualidad más candente

An Introduction to SPARQL
An Introduction to SPARQLAn Introduction to SPARQL
An Introduction to SPARQLOlaf Hartig
 
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODO
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODOLinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODO
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODOChris Mungall
 
Introduction to RDF
Introduction to RDFIntroduction to RDF
Introduction to RDFNarni Rajesh
 
SPARQL introduction and training (130+ slides with exercices)
SPARQL introduction and training (130+ slides with exercices)SPARQL introduction and training (130+ slides with exercices)
SPARQL introduction and training (130+ slides with exercices)Thomas Francart
 
RDF, SPARQL and Semantic Repositories
RDF, SPARQL and Semantic RepositoriesRDF, SPARQL and Semantic Repositories
RDF, SPARQL and Semantic RepositoriesMarin Dimitrov
 
Defined versus Asserted Classes: Working with the OWL Ontologies
Defined versus Asserted Classes: Working with the OWL OntologiesDefined versus Asserted Classes: Working with the OWL Ontologies
Defined versus Asserted Classes: Working with the OWL OntologiesNeuroscience Information Framework
 
온톨로지 개념 및 표현언어
온톨로지 개념 및 표현언어온톨로지 개념 및 표현언어
온톨로지 개념 및 표현언어Dongbum Kim
 
Linked Data and Knowledge Graphs -- Constructing and Understanding Knowledge ...
Linked Data and Knowledge Graphs -- Constructing and Understanding Knowledge ...Linked Data and Knowledge Graphs -- Constructing and Understanding Knowledge ...
Linked Data and Knowledge Graphs -- Constructing and Understanding Knowledge ...Jeff Z. Pan
 
SHACL in Apache jena - ApacheCon2020
SHACL in Apache jena - ApacheCon2020SHACL in Apache jena - ApacheCon2020
SHACL in Apache jena - ApacheCon2020andyseaborne
 
Introduction To RDF and RDFS
Introduction To RDF and RDFSIntroduction To RDF and RDFS
Introduction To RDF and RDFSNilesh Wagmare
 
RDF 개념 및 구문 소개
RDF 개념 및 구문 소개RDF 개념 및 구문 소개
RDF 개념 및 구문 소개Dongbum Kim
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational WorkflowsCarole Goble
 
書誌データのLOD化: データソン的デモンストレーション
書誌データのLOD化: データソン的デモンストレーション書誌データのLOD化: データソン的デモンストレーション
書誌データのLOD化: データソン的デモンストレーションKouji Kozaki
 
Debunking some “RDF vs. Property Graph” Alternative Facts
Debunking some “RDF vs. Property Graph” Alternative FactsDebunking some “RDF vs. Property Graph” Alternative Facts
Debunking some “RDF vs. Property Graph” Alternative FactsNeo4j
 
[오원석 Kswc2010]데이터의 가치를 높이는 linked data
[오원석 Kswc2010]데이터의 가치를 높이는 linked data[오원석 Kswc2010]데이터의 가치를 높이는 linked data
[오원석 Kswc2010]데이터의 가치를 높이는 linked dataLiST Inc
 
Validating RDF data: Challenges and perspectives
Validating RDF data: Challenges and perspectivesValidating RDF data: Challenges and perspectives
Validating RDF data: Challenges and perspectivesJose Emilio Labra Gayo
 

La actualidad más candente (20)

ShEx vs SHACL
ShEx vs SHACLShEx vs SHACL
ShEx vs SHACL
 
RDF Data Model
RDF Data ModelRDF Data Model
RDF Data Model
 
An Introduction to SPARQL
An Introduction to SPARQLAn Introduction to SPARQL
An Introduction to SPARQL
 
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODO
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODOLinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODO
LinkML Intro July 2022.pptx PLEASE VIEW THIS ON ZENODO
 
Introduction to RDF
Introduction to RDFIntroduction to RDF
Introduction to RDF
 
SPARQL introduction and training (130+ slides with exercices)
SPARQL introduction and training (130+ slides with exercices)SPARQL introduction and training (130+ slides with exercices)
SPARQL introduction and training (130+ slides with exercices)
 
RDF, SPARQL and Semantic Repositories
RDF, SPARQL and Semantic RepositoriesRDF, SPARQL and Semantic Repositories
RDF, SPARQL and Semantic Repositories
 
Defined versus Asserted Classes: Working with the OWL Ontologies
Defined versus Asserted Classes: Working with the OWL OntologiesDefined versus Asserted Classes: Working with the OWL Ontologies
Defined versus Asserted Classes: Working with the OWL Ontologies
 
온톨로지 개념 및 표현언어
온톨로지 개념 및 표현언어온톨로지 개념 및 표현언어
온톨로지 개념 및 표현언어
 
Linked Data and Knowledge Graphs -- Constructing and Understanding Knowledge ...
Linked Data and Knowledge Graphs -- Constructing and Understanding Knowledge ...Linked Data and Knowledge Graphs -- Constructing and Understanding Knowledge ...
Linked Data and Knowledge Graphs -- Constructing and Understanding Knowledge ...
 
SHACL in Apache jena - ApacheCon2020
SHACL in Apache jena - ApacheCon2020SHACL in Apache jena - ApacheCon2020
SHACL in Apache jena - ApacheCon2020
 
Introduction To RDF and RDFS
Introduction To RDF and RDFSIntroduction To RDF and RDFS
Introduction To RDF and RDFS
 
RDF 개념 및 구문 소개
RDF 개념 및 구문 소개RDF 개념 및 구문 소개
RDF 개념 및 구문 소개
 
RDF validation tutorial
RDF validation tutorialRDF validation tutorial
RDF validation tutorial
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
書誌データのLOD化: データソン的デモンストレーション
書誌データのLOD化: データソン的デモンストレーション書誌データのLOD化: データソン的デモンストレーション
書誌データのLOD化: データソン的デモンストレーション
 
Debunking some “RDF vs. Property Graph” Alternative Facts
Debunking some “RDF vs. Property Graph” Alternative FactsDebunking some “RDF vs. Property Graph” Alternative Facts
Debunking some “RDF vs. Property Graph” Alternative Facts
 
ShEx by Example
ShEx by ExampleShEx by Example
ShEx by Example
 
[오원석 Kswc2010]데이터의 가치를 높이는 linked data
[오원석 Kswc2010]데이터의 가치를 높이는 linked data[오원석 Kswc2010]데이터의 가치를 높이는 linked data
[오원석 Kswc2010]데이터의 가치를 높이는 linked data
 
Validating RDF data: Challenges and perspectives
Validating RDF data: Challenges and perspectivesValidating RDF data: Challenges and perspectives
Validating RDF data: Challenges and perspectives
 

Similar a FAIR Workflows and Research Objects get a Workout

RO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital ObjectsRO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital ObjectsCarole Goble
 
FAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practiceFAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practiceCarole Goble
 
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Anita de Waard
 
Research Shared: researchobject.org
Research Shared: researchobject.orgResearch Shared: researchobject.org
Research Shared: researchobject.orgNorman Morrison
 
Research Object Community Update
Research Object Community UpdateResearch Object Community Update
Research Object Community UpdateCarole Goble
 
FDO as building block for digitization technology stacks
FDO as building block for digitization technology stacksFDO as building block for digitization technology stacks
FDO as building block for digitization technology stacksRaul Palma
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational WorkflowsCarole Goble
 
EOSC-Life Workflow Collaboratory
EOSC-Life Workflow CollaboratoryEOSC-Life Workflow Collaboratory
EOSC-Life Workflow CollaboratoryCarole Goble
 
Globus Integrations (GlobusWorld Tour - UCSD)
Globus Integrations (GlobusWorld Tour - UCSD)Globus Integrations (GlobusWorld Tour - UCSD)
Globus Integrations (GlobusWorld Tour - UCSD)Globus
 
Sword Cetis 2007 06 29
Sword Cetis 2007 06 29Sword Cetis 2007 06 29
Sword Cetis 2007 06 29Julie Allinson
 
The Rhetoric of Research Objects
The Rhetoric of Research ObjectsThe Rhetoric of Research Objects
The Rhetoric of Research ObjectsCarole Goble
 
Globus Integrations (GlobusWorld Tour - UMich)
Globus Integrations (GlobusWorld Tour - UMich)Globus Integrations (GlobusWorld Tour - UMich)
Globus Integrations (GlobusWorld Tour - UMich)Globus
 
Tripal within the Arabidopsis Information Portal - PAG XXIII
Tripal within the Arabidopsis Information Portal - PAG XXIIITripal within the Arabidopsis Information Portal - PAG XXIII
Tripal within the Arabidopsis Information Portal - PAG XXIIIVivek Krishnakumar
 
Research Objects: more than the sum of the parts
Research Objects: more than the sum of the partsResearch Objects: more than the sum of the parts
Research Objects: more than the sum of the partsCarole Goble
 
ROHub-Argos integration
ROHub-Argos integrationROHub-Argos integration
ROHub-Argos integrationRaul Palma
 
DSpace-CRIS: a CRIS enhanced repository platform
DSpace-CRIS: a CRIS enhanced repository platformDSpace-CRIS: a CRIS enhanced repository platform
DSpace-CRIS: a CRIS enhanced repository platformAndrea Bollini
 
Global RDF Descriptors for Germplasm Data
Global RDF Descriptors for Germplasm DataGlobal RDF Descriptors for Germplasm Data
Global RDF Descriptors for Germplasm DataVassilis Protonotarios
 
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)Blue BRIDGE
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational WorkflowsCarole Goble
 

Similar a FAIR Workflows and Research Objects get a Workout (20)

RO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital ObjectsRO-Crate: packaging metadata love notes into FAIR Digital Objects
RO-Crate: packaging metadata love notes into FAIR Digital Objects
 
FAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practiceFAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practice
 
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
 
Research Shared: researchobject.org
Research Shared: researchobject.orgResearch Shared: researchobject.org
Research Shared: researchobject.org
 
Research Object Community Update
Research Object Community UpdateResearch Object Community Update
Research Object Community Update
 
FDO as building block for digitization technology stacks
FDO as building block for digitization technology stacksFDO as building block for digitization technology stacks
FDO as building block for digitization technology stacks
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 
EOSC-Life Workflow Collaboratory
EOSC-Life Workflow CollaboratoryEOSC-Life Workflow Collaboratory
EOSC-Life Workflow Collaboratory
 
Globus Integrations (GlobusWorld Tour - UCSD)
Globus Integrations (GlobusWorld Tour - UCSD)Globus Integrations (GlobusWorld Tour - UCSD)
Globus Integrations (GlobusWorld Tour - UCSD)
 
Sword Cetis 2007 06 29
Sword Cetis 2007 06 29Sword Cetis 2007 06 29
Sword Cetis 2007 06 29
 
Sword Cetis 2007 06 29
Sword Cetis 2007 06 29Sword Cetis 2007 06 29
Sword Cetis 2007 06 29
 
The Rhetoric of Research Objects
The Rhetoric of Research ObjectsThe Rhetoric of Research Objects
The Rhetoric of Research Objects
 
Globus Integrations (GlobusWorld Tour - UMich)
Globus Integrations (GlobusWorld Tour - UMich)Globus Integrations (GlobusWorld Tour - UMich)
Globus Integrations (GlobusWorld Tour - UMich)
 
Tripal within the Arabidopsis Information Portal - PAG XXIII
Tripal within the Arabidopsis Information Portal - PAG XXIIITripal within the Arabidopsis Information Portal - PAG XXIII
Tripal within the Arabidopsis Information Portal - PAG XXIII
 
Research Objects: more than the sum of the parts
Research Objects: more than the sum of the partsResearch Objects: more than the sum of the parts
Research Objects: more than the sum of the parts
 
ROHub-Argos integration
ROHub-Argos integrationROHub-Argos integration
ROHub-Argos integration
 
DSpace-CRIS: a CRIS enhanced repository platform
DSpace-CRIS: a CRIS enhanced repository platformDSpace-CRIS: a CRIS enhanced repository platform
DSpace-CRIS: a CRIS enhanced repository platform
 
Global RDF Descriptors for Germplasm Data
Global RDF Descriptors for Germplasm DataGlobal RDF Descriptors for Germplasm Data
Global RDF Descriptors for Germplasm Data
 
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
Using e-infrastructures for biodiversity conservation - Gianpaolo Coro (CNR)
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows
 

Más de Carole Goble

The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...Carole Goble
 
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...Carole Goble
 
Research Software Sustainability takes a Village
Research Software Sustainability takes a VillageResearch Software Sustainability takes a Village
Research Software Sustainability takes a VillageCarole Goble
 
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...
Title: Love, Money, Fame, Nudge: Enabling Data-intensive BioScience through D...Carole Goble
 
Open Research: Manchester leading and learning
Open Research: Manchester leading and learningOpen Research: Manchester leading and learning
Open Research: Manchester leading and learningCarole Goble
 
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...
RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...RDMkit, a Research Data Management Toolkit.  Built by the Community for the ...
RDMkit, a Research Data Management Toolkit. Built by the Community for the ...Carole Goble
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational WorkflowsCarole Goble
 
FAIR Data Bridging from researcher data management to ELIXIR archives in the...
FAIR Data Bridging from researcher data management to ELIXIR archives in the...FAIR Data Bridging from researcher data management to ELIXIR archives in the...
FAIR Data Bridging from researcher data management to ELIXIR archives in the...Carole Goble
 
FAIR Computational Workflows
FAIR Computational WorkflowsFAIR Computational Workflows
FAIR Computational Workflows Carole Goble
 
The swings and roundabouts of a decade of fun and games with Research Objects
The swings and roundabouts of a decade of fun and games with Research Objects The swings and roundabouts of a decade of fun and games with Research Objects
The swings and roundabouts of a decade of fun and games with Research Objects Carole Goble
 
How are we Faring with FAIR? (and what FAIR is not)
How are we Faring with FAIR? (and what FAIR is not)How are we Faring with FAIR? (and what FAIR is not)
How are we Faring with FAIR? (and what FAIR is not)Carole Goble
 
What is Reproducibility? The R* brouhaha and how Research Objects can help
What is Reproducibility? The R* brouhaha and how Research Objects can helpWhat is Reproducibility? The R* brouhaha and how Research Objects can help
What is Reproducibility? The R* brouhaha and how Research Objects can helpCarole Goble
 
FAIR History and the Future
FAIR History and the FutureFAIR History and the Future
FAIR History and the FutureCarole Goble
 
ELIXIR UK Node presentation to the ELIXIR Board
ELIXIR UK Node presentation to the ELIXIR BoardELIXIR UK Node presentation to the ELIXIR Board
ELIXIR UK Node presentation to the ELIXIR BoardCarole Goble
 
FAIRy stories: tales from building the FAIR Research Commons
FAIRy stories: tales from building the FAIR Research CommonsFAIRy stories: tales from building the FAIR Research Commons
FAIRy stories: tales from building the FAIR Research CommonsCarole Goble
 
Let’s go on a FAIR safari!
Let’s go on a FAIR safari!Let’s go on a FAIR safari!
Let’s go on a FAIR safari!Carole Goble
 
Reproducible Research: how could Research Objects help
Reproducible Research: how could Research Objects helpReproducible Research: how could Research Objects help
Reproducible Research: how could Research Objects helpCarole Goble
 
Reflections on a (slightly unusual) multi-disciplinary academic career
Reflections on a (slightly unusual) multi-disciplinary academic careerReflections on a (slightly unusual) multi-disciplinary academic career
Reflections on a (slightly unusual) multi-disciplinary academic careerCarole Goble
 
Better Software, Better Research
Better Software, Better ResearchBetter Software, Better Research
Better Software, Better ResearchCarole Goble
 
Reproducibility (and the R*) of Science: motivations, challenges and trends
Reproducibility (and the R*) of Science: motivations, challenges and trendsReproducibility (and the R*) of Science: motivations, challenges and trends
Reproducibility (and the R*) of Science: motivations, challenges and trendsCarole Goble
 

Más de Carole Goble (20)

The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
The ELIXIR FAIR Knowledge Ecosystem for practical know-how: RDMkit and FAIRCo...
 
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science,  a Digital Research...
Can’t Pay, Won’t Pay, Don’t Pay: Delivering open science, a Digital Research...
 
Research Software Sustainability takes a Village
Research Software Sustainability takes a VillageResearch Software Sustainability takes a Village
FAIR Workflows and Research Objects get a Workout

  • 1. FAIR Workflows and Research Objects get a Workout Carole Goble The University of Manchester, UK carole.goble@manchester.ac.uk DataVerse Community Conference 2021, 15th June 2021
  • 2. EOSC-Life pan-national data & method thematic commons for bioscience data and methods Using and sharing data, tools and workflows in the cloud
  • 3. Infrastructure Zoo Flows around a Federated & Diverse System 1466 data repositories / archives 916 data format and metadata standards* Not including the institutional or national repositories like DataVerse https://fairsharing.org/ accessed May 2021 From compounds to clinical trials Primary data - Secondary use
  • 4. Infrastructure Zoo Flows around a Federated & Diverse System https://fairsharing.org/ accessed May 2021 Community domain enclaves fragmented resources flow across platforms & sovereignties Workflows as an entry point and integration mechanism Legacy • data repositories & data platforms • processing and workflow platforms
  • 5. CryoEM Image Analysis Metagenomic Pipelines Drug Discovery Quality control Replication Scrutiny Shared know-how Repetition
  • 6. SARS-CoV-2 pre-processing, monitoring, analysis https://elixir-europe.org/news/covid-19-variants-galaxy
  • 7. Beyond Data: Computational Workflows as method objects to be shared, ported and reused & repurposed Multi-step Leverage third party codes Scalable processing of data Transparent research Computational Workflows Specification description Software Execution A special kind of software Separation of the workflow specification from its execution Precise description of a procedure: multi-step process coordinated by input/output data relationships (data types). Execution of computational processes (run a code, invoke a service…). Data is consumed and produced by each step.
  • 8. Beyond Data: Computational Workflows as method objects to be shared, ported and reused & repurposed Multi-step Leverage third party codes Scalable processing of data Transparent research Computational Workflows <my scripts> A Zoo of Workflow Systems and “systems”* Native repositories *https://s.apache.org/existing-workflow-systems
  • 10. Beyond Data: Multi-part Research Objects dependencies and associates scattered across repositories and within repositories made at different times by different people Workflow itself Workflow associated Objects Specification descriptions Parameters Input Datasets Output Datasets Runtime details & Provenance Documentation Bind to Dependencies - Containers - Codes - Sub-workflows Bind to particular test engines Publications Image Other workflows Sub workflows Software Execution Inputs and outputs Author
  • 11. Beyond Data: Computational Workflows as multi-part method objects to be shared, ported and reused & repurposed Services for FAIR Workflows • Describe workflows with PIDs and metadata • Flow: Move workflows between services and platforms • Parts: Package (scattered) objects linked together by context (metadata files with their objects) Honouring • the legacy and diverse ecosystem • buy-in from platforms Be KISSy • practical and developer friendly standards, and webby mechanisms • extensible openendedness – unknown unknowns & diversity… Workflow Registry Workflow Systems Repos Containers Deploys Testing Monitoring
  • 12. Open Registry for Workflows Perpetual Development in the open by an open community https://workflowhub.eu Towards FAIR workflows and FAIR registry • Find and Access Workflows – Workflows may remain in their native repositories in their native form. Or can deposit. – Register (push) / Harvest (pull) • Workflows interoperability and reusability – Using metadata standards framework Makers are the custodians • people organisation: spaces, teams, organisations … • workflow organisation: collections, tagging, facets ... • credit: for submitters and authors Open to any platform, any subject, any person WorkflowHub Club
  • 13. TRS - Tool Registry Service API Access:
  • 14. FAIR Workflows are FAIR Software living and with dependencies… workflow history/provenance Indicators of Status Workflow monitoring Register versions (Support GitHub Actions) Incremental metadata and supplementary materials (Tracking & Lifting out subworkflows)
  • 15. Which Workflow Objects are FAIR? • workflow specification with test or exemplar data? • implementation of that design in a particular WfMS? • instantiation of that implementation ready to run with input data, parameters set, computational services spun up? • run result with intermediate/final data products and provenance logs? • In practice this is a bit blurry. A metadata framework extensible enough to cope
  • 16. FAIR Workflows are FAIR Digital Objects Descriptive, machine actionable metadata framework from the community practical and developer friendly standards, extensible openendedness Standardised metadata about the workflows for registration, discovery Schema.org profile and types ComputationalWorkflow FormalParameter ComputationalTool Canonical workflow description of the workflow itself Executable and Abstract form Type the input and output data formats of the steps Ontology of types of data and data identifiers, data formats, operations in life sciences Upload and Download the parts? Exchange between services & platforms? Sharing & archiving the components of science
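
The Schema.org types named on slide 16 can be illustrated with a small sketch. It builds a JSON-LD style description of a workflow as a `ComputationalWorkflow` whose inputs and outputs are `FormalParameter` entities. This is not taken from the slides: the helper names, the file name `workflow.ga`, the parameter ids and the `https://example.org/...` identifiers are all hypothetical, and the property choices (`input`, `output`, `programmingLanguage`, `encodingFormat`) reflect one reading of the Bioschemas profile that should be checked against the published version.

```python
import json


def formal_parameter(param_id, name, encoding_format=None):
    """Describe one workflow input or output as a FormalParameter entity."""
    entity = {"@id": param_id, "@type": "FormalParameter", "name": name}
    if encoding_format is not None:
        # The data format of the parameter; the URL used below is a placeholder.
        entity["encodingFormat"] = {"@id": encoding_format}
    return entity


def describe_workflow(path, name, language_id, inputs, outputs):
    """Describe the workflow file itself, referencing its parameters by @id."""
    return {
        "@id": path,
        "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
        "name": name,
        "programmingLanguage": {"@id": language_id},
        "input": [{"@id": p["@id"]} for p in inputs],
        "output": [{"@id": p["@id"]} for p in outputs],
    }


# Hypothetical example: one typed input, one untyped output.
inputs = [formal_parameter("#reads", "reads", "https://example.org/formats/fastq")]
outputs = [formal_parameter("#report", "report")]
workflow = describe_workflow(
    "workflow.ga", "Example QC workflow", "#galaxy", inputs, outputs
)

# The flat list of entities is what would sit in a JSON-LD @graph.
entities = [workflow, *inputs, *outputs]
entities_json = json.dumps(entities, indent=2)
```

Keeping parameters as separate entities that the workflow references by `@id` is what lets a registry index input and output formats without having to parse the native workflow file.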
  • 17. Let's step back! Beyond Data: Multi-part Research = Multi-part ROs Each object has its own metadata and repositories Integrated view & context over fragmented resources using their PIDs and metadata Need a way of packaging up, describing the package and parts, citing, shipping around, storing, archiving, sharing. Reference real things. Like people, mice and equipment.
  • 18. Beyond Data: Multi-part Research Objects Describing a Dataset as a Digital Object A way of packaging up, describing the package and parts, citing, shipping around, storing, archiving, sharing. Even reference real things. Like people, mice and equipment. Image Courtesy of Peter Sefton: https://arkisto-platform.github.io/standards/ro-crate/
  • 19. The dataset may contain any kind of data resource, about anything, in any format as a file or URL. They can be scattered across repositories. Each resource can have a machine readable description in JSON-LD format A human-readable description and preview can be in an HTML file that lives alongside the metadata Provenance and workflow information can be included - to assist in data and research-process re-use RO-Crate Digital Objects may be packaged for distribution, e.g. via Zip, BagIt and OCFL Objects Courtesy Peter Sefton, https://arkisto-platform.github.io/standards/ro-crate/ A data repository perspective
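
As a rough illustration of the description on slide 19, the sketch below assembles a minimal `ro-crate-metadata.json` document in the style of RO-Crate 1.1: a metadata descriptor, a root `Dataset`, and the parts it aggregates, one local file and one resource referenced by URL. The function name, the crate name and description, and the file paths and URL are hypothetical; only the `@context`, the descriptor entity and the root dataset shape follow the RO-Crate 1.1 specification as understood here.

```python
import json


def build_crate_metadata(name, description, parts):
    """Return a minimal RO-Crate 1.1 style metadata document as a dict."""
    # The metadata file describes itself and points at the root dataset.
    descriptor = {
        "@id": "ro-crate-metadata.json",
        "@type": "CreativeWork",
        "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
        "about": {"@id": "./"},
    }
    # The root dataset lists its parts by reference; details live in the graph.
    root = {
        "@id": "./",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "hasPart": [{"@id": part["@id"]} for part in parts],
    }
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [descriptor, root, *parts],
    }


# Hypothetical parts: a file inside the crate and a resource held elsewhere.
parts = [
    {"@id": "results/table.csv", "@type": "File", "name": "Result table"},
    {
        "@id": "https://example.org/data/reads.fastq",
        "@type": "File",
        "name": "Input reads held in another repository",
    },
]

crate = build_crate_metadata(
    "Example crate", "A hypothetical crate used for illustration.", parts
)
crate_json = json.dumps(crate, indent=2)
```

Writing `crate_json` to a file named `ro-crate-metadata.json` next to `results/table.csv` would give a directory that could then be zipped or bagged; the remote part stays where it is and is only referenced.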
  • 20. Not just for workflows! For any kind of object data, publications, SOPs, software … and data repositories! especially data repositories! Aggregate files, any URI-addressable content, another RO-Crate, along with contextual information, into a citable RO-Crate which has its own metadata. Can use as a bag of references: large/sensitive datasets citation aggregator FAIR here FAIR here
  • 21. Unbounded Research Objects Anything referenceable that may be scattered across different repositories and/or different datasets in the same repository. Self describing integrated view spanning over fragmented resources using PIDs and metadata Metadata held alongside heterogeneous data Infrastructure independent • Exchange between repositories, registries and services. • Avoid vendor lock-in
  • 22. Practical, lightweight approach Machine and human readable, search engine friendly and developer familiar, blah blah FAIR Object middleware/underware Standard Web Native PIDs + JSON-LD + Schema.org, off the shelf archiving formats Self-describing, Typed by profiles + add more schema.org and domain ontologies Extensible, descriptive and content openendedness, honouring legacy, diversity, and known and unknown unknowns - one size does not fit all, blah blah A Graph inside the RO-Crate PIDs connect the Graph to the outside world http://www.researchobject.org/ro-crate/
  • 23. RO-Crate variants: Profiles are extensible typing RO-Crates collect metadata Workflow-RO-Crate Workflow-Testing-RO-Crate Workflow-Run-RO-Crate *https://repository.publisso.de/resource/frl:6423291 https://www.researchobject.org/ro-crate/profiles.html BioComputeObject-RO-Crate Galaxy-Workflow-RO-Crate maDMP RO-Crate* DataRepo-RO-Crate DataRepo-DataCube-RO-Crate Aggregated DataCitation RO-Crate Secure Bags of PIDs to sensitive / large data
  • 24. A step towards FAIR Digital Objects* “To be FAIR each digital object type has its own metadata requirements, and may have its own repositories and registries” FAIR Digital Objects for Science: From Data Pieces to Actionable Knowledge Units: https://doi.org/10.3390/publications8020021 https://fairdo.org
  • 25. FAIR Digital Objects Actionable knowledge unit Digital butterfly – digital twins Bags of references courtesy Dimitris Koureas Coordinator DiSSCo EU Research Infrastructure Specimen object image courtesy of Alex Hardisty
  • 26. Specimen Data Refinery Workflows to Digitise Natural History Specimens FAIR Digital Objects -> Packaged + Actionable + FAIR Digital Object Framework Open Digital Specimen Workflow Infrastructure courtesy of Alex Hardisty and Laurence Livermore
  • 27. Real Use Cases Considered Essential! • Building out in the open accelerated progress RO-Crate is metadata middleware • smart use of wheels already invented • it takes a village: get tools, services on board • developer friendly, firm best practice A little bit of semantics goes a long way… • Schema.org + JSON-LD …prepare for more Known and Unknown unknowns, One size does not fit all • descriptive openendedness, multi-interpretation Metadata sucks • auto-curation is the way forward folks! What about the workout?
  • 28. What about FAIR? FAIR at multiple levels & granularities • Workflows & RO-Crates are composite and nested, with dependencies • FAIR all the way down • Not always compatible – e.g. licenses FAIR+ • Reusable and Usable workflows - testing & parameter validation. Documentation. FAIR software paradigm is pervasive • Applies to RO-Crate Research Objects FAIR takes a village, of course C. Goble, S. Cohen-Boulakia, S. Soiland-Reyes, D. Garijo, Y. Gil, M.R. Crusoe, K. Peters & D. Schober. FAIR computational workflows. Data Intelligence 2 (2020), 108–121. doi: 10.1162/dint_a_00033
  • 29. What about DataVerse? Workflows have data and software characteristics RO-Crate preserves metadata and the objects – workflow, data, datasets whatever… • Archive/republish independent of WorkflowHub • Move content from one repository to another, one service to another • Point to content and don’t move it • Sharing reproducible results & methods Set data and workflows and their metadata free! RO-Crate RepositoryCollection, RepositoryObject represents records in a repository to describe an export from a repository or digital library