Wf4Ever:
Preserving workflows as
digital Research Objects
       Stian Soiland-Reyes
  myGrid, University of Manchester

          EGI Community Forum 2012, Workflow Systems workshop
           Leibniz Supercomputing Centre, Munich, 2012-03-28
My background

Taverna - Scientific Workflow Management System
http://www.taverna.org.uk/
   ~85000 downloads
   EU projects: SCAPE, BioVeL, HELIO, e-Lico, VPH-SHARE, EGI-INSPiRE…

myExperiment - Web 3.0 virtual environment, library and social network for workflows
http://www.myexperiment.org/
   ~5000 registered users
   ~2200 workflows
   ~21 different systems

“A biologist would rather share their
 toothbrush than their gene name”




                                  Mike Ashburner and others
                                Professor in Dept of Genetics,
                                 University of Cambridge, UK
http://www.myexperiment.org/

“Facebook for Scientists” ...but different to Facebook!
A repository of research methods
A social network of people and things
A Social Virtual Research Environment

A probe into researcher behaviour
Open source (BSD) Ruby on Rails app
REST and SPARQL, Linked Data
Influenced BioCatalogue, MethodBox and SysMO-SEEK

myExperiment currently has 5378 members, 292 groups, 2273 workflows, 534 files and 217 packs
 Workflow Preservation
    Research Objects
       Provenance
    Recommendation
 Astronomy and Genomics
                           http://www.wf4ever-project.org/
Wf4Ever: Challenges
Preservation of scientific workflows in data-intensive science
» Scientific workflows enable automation of scientific methods and encourage best practices to be shared
» Workflows need to be preserved for
     › Reuse, fundamental for incremental scientific development
     › Method reproducibility, key for credit and publication
» Workflow preservation is complex!
» Heterogeneous types of information need to be aggregated, including workflows and related resources forming research objects
» Research objects need to be trusted and understandable n years from now
» Social aspects need to be addressed in order to support reuse in scientific communities
The R.* dimensions

Reusable. The key tenet of Research Objects is to support the sharing and reuse of data, methods and processes.
Repurposeable. Reuse may also involve the reuse of constituent parts of the Research Object.
Repeatable. There should be sufficient information in a Research Object to be able to repeat the study, perhaps years later.
Reproducible. A third party can start with the same inputs and methods and see if a prior result can be confirmed.
Replayable. Studies might involve single investigations that happen in milliseconds or protracted processes that take years.
Referenceable. If research objects are to augment or replace traditional publication methods, then they must be referenceable or citeable.
Revealable. Third parties must be able to audit the steps performed in the research in order to be convinced of the validity of results.
Respectful. Explicit representations of the provenance, lineage and flow of intellectual property.

“Replacing the Paper: The Twelve Rs of the e-Research Record” on http://blogs.nature.com/eresearch/
Forms of decay
Workflow Decay
• Service decay
     • Flux/decay/unavailability
• Data decay
     • Formats/ids/standards
• Infrastructure decay
     • platform/resources


Experiment Decay
•   Methodological changes
•   New technologies
•   New resources/components
•   New data
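
The three forms of workflow decay listed above can be sketched as a small data structure. This is only an illustration of the taxonomy, not Wf4Ever code: the `Dependency` record, the `decay_report` function and the example dependency names are all hypothetical, and the `available` flag is assumed to come from some external check that is not shown.

```python
from dataclasses import dataclass
from enum import Enum


class Decay(Enum):
    """Forms of workflow decay named on the slide."""
    SERVICE = "service decay"
    DATA = "data decay"
    INFRASTRUCTURE = "infrastructure decay"


@dataclass(frozen=True)
class Dependency:
    """One thing a workflow relies on (hypothetical record, not a Wf4Ever type)."""
    name: str
    kind: Decay        # which form of decay applies if this dependency breaks
    available: bool    # result of an external check, supplied by the caller


def decay_report(dependencies):
    """Group the names of unavailable dependencies by form of decay."""
    report = {}
    for dep in dependencies:
        if not dep.available:
            report.setdefault(dep.kind, []).append(dep.name)
    return report


# Made-up example dependencies, for illustration only.
deps = [
    Dependency("sequence-search web service", Decay.SERVICE, available=False),
    Dependency("input dataset identifiers", Decay.DATA, available=True),
    Dependency("execution platform", Decay.INFRASTRUCTURE, available=False),
]
report = decay_report(deps)
for kind, names in report.items():
    print(f"{kind.value}: {', '.join(names)}")
```

Grouping by form of decay keeps the report aligned with the slide's categories, so a broken web service and a missing platform are reported separately.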
Preservation, Conservation, Recreating

Preserving
Archived Record
Fixed Snapshots
Review
Rerun & Replay

Conserving
Active Instrument
Live
Rerun & Reuse
Repair & Restore

Recreating
Archived Record
Active Instrument
Live
Rebuild Recycle Repurpose

Workflow Decay
Decay at different abstraction levels
[Figure: workflow abstraction levels, labelled "Redo" and "Flux" (three times)]
http://www.gridworkflow.org/kwfgrid/gwes/docs/
Research objects




Research Objects as Social Objects




Research Object model core (simplified)
http://purl.org/wf4ever/ro#

[Figure: simplified view of the RO core]
   ro:ResearchObject  ore:aggregates  ro:Resource
   ro:ResearchObject  ore:isDescribedBy  ro:Manifest
   ro:AggregatedAnnotation  ro:annotatesAggregatedResource  ro:Resource
   wfdesc:Workflow

Note: This figure shows a simplified view of the RO core.

   RO specification: http://wf4ever.github.com/ro/
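
As a rough illustration of the core terms named on this slide, the sketch below writes out a tiny research object as plain subject-predicate-object triples and serialises them as N-Triples. It uses only the Python standard library. The `http://example.org/...` identifiers and file names are hypothetical, and treating the workflow file itself as both an `ro:Resource` and a `wfdesc:Workflow` is an assumption made for brevity; the RO specification is the authority on how the terms should be used.

```python
# Namespaces: ORE is the OAI-ORE terms vocabulary; ro and wfdesc are the
# namespaces given on the slides.
ORE = "http://www.openarchives.org/ore/terms/"
RO = "http://purl.org/wf4ever/ro#"
WFDESC = "http://purl.org/wf4ever/wfdesc#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical identifiers, for illustration only.
BASE = "http://example.org/ro/demo/"
ro_uri = BASE
manifest = BASE + "manifest.rdf"
workflow = BASE + "workflow.t2flow"
annotation = BASE + "annotations/1"

triples = [
    (ro_uri, RDF_TYPE, RO + "ResearchObject"),
    (ro_uri, ORE + "isDescribedBy", manifest),
    (manifest, RDF_TYPE, RO + "Manifest"),
    (ro_uri, ORE + "aggregates", workflow),
    (workflow, RDF_TYPE, RO + "Resource"),
    (workflow, RDF_TYPE, WFDESC + "Workflow"),  # assumption, see lead-in
    (ro_uri, ORE + "aggregates", annotation),
    (annotation, RDF_TYPE, RO + "AggregatedAnnotation"),
    (annotation, RO + "annotatesAggregatedResource", workflow),
]


def to_ntriples(triples):
    """Serialise (subject, predicate, object) URI triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


def aggregated(triples, research_object):
    """Return the resources a research object aggregates, in listed order."""
    return [o for s, p, o in triples
            if s == research_object and p == ORE + "aggregates"]


print(to_ntriples(triples))
```

Calling `aggregated(triples, ro_uri)` returns the workflow and the annotation, which mirrors the idea that the research object is an aggregation described by its manifest.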
http://purl.org/wf4ever/ro#
Research Object model core




http://purl.org/wf4ever/wfdesc#
RO model: Workflow Description




http://purl.org/wf4ever/wfprov#
Workflow Provenance (wfprov)




Technical infrastructure


• Models: Semantic Web encoding
    •   Research Object
    •   Annotation
    •   Provenance
    •   Evolution and Versioning
• Services: Web APIs, REST services
    • Foundational, Extension, User
    • APIs, Architecture
• Principles
    • Map into standards
    • Adopt standards
    • Lightweight components
• Ecosystem
    • Command line
    • Portal
    • Third party systems
The Wf4Ever Proposal: Services
[Figure: service layers]
   User Clients
   Extension Services
   Foundation Services
Wf4Ever Reference Implementation
Prototype, Dec 2011

Access & Usage Clients: RO Portal, RO Manager Tool, Dropbox Client (ROBox)
Data Management & Analysis Services: Stability Evaluation, Completeness Evaluation, Recommender
Storage Services: RO Digital Library
Lifecycle Services: Taverna Workflow Mgmt System
Roadmap
Year 1 (Dec 2010 to Dec 2011)

» Exploration (2011)
   › Problem specification and requirements identification
   › Better understanding of workflow preservation needs
     from the domains (what does it mean to preserve a
     scientific workflow?)
   › Proofs of concept
   › Preliminary models, components, and integrated
     reference implementation
   › Result identification

Roadmap
Year 2 (Dec 2011 to Dec 2012)

» Realization/validation (2012)
   › Validate the models, architectures and software in practice
   › Distributed components with different access/security
     arrangements – forming REST APIs and specifications
   › RO Content Campaign: Generate 1000s of ROs
   › First productization phase: Stable releases of models and
     reference implementation
   › Decay monitoring and notification (why is my workflow no longer
     stable?), reacting to decay, attribution and credit support
     beyond recommendation. Detailed use of provenance
   › Execution and interoperability support (SHIWA integration)
Roadmap
Year 3 (Dec 2012 to Dec 2013)

» Exploitation (2013)
   › Final productization phase
   › Deployment in user environments and systems, enhanced with
     workflow preservation capabilities
   › RO-enabled myExperiment
   › RO-enabled Galaxy
   › RO-enabled Dataverse
   › … and more!
   › Deployment at publishers, e.g. Elsevier, Digital Science,
     GigaScience

Collaborations and impact
»   SHIWA – Sharing Interoperable Workflows
»   Publishers/journals: Elsevier, GigaScience (by BGI)
»   OpenPHACTS (nanopublications)
»   SCAPE (dataset preservation)
»   BioVeL (biodiversity - species preservation!)
»   Dataverse (data repository)
»   Galaxy (workflow system for genomics)
»   GenomeSpace (data integration platform)




Thank you!




                                      Any Questions?

                     http://www.wf4ever-project.org/




This work is licensed under the Creative Commons Attribution 3.0
Unported License. To view a copy of this license, visit
http://creativecommons.org/licenses/by/3.0/ or send a letter to Creative
Commons, 444 Castro Street, Suite 900, Mountain View, California,
94041, USA.

Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 

2012 03-28 Wf4ever, preserving workflows as digital research objects

  • 1. Wf4Ever: Preserving workflows as digital Research Objects. Stian Soiland-Reyes, myGrid, University of Manchester. EGI Community Forum 2012, Workflow Systems workshop, Leibniz Supercomputing Centre, Munich, 2012-03-28
  • 2. My background. Taverna (http://www.taverna.org.uk/): Scientific Workflow Management System; ~85000 downloads; EU projects: SCAPE, BioVeL, HELIO, e-Lico, VPH-SHARE, EGI-INSPiRE…. myExperiment (http://www.myexperiment.org/): Web 3.0 virtual environment, library and social network for workflows; ~5000 registered users; ~2200 workflows; ~21 different systems
  • 3. “A biologist would rather share their toothbrush than their gene name” Mike Ashburner and others Professor in Dept of Genetics, University of Cambridge, UK
  • 4. http://www.myexperiment.org/: “Facebook for Scientists” ...but different to Facebook! A probe into researcher behaviour. A repository of research methods. Open source (BSD) Ruby on Rails app. A social network of people and things. REST and SPARQL, Linked Data. A Social Virtual Research Environment. Influenced BioCatalogue, MethodBox and SysMO-SEEK. myExperiment currently has 5378 members, 292 groups, 2273 workflows, 534 files and 217 packs
  • 5.
  • 6. Workflow Preservation; Research Objects; Provenance; Recommendation; Astronomy and Genomics. http://www.wf4ever-project.org/
  • 7. Wf4Ever Challenges: Preservation of scientific workflows » Scientific workflows enable automation of scientific methods in data-intensive science and encourage best practices to be shared » Workflows need to be preserved for › Reuse, fundamental for incremental scientific development › Method reproducibility, key for credit and publication » Workflow preservation is complex! » Heterogeneous types of information need to be aggregated, including workflows and related resources, forming research objects » Research objects need to be trusted and understandable n years from now » Social aspects need to be addressed in order to support reuse in scientific communities
  • 8. The R.* dimensions. Reusable. The key tenet of Research Objects is to support the sharing and reuse of data, methods and processes. Repurposeable. Reuse may also involve the reuse of constituent parts of the Research Object. Repeatable. There should be sufficient information in a Research Object to be able to repeat the study, perhaps years later. Reproducible. A third party can start with the same inputs and methods and see if a prior result can be confirmed. Replayable. Studies might involve single investigations that happen in milliseconds or protracted processes that take years. Referenceable. If research objects are to augment or replace traditional publication methods, then they must be referenceable or citeable. Revealable. Third parties must be able to audit the steps performed in the research in order to be convinced of the validity of results. Respectful. Explicit representations of the provenance, lineage and flow of intellectual property. Source: “Replacing the Paper: The Twelve Rs of the e-Research Record” on http://blogs.nature.com/eresearch/
  • 9. Wf4Ever: Forms of decay. Workflow Decay • Service decay: flux/decay/unavailability • Data decay: formats/ids/standards • Infrastructure decay: platform/resources. Experiment Decay • Methodological changes • New technologies • New resources/components • New data
  • 10. Preservation, Conservation, Recreating. Preserving: Archived Record; Fixed Snapshots; Review, Rerun & Replay. Conserving: Active Instrument; Live; Rerun & Reuse, Repair & Restore. Recreating: Archived Record, Active Instrument; Live; Rebuild, Recycle, Repurpose
  • 11. Workflow Decay: Decay at different abstraction levels [figure labels: Redo, Flux] http://www.gridworkflow.org/kwfgrid/gwes/docs/
  • 13. Research Objects as Social Objects
  • 14. Research Object model core (simplified), namespace http://purl.org/wf4ever/ro#. Figure labels, classes: ro:ResearchObject, ro:Resource, ro:Manifest, wfdesc:Workflow, ro:AggregatedAnnotation; properties: ore:aggregates, ore:isDescribedBy, ro:annotatesAggregatedResource. Note: This figure shows a simplified view of the RO core. RO specification: http://wf4ever.github.com/ro/
  • 18. Technical infrastructure • Models: Semantic Web Encoding • Research Object • Annotation • Provenance • Evolution and Versioning • Services: Web APIs, REST services • Foundational, Extension, User • APIs, Architecture • Principles • Map into standards • Adopt standards • Lightweight components • Ecosystem • Command line • Portal • Third party systems
  • 19. The Wf4Ever Proposal: Services. User Clients; Extension Services; Foundation Services
  • 20. Wf4Ever Reference Implementation Prototype, Dec 2011. Access & Usage Clients: Dropbox Client, RO Portal, RO Manager Tool, ROBox. Data Management & Analysis Services: Stability Evaluation, Completeness Evaluation, Recommender. Storage Services; Lifecycle Services; Taverna Workflow Mgmt System; RO Digital Library
  • 21. Roadmap Year 1 (Dec 2010 to Dec 2011) » Exploration (2011) › Problem specification and requirements identification › Better understanding of workflow preservation needs from the domains (what does it mean to preserve a scientific workflow?) › Proofs of concepts › Preliminary models, components, and integrated reference implementation › Result identification
  • 22. Roadmap Year 2 (Dec 2011 to Dec 2012) » Realization/validation (2012) › Validate the models, architectures and software in practice › Distributed components with different access/security arrangements, forming REST APIs and specifications › RO Content Campaign: Generate 1000s of ROs › First productization phase: Stable releases of models and reference implementation › Decay monitoring and notification (why my wf is no longer stable), reacting to decay, attribution and credit support beyond recommendation. Detailed use of provenance › Execution and interoperability support (SHIWA integration)
  • 23. Roadmap Year 3 (Dec 2012 to Dec 2013) » Exploitation (2013) › Final productization phase › Deployment in user environments and systems, enhanced with workflow preservation capabilities › RO-enabled myExperiment › RO-enabled Galaxy › RO-enabled Dataverse › … and more! › Deployment in publishers e.g. Elsevier, Digital Science, GigaScience
  • 24. Collaborations and impact » SHIWA – Sharing Interoperable Workflows » Publishers/journals: Elsevier, GigaScience (by BGI) » OpenPHACTS (nanopublications) » SCAPE (dataset preservation) » BioVeL (biodiversity - species preservation!) » Dataverse (data repository) » Galaxy (workflow system for genomics) » GenomeSpace (data integration platform)
  • 25. Thank you! Any Questions? http://www.wf4ever-project.org/ This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/ or send a letter to Creative Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.
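
The simplified Research Object core on slide 14 can be read as a handful of RDF statements: a research object is described by a manifest, aggregates resources such as a workflow, and aggregates annotations that point at those resources. The following is a minimal sketch of that reading in plain Python, not the Wf4Ever reference implementation. Only the ro: namespace is given on the slide; the ore: and wfdesc: namespace URIs, the example base URI, the `.ro/manifest.rdf` and `.ro/annotations/1` paths, and the function names are assumptions made for illustration.

```python
# Namespaces. RO comes from slide 14; ORE and WFDESC are assumed URIs.
RO = "http://purl.org/wf4ever/ro#"
ORE = "http://www.openarchives.org/ore/terms/"
WFDESC = "http://purl.org/wf4ever/wfdesc#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def build_research_object(base, workflow_path):
    """Return (subject, predicate, object) URI triples for a research
    object that aggregates one workflow and one annotation about it."""
    ro = base
    manifest = base + ".ro/manifest.rdf"      # hypothetical location
    annotation = base + ".ro/annotations/1"   # hypothetical location
    workflow = base + workflow_path
    return [
        (ro, RDF_TYPE, RO + "ResearchObject"),
        (ro, ORE + "isDescribedBy", manifest),
        (manifest, RDF_TYPE, RO + "Manifest"),
        # The workflow is an aggregated resource of the research object.
        (ro, ORE + "aggregates", workflow),
        (workflow, RDF_TYPE, RO + "Resource"),
        (workflow, RDF_TYPE, WFDESC + "Workflow"),
        # The annotation is aggregated too, and targets the workflow.
        (ro, ORE + "aggregates", annotation),
        (annotation, RDF_TYPE, RO + "AggregatedAnnotation"),
        (annotation, RO + "annotatesAggregatedResource", workflow),
    ]


def to_ntriples(triples):
    """Serialise URI-only triples as N-Triples lines."""
    return ["<%s> <%s> <%s> ." % t for t in triples]


if __name__ == "__main__":
    example = build_research_object("http://example.org/ro/1/", "workflow.t2flow")
    print("\n".join(to_ntriples(example)))
```

Keeping the triples as plain tuples avoids any RDF library dependency; in practice the same statements would more likely be produced with an RDF toolkit and checked against the RO specification linked on slide 14.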