The Research Object Initiative: Frameworks and Use Cases

Presentation to the NIH BioCADDIE Big Data to Knowledge (BD2K) Data Discovery Index webinar series, 11 June 2015.

Research Objects

  1. 1. The Research Object Initiative: Frameworks and Use Cases Professor Carole Goble The University of Manchester, UK carole.goble@manchester.ac.uk NIH BD2K BioCADDIE webinar, 11 June 2015
  2. 2. From Manuscripts to Research Objects
     “An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment, [the complete data] and the complete set of instructions which generated the figures.” David Donoho, “Wavelab and Reproducible Research,” 1995
     Datasets, data collections; standard operating procedures; software, algorithms; configurations; tools and apps, services; codes, code libraries; workflows, scripts; system software; infrastructure; compilers, hardware
  3. 3. Scattered Assets
  4. 4. Concept
  5. 5. Drivers for Research Objects (1)
     • Computational Workflows / Scripts
       – Multi-step, nested
       – Data, executable codes, services (remote and local), libraries
       – Preservation, repair
       – Reproducibility
     • Systems Biology
       – Models, data (construction, validation, predicted), SOPs, samples
       – Structured around Investigations, Studies, Assays
       – Exchange
       – Reproducibility
  6. 6. Drivers for Research Objects (2)
     • Computational Workflows Commons
       – Projects and individuals
       – myExperiment.org
     • Systems Biology Commons
       – Modellers and experimentalists
       – Projects and Programs
       – Catalogue of research assets
       – Fairdomhub.org
       – Fair-dom.org
       – Seek4science.org
  7. 7. Workflow Commons: "Mapping present and future predicted distribution patterns for a meso-grazer guild in the Baltic Sea" by Sonja Leidenberger et al.
  8. 8. https://doi.org/10.15490/seek.1.investigation.56
  9. 9. [Snoep, 2015] https://doi.org/10.15490/seek.1.investigation.56; Penkler et al. (2015) FEBS J 282:1481-1511.
  10. 10. https://sems.uni-rostock.de/reproducible-and-citable-data-and-models/
  11. 11. [Diagram labels] Commons: local repositories, LIMS, public repositories, central repositories, funding agencies, catalogue, search, index, tools, research infrastructures, execute, companion site, CRIS, results, gateway, catalogue, standards, metadata, consumers, producers, publishers, haven, platform.
  12. 12. Research Objects 1. Multi-various, citable research products
  13. 13. Research Objects 2. Compound, nested, scattered, yet interconnected research products, structured investigations
  14. 14. Research Objects 3. Preserved, portable research products; inter-platform exchange; reproducibility. Pop-up projects, dynamic groups, internal / external visibility, Commons.
  15. 15. Research Objects 4. Active research products: evolving, executable.
      • Fork
      • Merge
      • Version
      • Cite
      • Snapshot
      • Live
      [Martin Scharm] Haus et al., BMC Systems Biology, 2011, 5:10: Solvent production by Clostridium acetobutylicum
  16. 16. Bigger on the inside than the outside. TARDIS: Time and Relative Dimension in Space. Content: closed / open, embed / refer, fixed / fluid, local / alien. Cite? Resolve? Steward? Scholarship contributions: multi-type, multi-steward, multi-site, multi-author; spanning research, researchers, platforms, time.
  18. 18. Knowledge Turning. Research Object: means, ends, driver. Interpret; Commons; FAIR research products; reproducibility; interpretation; comparison; preservation; portability; release; active research. Goble, De Roure, Bechhofer, Accelerating Knowledge Turns, I3CK, 2013. http://ccrtypewriter.blogspot.co.uk/
  19. 19. Framework
  20. 20. Research Objects: multi-various products, platforms, resources. First class citizens: id, manage, credit, track, profile, focus. A framework to bundle, port and link (scattered) resources and related experiments. Metadata objects that carry research context. Units of exchange. http://www.researchobject.org
  21. 21. The Research Object Framework Desiderata Technology Independent. The least possible. The simplest feasible. Graceful degradation.
  22. 22. Research Object Framework: Principles & Conventions; API specification; Metadata formats; RO Core model using standards (Adobe UCF, ORE, ODF, OADM/PROV); Annotation profiles as progressive extensions.
  23. 23. Research Object Framework: Principles & Conventions; API specification; Metadata formats; RO Core model using standards (Adobe UCF, ORE, ODF, OADM/PROV); Annotation profiles as progressive extensions; Platform Profiles using legacy & commodity platforms (commodity, native); Policies, Services, Tools, Lifecycle, Stewardship, Training …
  24. 24. RO Core Model: Identity, Aggregation, Interpretation. The objects and how they are linked together (manifest). Refer to aggregations and their contents. Describe group & constituents. External ids, local files. Attribution: who, when, where, why? Metadata description.
  25. 25. RO Core Model. Aggregation: aggregations, resource maps, proxies (OAI-ORE, manifest). Annotation: first class and stand-off (W3C OADM). Identity: persistence and resolution, names, citation (DOIs, URIs, Handles, ORCID). Point of extendability.
  26. 26. RO Core and Platform Profiles. Identity: DOIs, URIs, Handles, ORCID, Data Citation Implementation. Aggregation: OAI-ORE. Annotation: W3C OADM.
  27. 27. RO Model Ontology http://w3id.org/ro/ Defines core concepts of research objects, identity, aggregation, annotation. Used in the manifest
  28. 28. Metadata Objects: the Manifest
      The Container: the manifest describes the content and the relationships between the content
      • RO metadata: id, title, creator, status, …
      • Aggregates: list of ids/links to resources
      • Annotations: list of annotations about resources
      The Objects
      • Remote, through links
      • Locally, embedded
  29. 29. Manifest – remote and local on my machine
  30. 30. Container Machinery. The Container Manifest: content and the relationships between the content. Packaging: ZIP files, Docker images… Catalogues & Commons: FAIRDOM SEEK, Farr Commons CKAN, myExperiment…
  31. 31. Export, archive, publish and transfer ROs.
      • File format for storage and distribution of ROs as a ZIP archive
      • Includes an RO’s manifest, annotations and some or all of its aggregated resources
      • Basis for more specific file formats
      • Backwards compatible: it's ZIP
      • Programmatic access: JSON and JSON-LD manifest, API
      https://researchobject.github.io/specifications/bundle/ https://w3id.org/bundle/ doi:10.5281/zenodo.10440
  32. 32. https://researchobject.github.io/specifications/bundle/ https://w3id.org/bundle/ doi:10.5281/zenodo.10440
  33. 33. RO Lifecycles, Resolution, Citation
      • Defend it (snapshot)
      • Locate it (most recent)
      • Reuse it (a version, a component)
      • Credit it (contributory authorship)
      • Cross link it (connections)
      PURL
      http://www.cnri.reston.va.us/papers/OverviewDigitalObjectArchitecture.pdf
  34. 34. Annotation Profiles: Checklists, Versioning, Provenance, Dependencies. Depth: how deeply described. Coverage: how much is covered. Progression levels. Semantic Framework. PID, the Manifest, the Object Metadata. PAV, VoID, VIVO-ISF, PAV, Mim Ontology, Puppet, Makefile. Less detail, more stakeholders.
  35. 35. Checklists
      • Gamble M, Goble CA, Klyne G, Zhao J. Mim: A minimum information model vocabulary and framework for scientific linked data. IEEE 8th Intl Conf on eScience, pp. 1-8.
      • Zhao J, Klyne G, Gamble M, Goble CA. A Checklist-Based Approach for Quality Assessment of Scientific Information. Proc. Third Linked Science Workshop 2013, co-located with ISWC 2013.
  36. 36. Checklist Annotation Profiles. Library, Publishers, Experiments, Type specific. PID, Citation, NISO-JATS, Dublin Core, ISA, MIAME, Wf-Desc, OBI, SBML, SED-ML, JERM, EXPO, Wf-prov. Gamble M, Goble CA, Klyne G, Zhao J. Mim: A minimum information model vocabulary and framework for scientific linked data. IEEE 8th Intl Conf on eScience, pp. 1-8.
  37. 37. Use Cases
  38. 38. Use case
      • SEEK Commons for Systems Biology
      • Natively RO
      • Export/Import RO bundles
  39. 39. SEEK metadata framework: the Just Enough Results Model. Links studies and links assets. Describes common elements and relationships between things produced and used in experiments. Structured descriptions for consistency and comparison.
  40. 40. Snapshots & Living: living ROs; snapshot RO of an investigation and all its parts.
  41. 41. Community Sys Bio Models: metadata + packaging. Combine with RO: standardised metadata & API.
      • Bergmann, Rodriguez, Le Novère. COMBINE archive specification. <http://identifiers.org/combine.specifications/omex.version-1> (2014)
      • Bergmann et al. COMBINE archive and OMEX format: one file to share all information to reproduce a modeling project, BMC Bioinformatics 2014, 15:369
      http://co.mbine.org/documents/archive https://github.com/stain/ro-combine-archive doi:10.5281/zenodo.10439
  42. 42. Bridge from Research to FAIR publishing Deposit Run
  43. 43. RO Unzip
  44. 44. RO Query
  45. 45. Use Case: Taverna Workflows
  46. 46. Aggregating in Research Object: ZIP folder structure (RO Bundle), https://w3id.org/bundle
      • mimetype: application/vnd.wf4ever.robundle+zip
      • .ro/manifest.json
      • Workflow results and files: workflowrun.prov.ttl (RDF), outputA.txt, outputC.jpg, outputB/, intermediates/, 1.txt, 2.txt, 3.txt, de/def2e58b-50e2-4949-9980-fd310166621a.txt, inputA.txt
      • Diagram labels: workflow, URI references, attribution, execution environment
  47. 47. Workflow Specification: example data and config; components; plug-ins, versions. Workflow System: software package. Workflow Runs: data and configs; provenance logs. Study. Asset specific Commons; Personal Notebook; Community Registry; General Publishing Repository.
  48. 48. Use case: ATLAS Collider Data Analytics. Portable, lightweight application runtime and packaging tool. Image: ATLAS and CMS detector data (Charles Vardeman, Da Huo); all data and files of the execution + instructions. Convert bundle manifest: relate files and layers, add provenance and annotations, link in other content.
  49. 49. Use case: The Farr Institute Commons. Safe use of patient and research data for medical research; clinical study cohorts. Research Objects: scripts, data, samples…; different e-Labs, legacy data. http://www.farrinstitute.org/
  50. 50. Use case: The Farr Institute Commons. The open source data portal software: exchange, catalogue, deposit.
  52. 52. Uses “code as a research object” functionality
  53. 53. Baking RO Infrastructure: make, import, export, inspect, render, version, process, check, …
      • Libraries
        – Create and inspect RO Bundles and their metadata
        – Java, Ruby and Python
      • User tools
        – RO Manager: command line tool to make ROs
        – ROHUB: a prototype web app to manage ROs
      • Platforms
        – SEEK
        – CKAN plug-in to build, import and export ROs
      http://www.researchobject.org/specifications/
  54. 54. NIH BD2K + Research Objects Metadata Profiles RO Model API Community IDs* RO Model Manifest Profile Implementation Profiles *BioMedBridges 10 Rules for Identifiers.
  55. 55. Summary
      FAIR Research Objects:
      • Concept, model, framework, use cases
      • Lightweight, incremental
      Challenges
      • Multi-stewarding and lifecycles (OAIS)
      • Policy, governance
      Partnerships
      • Figshare, Oxford Bodleian, Farr Institute
      • BioCADDIE?
  56. 56. Acknowledgements & Links
      People: Stian Soiland-Reyes, Matt Gamble, Rob Haines, Sean Bechhofer, Norman Morrison, Phil Crouch, Finn Bacall, Stuart Owen, Carole Goble, Khalid Belhajjame, Graham Klyne, Jun Zhao, Daniel Garijo, Oscar Corcho, Esteban García Cuesta, Raul Palma
      Institutions: University of Manchester, University of Oxford, Lancaster University, UPM, iSOCO, PSNC, Paris 6
      Links: http://researchobject.org http://fair-dom.org http://www.seek4science.org http://www.farrinstitute.org http://www.wf4ever-project.org http://myexperiment.org
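
The RO Bundle layout described on slides 28 to 32 and 46 (a ZIP archive holding a `mimetype` entry and a `.ro/manifest.json` that lists aggregates and annotations) can be sketched in Python as follows. This is a minimal sketch and not the reference implementation. The helper names `build_manifest`, `write_bundle` and `read_bundle` are hypothetical. The manifest key names (`aggregates`, `annotations`, `uri`, `about`, `content`, `createdBy`) and the context URL are assumptions modelled on the RO Bundle specification and should be checked against https://w3id.org/bundle/. Writing `mimetype` first and uncompressed is likewise an assumption.

```python
import json
import os
import tempfile
import zipfile

# Media type shown on slide 46 for the RO Bundle "mimetype" entry.
MIMETYPE = "application/vnd.wf4ever.robundle+zip"
MANIFEST_PATH = ".ro/manifest.json"


def build_manifest(local_names, remote_uris, creator_name, annotations=()):
    """Build a manifest dict: RO metadata, aggregates and annotations.

    Key names are assumptions modelled on the RO Bundle specification.
    """
    aggregates = [{"uri": "/" + name} for name in local_names]  # embedded files
    aggregates += [{"uri": uri} for uri in remote_uris]         # linked resources
    return {
        "@context": ["https://w3id.org/bundle/context"],  # assumed context URL
        "id": "/",
        "createdBy": {"name": creator_name},
        "aggregates": aggregates,
        "annotations": list(annotations),
    }


def write_bundle(path, files, remote_uris, creator_name, annotations=()):
    """Write `files` (a dict of name -> bytes) plus a manifest into a ZIP."""
    manifest = build_manifest(sorted(files), remote_uris, creator_name, annotations)
    with zipfile.ZipFile(path, "w") as bundle:
        # Assumption: the media type entry comes first and is stored uncompressed.
        bundle.writestr("mimetype", MIMETYPE, compress_type=zipfile.ZIP_STORED)
        for name in sorted(files):
            bundle.writestr(name, files[name], compress_type=zipfile.ZIP_DEFLATED)
        bundle.writestr(MANIFEST_PATH, json.dumps(manifest, indent=2),
                        compress_type=zipfile.ZIP_DEFLATED)
    return manifest


def read_bundle(path):
    """Check the media type entry and return the parsed manifest."""
    with zipfile.ZipFile(path) as bundle:
        if bundle.read("mimetype").decode("ascii").strip() != MIMETYPE:
            raise ValueError("not an RO Bundle: unexpected mimetype entry")
        return json.loads(bundle.read(MANIFEST_PATH).decode("utf-8"))


if __name__ == "__main__":
    # File names borrowed from slide 46; the contents are placeholders.
    example_files = {
        "inputA.txt": b"example input\n",
        "outputA.txt": b"example output\n",
        "workflowrun.prov.ttl": b"# provenance would go here\n",
    }
    # Illustrative annotation: the provenance file describes the output file.
    example_annotations = [{"about": "/outputA.txt", "content": "/workflowrun.prov.ttl"}]
    with tempfile.TemporaryDirectory() as workdir:
        bundle_path = os.path.join(workdir, "example.bundle.zip")
        write_bundle(bundle_path, example_files,
                     ["https://doi.org/10.15490/seek.1.investigation.56"],
                     "Example Author", example_annotations)
        for entry in read_bundle(bundle_path)["aggregates"]:
            print(entry["uri"])
```

Run as a script, this writes a three-file bundle to a temporary directory and prints the URI of each aggregated resource, local and remote. Because the container is an ordinary ZIP and the manifest is plain JSON, a tool that knows nothing about Research Objects can still unpack and read it, which fits the "backwards compatible" point on slide 31.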
