
How are we Faring with FAIR? (and what FAIR is not)

Keynote presented at the workshop FAIRe Data Infrastructures, 15 October 2020

Remarkably, it was only in 2016 that the ‘FAIR Guiding Principles for scientific data management and stewardship’ appeared in Scientific Data. The paper was intended to launch a dialogue within the research and policy communities: to start a journey to wider accessibility and reusability of data and to prepare for automation-readiness by supporting findability, accessibility, interoperability and reusability for machines. Many of the authors (including myself) came from biomedical and associated communities. The paper succeeded in its aim, at least at the policy, enterprise and professional data infrastructure level. Whether FAIR has had an impact on the researcher at the bench or bedside is open to doubt. It certainly inspired a great deal of activity, many projects and a lot of positioning of interests, and it raised awareness. COVID has injected impetus and urgency into the FAIR cause (good) and also highlighted its politicisation (not so good).
In this talk I’ll make some personal reflections on how we are faring with FAIR: as one of the authors of the original principles; as a participant in many current FAIR initiatives (particularly in the biomedical sector and for research objects other than data); and as a veteran of FAIR before we had the principles.


  1. 1. How are we Faring with FAIR (and what FAIR is not) Carole Goble The University of Manchester FAIRDOM ELIXIR, EOSC-Life, IBISBA, BioExcel CoE, FAIRplus The views expressed in this talk are my own Workshop: FAIRe Data Infrastructures, 15 October 2020
  2. 2. Data discovery and reuse at scale through good data management 2016 A set of PRINCIPLES to enhance the value of all digital resources and their reuse by PEOPLE and by MACHINES Scientific Data 3, 160018 (2016) doi:10.1038/sdata.2016.18 [Credit: Susanna Sansone]
  3. 3. Branding a trend, stimulating a movement … 2014 2016 2015 Open Science Data Sharing Recognition & Credit Reproducibility Data-driven Science Automation, AI
  4. 4. The compulsory COVID-19 reference “COVID is a good example .. there must be loads of legacy data. We’re desperately trying to go back and look at what we knew from SARS 10 years ago” – Pharma manager, FAIRplus project clinical-rda-covid19-1
  5. 5. The compulsory COVID-19 reference +ve • Data sharing boost – • Impossible becomes normal • Data infrastructure investments • Mobilising rapid response -ve • Political, technical and territorial issues • Licensing, access to datasets, quality … • Short-term vs long term sustainability • Collection and governance bottlenecks
  6. 6. The compulsory what are the FAIR principles slide … in a break out box, without explanation or justification. Aspirational, not a standard. Relaunch a dialogue within the research and policy communities. Reboot a journey to wider accessibility and reusability of data. Prepare the community for automation-readiness by supporting FAIR for machines. In the paper… 15 overlapping and ambiguous …. Jacobsen et al FAIR Principles: Interpretations and Implementation Considerations, J Data Intelligence (2020) Mons et al Cloudy, increasingly FAIR; Revisiting the FAIR Data guiding principles for the European Open Science Cloud. Information Services & Use. 37. 1-8. 10.3233/ISU-170824 (2017)
  7. 7. FAIR Principles in spirit identifiers, metadata, availability, standards [adapted from Susanna Sansone] Findable Accessible Interoperable Reusable Globally unique, resolvable, and persistent identifiers ▪ To retrieve and connect data Community defined descriptive metadata that is catalogued / searchable ▪ To enhance discoverability and reusability Common terminologies and standards ▪ To use the same terms and they mean the same thing Detailed provenance ▪ To contextualize the data and facilitate reproducibility Terms of access ▪ Open as possible, closed as necessary Terms of use ▪ Clear licences, ideally to enable innovation and reuse Automation
  8. 8. FAIR Services in practice identifiers, metadata, availability, standards Findable Accessible Interoperable Reusable Persistent Identifiers Metadata Services Access Control Repositories & registries Data applications “the internet of data and services”
  9. 9. Not one size fits all FAIR is a set of guiding principles that provide for a contract of expectation between data providers and users. A continuum of features, attributes and behaviours, via many different implementations for different use cases. Communities will need to develop their own FAIR profile: • for their portfolios of data, processes, governance, policies, assessment A limited pan-discipline profile.
  10. 10. Funders Public Health Educators WHY? WHO? Data integration Reannotation
  11. 11. Credit to: A global movement rather quickly…a FAIR frenzy! Many of these projects will speak today. Researcher, clinician confusion. Community coordination cacophony
  12. 12. Movement Start up Phase: Picking apart the Principles, Inventing Indicators, Assessment and Maturity Frameworks 1. (Re)define the principles and what they mean 2. Measure the “FAIRness level” of data The target, before & after levels of data FAIRness in various “FAIRification” processes 3. Measure an organisation’s capability & performance for FAIR data generation & management Support strategic investment decisions, cost/benefit analysis, processes & monitoring, capacity building, change management for FAIR by Design DOI: 10.15497/RDA0050 Dataset maturity model Data Management Infrastructure maturity model
  13. 13. What were the principles about? Contract Compliance Certification? Judgement Endorsement Regulation Trusted Repositories Commitment of a community of providers Federation of FAIR data, registries and services Comparison, Monitoring, Review, Quality Assessment Expectation setting of a data provider Self-evaluation Awareness, Reporting Community respectful Context aware Health data - Regulation
  14. 14. What were the principles about? Contract Compliance Certification? EOSC Strategic Research & Innovation Agenda consultation 2020: metrics & certification least popular action area FAIRware The Tyranny of Metrics
  15. 15. FAIR is a Spectrum Not all data are equal, not all will be worth it Spectrum of FAIR indicators Different levels of maturity and importance to different stakeholders and communities Communities define levels, depths, coverage [Barend Mons] FAIRify • just in case • just in time • just enough Dataset portfolio
  16. 16. FAIR is not Not Fuzzy “enhancing the ability of machines to automatically find and use data or any digital object, and support its reuse by individuals” INCF Statement Not Free Not Fast Not Simple Needs experts, stewards, infrastructure, processes, maintenance … “FAIR is non-trivial, and domain specific at anything other than the most superficial level” - Mark Wilkinson From high effort high gain to low effort light gain All require consensus, process change and maintenance. • Mons et al Cloudy, increasingly FAIR; Revisiting the FAIR Data guiding principles for the European Open Science Cloud. Information Services & Use. 37. 1-8. 10.3233/ISU-170824. • Dunning et al Are the FAIR Data Principles fair? IDCC17 Not One Approach Not about turning everything into RDF
  17. 17. Coverage and Implementation Choices Vary Findable Accessible Interoperable Reusable Shallow wide, low cost, loose federation restricted value Deep narrow, tight federation harmonisation high cost, high value three-point-framework-for-fairification/
  18. 18. FAIR by Design At the start of a collection, built in throughout the life cycle change management, capacity building FAIRifying Retrospectively Legacy datasets, build a cohort, cost benefit and FAIR readiness over a collection of datasets
  19. 19. Other FAIR Variants FA(I)R Interoperability is the hardest and most costly to define, implement & maintain Interoperability is usually for a purpose not “just in case” FAIR is not about harmonising all metadata to one schema FAIR+R Reproducibility is not the same as Reusability FAIR for all digital objects – software, workflows, SOPs, models, containers, training materials … Depends on availability and metadata Containers FAIR++ Business and change analysis. Cost Benefit Analysis. Scientific / Business Value Quality control Impact Process maturity Sustainability EC Report: Turning FAIR into Reality, 2018 scientists-need-to-know-about-fair-data, 2019 47% respondents needed greater clarification
  20. 20. Step back… Why do we need FAIR? Knowledge Turning, Information Flow Josh Sommer, Chordoma Foundation, 2011 Promote information flow • groups and disciplines • organisational boundaries at all levels • technical infrastructure • enable federation Biomedical flow • needs to control the information flow – Ethical & governance frameworks, GDPR, consent • federation feasibility varies Fragmented and independent resources, infrastructures, governance Community enclaves. Churn which leads to knowledge loss. Scattered Fragmentation and knowledge churn
  21. 21. FAIR is not synonymous with Open respect authority and governance frameworks
  22. 22. Federated System of Catalogues and Repos federation means common agreements and compliance respecting accountability and responsibility • Connected by PIDs • Moving metadata around • Common vocabularies • Common cataloguing data elements • Term cross-walks and mappings • Shared API standards.
  23. 23. FAIR Profile for Biomedical Community Findable Accessible Interoperable Reusable Awareness of data Thin metadata sparingly shared Data visiting Not open but access controlled Limited exchange under strict governance Authentication and authorisation protocols Data visiting Ethics, consent, privacy preservation Compliance, Governance Standards and Regulation are different things Federated analysis Combining clinical, research and public health data. Different scales, collected for different purposes Standards for Usage Access • Beacon API for genomic variants • Data Use Ontology – usage restrictions • GA4GH Passports – access policies Community standards Trusted Research Environments Community implementation profiles GO-FAIR implementation networks federated
  24. 24. FAIR Data Management for Projects FAIR by Design because not everything can be fixed at the end Platform to build Project Product Hubs Projects to collaborate and get their results organised and retained • Metadata cataloguing: collection & organisation • PIDs, ontologies, metadata for machines • Controlled sharing and access • Plug into ecosystem of other registries and repositories • Data submission brokering to community repositories • Credit to contributors FAIR itself • FAIR Services are not necessarily FAIR themselves (e.g. COVID Data Portal)
  25. 25. Commons, Separated Groups
  26. 26. Investigation Study Analysis Data Model SOP Assay Metadata Provenance Standards Federated Catalogue, Integrated View PIDs
  27. 27. Curating FAIR models
  28. 28. Mixture of shared and private objects organised by metadata
  29. 29. e.g. (Pillar III) in-house data in-house data All LiSyM Patient-related clinical data Aggregated data API External Tools API LiSyM: German Liver Systems Medicine Network FAIR but never Open [Wolfgang Mueller, HITS] Share table structure Share common code Share summaries
  30. 30. Embed into Practices National Data Infrastructure Authorised Access Secure data Spreadsheets
  31. 31. FAIR along the Pipeline Understanding the pipeline, moving metadata across resources FAIR stewardship effort and pipeline design ELIXIR RDMToolkit
  32. 32. The FAIR data infrastructure needed …. tools, services, registries & catalogues, repositories… Federation of fair repositories & registries with mixed authority models FAIR services – metadata services, ontology servers, search engines, PID services, validators, integrators, annotators, assessors, brokers … Interfaces - shared APIs, common terms, cross-walks. FAIR digital objects. FAIRsFAIR confusagram, FAIR Ecosystem Components: Vision, V2.0 10.5281/zenodo.3734273 (missing the processing of data…)
  33. 33. More than indicators, metrics and tech Infrastructure “digital technologies (hardware, software), resources (data, services, digital libraries, standards), comms (protocols, access rights, networks), people and organisational structures” Data stewardship Software Engs Professionalisation POSSIBLE Processes Organisational & Cultural change NORMATIVE Incentives Cost Benefit REWARDING Governance Regulatory Frameworks ACCEPTABLE Policy REQUIRED Education Training UNDERSTOOD Sustainability -TRUSTED
  34. 34. How is FAIR faring? Maturing… • Shifted the conversation • Concept & mobilisation frenzy • Rush to measure & certify • Community specific • FAIR not FOIR It’s now the phase to embed, sustain and yield benefits for users A marathon journey not a sprint It is not simple, but it is no longer optional
  35. 35. Acknowledgements Special thanks to • Susanna Sansone (University of Oxford, OERC) • Frederik Coppens (VIB) • Barend Mons (GO-FAIR) • Oya Deniz Beyan (Fraunhofer) • Ibrahim Emam (Imperial College) • FAIRDOM, GO-FAIR, FAIRplus and ELIXIR colleagues FAIRDOM is sponsored by
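
Slide 22 describes a federated system of catalogues held together by PIDs, by moving metadata around (not data), and by term cross-walks onto common cataloguing elements. As a minimal illustrative sketch only, and not the method of any project named in the talk, the idea might look like this in Python. The catalogue names, field names, cross-walk table and the example PID are all hypothetical.

```python
# Hypothetical cross-walks: each catalogue's local field names are mapped
# onto a small set of shared cataloguing elements (assumed names).
CROSSWALKS = {
    "catalogue_a": {"id": "identifier", "name": "title", "licence": "license"},
    "catalogue_b": {"pid": "identifier", "label": "title", "terms_of_use": "license"},
}


def to_common(source, record):
    """Rename a record's fields to the shared elements; unmapped fields stay local."""
    mapping = CROSSWALKS[source]
    return {mapping[key]: value for key, value in record.items() if key in mapping}


def federate(records_by_source):
    """Build one metadata index keyed by PID from several catalogues.

    Only metadata is moved; the data stays with its provider.
    """
    index = {}
    for source, records in records_by_source.items():
        for record in records:
            common = to_common(source, record)
            pid = common.get("identifier")
            if pid is None:
                continue  # without a PID the record cannot be connected
            entry = index.setdefault(pid, {"sources": []})
            entry["sources"].append(source)
            for key, value in common.items():
                entry.setdefault(key, value)  # first catalogue seen wins
    return index


# Hypothetical records: the same object described in two catalogues.
EXAMPLE_PID = "doi:10.1234/example-1"
records = {
    "catalogue_a": [
        {"id": EXAMPLE_PID, "name": "Liver model", "licence": "CC-BY-4.0",
         "internal_note": "not shared"},
    ],
    "catalogue_b": [
        {"pid": EXAMPLE_PID, "label": "Liver model v1"},
        {"label": "record without a PID"},
    ],
}
index = federate(records)
```

In this sketch the cross-walk also acts as a crude sharing policy: anything a catalogue has not mapped to a shared element never leaves it, which is one simple way to read "thin metadata sparingly shared". Real federations would additionally need the common agreements, governance and access control the slides stress.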