https://datascience.nih.gov/news/march-data-sharing-and-reuse-seminar 11 March 2022
Starting in 2023, the US National Institutes of Health (NIH) will require institutes and researchers receiving funding to include a Data Management Plan (DMP) in their grant applications, including making their data publicly available. Similar mandates are already in place in Europe; for example, a DMP is mandatory in Horizon Europe projects involving data.
Policy is one thing - practice is quite another. How do we provide the necessary information, guidance and advice for our bioscientists, researchers, data stewards and project managers? There are numerous repositories and standards. Which is best? What are the challenges at each step of the data lifecycle? How should different types of data be handled? What tools are available? Research Data Management advice is often too general to be useful, and specific information is fragmented and hard to find.
ELIXIR, the pan-national European Research Infrastructure for Life Science data, aims to enable research projects to operate “FAIR data first”. ELIXIR supports researchers across their whole RDM lifecycle, navigating the complexity of a data ecosystem that bridges from local cyberinfrastructures to pan-national archives and across bio-domains.
The ELIXIR RDMkit (https://rdmkit.elixir-europe.org) is a toolkit built by the biosciences community, for the biosciences community, to provide the RDM information they need. It is a framework for advice and best practice for RDM and acts as a hub of RDM information, with links to tool registries, training materials, standards, and databases, and to services that offer deeper knowledge for DMP planning and FAIR-ification practices.
Since its launch in March 2021, over 120 contributors have provided nearly 100 pages of content and links to more than 300 tools. Content covers the data lifecycle and specialized domains in biology, national considerations, and examples of “tool assemblies” developed to support RDM. It has been accessed from over 123 countries, and the top of the access list is … the United States.
The RDMkit is already a recommended resource of the European Commission. The platform, editorial, and contributor methods helped build a specialized sister toolkit for infectious diseases as part of the recently launched BY-COVID project. The toolkit’s platform is the simplest we could manage - built on plain GitHub - and the whole development and contribution approach is tailored to be as lightweight and sustainable as possible.
In this talk, Carole and Frederik will present the RDMkit: aims and context, content, community management, how folks can contribute, and our future plans and potential prospects for trans-Atlantic cooperation.
Data policy must be partnered with data practice. Our researchers need to be the best informed in order to meet these new data management and data sharing mandates.
RDMkit, a Research Data Management Toolkit. Built by the Community for the Community
1. www.elixir-europe.org/converge
ELIXIR-Converge has received funding from the European Union’s
Horizon 2020 Research and Innovation programme under grant
agreement No 871075.
RDMkit,
a Research Data Management Toolkit.
Built by the Community for the Community
Carole Goble, ELIXIR-UK
Frederik Coppens, ELIXIR-BE
NIH Data Sharing and Reuse Seminar, 11 March 2022
https://tinyurl.com/RDMkit-NIH
2. Data Sharing and Reuse
Nature 602, 558-559 (2022)
doi: https://doi.org/10.1038/d41586-022-00402-1
European Commission, Directorate-General for Research and Innovation, Horizon Europe, open
science : early knowledge and data sharing, and open collaboration, Publications Office, 2021,
https://data.europa.eu/doi/10.2777/79699
3. Data Sharing and Reuse
https://www.go-fair.org/fair-principles/
Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3,
160018 (2016). https://doi.org/10.1038/sdata.2016.18
4. Where to start when writing
and executing a
Data Management Plan?
Support is overwhelming – so many
data initiatives!
Support is underwhelming – much
of organisation advice too generic.
Bioscientists
Researchers
Data
Stewards
Investigators,
Lab Managers,
Project Coordinators
Funders,
Policy
Makers
5. ELIXIR Europe
Federated inter-governmental
organisation building a distributed
European Research Infrastructure
for Life Science Data
Supporting FAIR data management
for the diversity of life sciences
across Europe
23 Nodes
250+
organisations
https://elixir-europe.org
7. ELIXIR’s support for FAIR Data
Specific communities
Human Data, Structural Bioinformatics, Rare
Diseases, Plant Sciences, Microbial Biotechnology ...
FAIR services & resources
Registries, standards, ontologies, identifiers,
data management platforms, stewardship tools,
templates.
Trusted repositories
Deposition databases and portals, scalable
curation, sustainability.
FAIR data techniques
Workflows, reproducible processing, transparent
reporting and provenance, FAIR assessment and
evaluation, FAIRification methods.
FAIR policy and activism
FAIR principles, FAIR leadership & partnering at
the global, European and national level.
FAIR expertise and training
Capability frameworks, skills, data managers
network, training portal.
8. FAIR Data
Landscape
enable end to end FAIR
Public General Repositories
Institutions
Institutional
Repositories
My filestore,
my institution’s file store, ELNs
Data submission
and access
pipelines
National Nodes
Specialised and National
RDM Platforms and
Repositories
SARS-CoV-2 Data Hubs
Metadata and
preparation
Trusted Repositories
9. How can we help researchers, data stewards
and project managers navigate and contribute
to this FAIR data repository landscape?
10. Provides guidance on the
Research Data Management support landscape
https://rdmkit.elixir-europe.org
11. ELIXIR-CONVERGE
distributed local support for data management
A web-based toolkit
for the bioscience community
written by the bioscience community
*https://zenodo.org/record/3474630#.YP2jIEDTXZQ
Data Expert network
Training and Capacity Building
Competency Frameworks*
Professionalising Data Stewardship
Training & Training materials
Data brokering pipelines
From project data platforms
to ELIXIR Deposition Databases
Network of people in and
across member states
Best practice & examples
The European COVID-19 Data Platform
SARS-CoV-2 Data Hubs
Federated European Genome-Phenome Archive
SARS-CoV-2 variant surveillance data tracking
services and tools
12. RDM support
throughout the entire
life cycle of projects as
outlined in DMPs
Online focal point for
guidance, information, best
practice, examples
Share
Reuse
Preserve
Analyse
Process
Plan
Collect
https://rdmkit.elixir-europe.org
Context and signpost for FAIR
data resources as a Hub for a
RDM Knowledge Commons
Sensitive Data
Toolbox
13. 7719 users
52927 views
123 countries
Launched March 2021
Every member state + other initiatives contributed
Recommended by European Commission and
national funders
USA is top user!
https://rdmkit.elixir-europe.org
21. Registries, Services & Tools placed in narrative context
training materials,
learning paths,
events
tools
standards,
databases, policies,
organised into
collections
22. Hookup with Specialised, Complementary Resources
More Detailed Expert Guidance, Smart Linking
Detailed recipes for
making FAIR data
FAIR Data
Stewardship Guidance
https://ds-wizard.org/
https://fairplus.github.io/the-fair-cookbook/
23. Data Stewardship Wizard
Predetermined Knowledge Models
drive questionnaires to create
DMPs
https://rdmkit.elixir-europe.org/human_data.html
https://converge.ds-wizard.org/knowledge-models/dsw:root-rdmkit:latest/preview?questionUuid=d5990471-0618-42cd-92cb-bbbfd4f61532
RDMkit provides context to the
Knowledge Models in the DSW
https://ds-wizard.org/
Smart Auto Linking
24. FAIR Cookbook Recipes
Expert recipes for
FAIRifying data according
to the FAIR elements
RDMkit provides context to the Recipes in the
FAIR Cookbook
https://fairplus.github.io/the-fair-cookbook/
25. A Showcase for ELIXIR’s FAIR Services
Data Management Systems for
user projects, supported by Nodes
Public Data Repositories
Standards
Expert Data
Stewardship
decision making
Expert Data
FAIRification
FAIR Services
Sensitive Data
Toolbox
Registries
26. Under the Hood Infrastructure
designing to scale and sustain
The simplest, most widely used and most familiar
platform, sustainable with limited resources
• Freely accessible and open source.
• GitHub, GitHub-pages, Markdown, Jekyll, YAML, Tags.
Static pages limitation overcome with a really
nifty and disciplined content tagging scheme
• Auto-link with registries.
Aligned with community practice
• Inspired by The Turing Way Handbook.
https://github.com/elixir-europe/rdmkit
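To illustrate how a tagging scheme can give static pages cross-links without a database, here is a minimal sketch in Python. It is not the RDMkit's actual implementation (the site is built with Jekyll); the page names, tags and helper functions below are hypothetical, and the front-matter parser handles only flat `key: value` lines and inline `[a, b]` lists.

```python
from collections import defaultdict

# Hypothetical page sources: YAML-style front matter followed by Markdown.
PAGES = {
    "human_data.md": "---\ntitle: Human data\ntags: [sensitive, human data]\n---\nBody text.",
    "sensitive_data.md": "---\ntitle: Sensitive data\ntags: [sensitive]\n---\nBody text.",
    "plant_sciences.md": "---\ntitle: Plant sciences\ntags: [plants]\n---\nBody text.",
}


def read_front_matter(text):
    """Parse flat `key: value` lines between the opening and closing `---`."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, _, value = line.partition(":")
        value = value.strip()
        # Inline lists such as `[a, b]` become Python lists of strings.
        if value.startswith("[") and value.endswith("]"):
            value = [item.strip() for item in value[1:-1].split(",") if item.strip()]
        meta[key.strip()] = value
    return meta


def build_tag_index(pages):
    """Map each tag to the set of pages that carry it."""
    index = defaultdict(set)
    for name, text in pages.items():
        for tag in read_front_matter(text).get("tags", []):
            index[tag].add(name)
    return index


def related_pages(name, pages, index):
    """Return the other pages that share at least one tag with `name`."""
    related = set()
    for tag in read_front_matter(pages[name]).get("tags", []):
        related |= index[tag]
    related.discard(name)
    return sorted(related)


index = build_tag_index(PAGES)
print(related_pages("human_data.md", PAGES, index))
```

Because the index is computed at build time from the tags alone, adding a page with a disciplined set of tags is enough for it to appear in the "related" lists of existing pages.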
27. Open Contribution and Editorial Processes
designing to scale and sustain, catering for different skills
Google
Docs
Quick
submit
28. Open Contribution and Editorial Processes
catering for different skills, designing to scale and sustain
Author Editor Reviewer
Proposed and Commissioned content
• Guides, tools and resources, registry entries
• Links with Resources (Cookbook, DSW …)
Contentathons and focus groups
Open authoring and review.
Versioning & contributor credit.
Governance.
29. Contributors Credit
For all contributions and for every page
Roles and Processes
Start-up -> maintenance
contribution transitions
Attribution on evolving
content
Light touch curation
support & Templates
Code of Conduct
30. The RDMkit Community and Team Science
mobilised using Open Research Practices
Started in March 2020.
Beta Launched March 2021.
Entirely developed by virtual
collaboration.
Probably accelerated
development – democratic,
rapid, open.
31. Start up, Co-creation and Cultivation
Established ELIXIR framework of
national nodes and communities
Established core collaboration of folks
who had been to the bar together
Seed project funding
Adoption of open development and
open science methods
• MVP, user persona, stories & scenarios,
focus groups
• Hackathons & contentathon
• Open content
Healthy Community Checklist,
Anna Maglia, UMass Lowell (extended)
❑ Shared Vision
❑ Leadership
❑ Communication / Trust
❑ Democratic Processes
❑ Clear Roles
❑ Goal driven / Regular Returns
❑ Growth / Vibrancy
❑ Standards and Processes
❑ Discovery Enabling
❑ Necessary Resources
32. Cultivation and Maintenance
Refreshing the Community
• Other Research Infrastructures and Communities
• Other countries – Go USA!
Refreshing the Content
• Processes and roles for review and update
• Mechanisms for adding new resources
• Automated stale content detection
• Reader feedback and correction
Refreshing the People
• Burn-out (Wikipedia 90:9:1 rule)
• Credit & Positive Feedback
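One possible shape for automated stale content detection is sketched below in Python. The slides do not describe how the RDMkit does this, so everything here is an assumption: the page names, the review dates, the 180-day threshold and the `find_stale_pages` helper are all hypothetical.

```python
from datetime import date, timedelta


def find_stale_pages(last_reviewed, today, max_age_days=365):
    """Return pages last reviewed more than `max_age_days` ago, oldest first."""
    cutoff = today - timedelta(days=max_age_days)
    stale = [(page, reviewed) for page, reviewed in last_reviewed.items()
             if reviewed < cutoff]
    return [page for page, _ in sorted(stale, key=lambda item: item[1])]


# Hypothetical review dates; in practice these could come from version
# control history or a date field in each page's front matter.
LAST_REVIEWED = {
    "human_data.md": date(2021, 3, 15),
    "plant_sciences.md": date(2022, 1, 10),
    "data_brokering.md": date(2020, 11, 2),
}

stale = find_stale_pages(LAST_REVIEWED, today=date(2022, 3, 11), max_age_days=180)
print(stale)  # ['data_brokering.md', 'human_data.md']
```

A list like this could feed the review and update roles named above, so that editors are prompted rather than having to audit every page by hand.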
33. Infrastructure & Processes as a Platform
Separation of functionality & branding, content
Underlying ELIXIR toolkit theme
https://github.com/ELIXIR-Belgium/elixir-toolkit-theme
Available on GitHub, GitHub packages & RubyGems (e.g. for
GitLab)
Open
Documents and data: a CC-BY license.
Software available: MIT license.
34. Adoption of the approach: Infectious Disease Toolkit
COVID pandemic exposed lack of (cross-domain) interoperability of data &
services with many developments across institutes, countries, …
Infectious Diseases Toolkit to provide overview & context
Link with RDMkit for data management aspects
BY-COVID is funded by the European Union’s Horizon Europe research and innovation
program under grant agreement number 101046203. https://by-covid.org
35. The Future – Snowballing & Sustainability
• Domains: health, cancer, COVID, rare diseases, workflows, farm animals…
• Communities: biodiversity (ERGA), infectious diseases (BY-COVID), …
• Tool assemblies: data brokering, TREs, RDM blueprint, …
• Roles: project coordinator, clinical researchers …
• National pages: USA?
Content
Expansion
Processes
Refine for
contribution, keeping
data relevant and
ensuring sustainability
Adoption & Embedding
ELIXIR Nodes, Data Expert Group,
Projects, Programmes, Nations
Missions
Develop a
“RDM Knowledge Commons”
RDMkit, FAIR Cookbook, Data
Stewardship Wizard, …
36. Build Partnerships - USA!
• Content expansion
• Health Data …
• Tools, registries …
• Showcase USA FAIR services, know-
how & capacity building
• Dedicated national pages
• For different areas use the
platform and processes
• Projects & Programmes Adoption
• Sustainability partner
37. End to end FAIR Data takes a village*
Policy and Infrastructure not enough
We need to offer Assistance before we make Assessments
build capacity and
skills for researchers,
stewards and data
providers
support researchers to
know, utilise, enable
and demand FAIR RDM
services
pool the expertise of
the community for
the community
* Borgman and Bourne, Why it takes a village to manage and share data (2021), https://arxiv.org/abs/2109.01694
38. Acknowledgments
All the ELIXIR-CONVERGE project
All the RDMkit team & editorial board
All our contributors and partners
For more information
https://rdmkit.elixir-europe.org