FAIR BioData Management
Ulrike Wittig
Heidelberg Institute for Theoretical Studies, Germany
- Where do you store your experimental data?
- What happens to the data when a PhD student leaves the group?
- Are all data complete for a publication?
- Do you make regular backups of your local machine?
- Do you send emails to share data with your colleagues?
- Do you always store email attachments in your local directory?
- Do you store all different versions of a data file together in the same place?
- Which protocol was used for the experiment?
...
Why do you need data management?
Vahan Simonyan, Center for Biologics
Evaluation and Research, Food and
Drug Administration, USA
How well is your experiment
documented?
• Track collection of raw and processed
(secondary) data, models & metadata
• Maintain experimental context
• Organise and link assets
• Choose what to keep and what to ditch
• Report consistently
• Reproducible publications
• Promote standardised metadata practices
• Exchange among colleagues
• How and when to share and publish
• Get and give credit
• Retain and find beyond project
• Integrate with legacy, home grown,
external systems
• Reuse tools and community archives
• Support automation and analytics
workflows. Support curation
CREATING
DATA
PROCESSING
DATA
ANALYSING
DATA
PRESERVING
DATA
ACCESS
TO DATA
RE-USING
DATA
Purpose of Project Data
Management
Organisation
Communication
Dissemination
Partners
Funders
Public
The FAIR Guiding Principles for scientific data management and stewardship
https://www.nature.com/articles/sdata201618 (2016)
FAIR Principles
FAIR ≠ FREE
FAIR Checklists
Making Data Findable (documentation and metadata management)
• What documentation and metadata will accompany the data (assist its
discoverability)? (Details on methodology, definitions, procedures, SOPs,
vocabularies, units, dependencies, etc)
• What information is needed for the data to be read and interpreted in the
future?
• What naming conventions will be used?
• How will you approach versioning your data?
• How will you capture / create this documentation and metadata?
• How do you ensure the completeness of the captured data?
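The naming, versioning and completeness questions above can be made concrete with a small script. The sketch below shows one possible convention, not a FAIRDOM or SEEK requirement: the file-name pattern `<project>_<assay>_<YYYYMMDD>_v<N>.<ext>`, the list of required fields and the function names are all assumptions chosen for illustration.

```python
import hashlib
import re

# Assumed convention: <project>_<assay>_<YYYYMMDD>_v<version>.<extension>
NAME_PATTERN = re.compile(
    r"^(?P<project>[A-Za-z0-9]+)_(?P<assay>[A-Za-z0-9]+)_"
    r"(?P<date>\d{8})_v(?P<version>\d+)\.(?P<ext>[a-z0-9]+)$"
)

# Hypothetical minimum set of fields for a record to count as complete.
REQUIRED_FIELDS = ("project", "assay", "date", "version", "sop", "units")


def parse_name(filename):
    """Return the metadata encoded in a file name, or None if it breaks the convention."""
    match = NAME_PATTERN.match(filename)
    if match is None:
        return None
    record = match.groupdict()
    record["version"] = int(record["version"])
    return record


def describe(filename, content, extra):
    """Build a metadata record: parsed name, checksum and manually supplied fields."""
    record = parse_name(filename)
    if record is None:
        raise ValueError(f"{filename!r} does not follow the naming convention")
    record.update(extra)
    # The checksum lets a later reader verify that the file is unchanged.
    record["sha256"] = hashlib.sha256(content).hexdigest()
    # Completeness check: list every required field that is absent or empty.
    record["missing"] = [f for f in REQUIRED_FIELDS if not record.get(f)]
    return record


if __name__ == "__main__":
    rec = describe("demo_HK_20190701_v2.csv", b"time,rate\n0,0.1\n", {"sop": "SOP: HK"})
    print(rec["version"], rec["missing"])
```

Encoding the version in the name and keeping a checksum next to it addresses the versioning question; the `missing` list gives a simple, automatable answer to the completeness question.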
Making Data Accessible
Specify which data will be made openly available taking into consideration
• What ethics and legal compliance issues do you have if any? Do you need
consent for data preservation and sharing? Do you have to protect certain
data? Is any data sensitive?
• Do you think you might have Intellectual Property Rights issues? Have you
considered ownership of the data, licensing, restrictions on use?
• Do you think you will need to embargo any data?
• How will you make the data available? (consider the platforms you will use:
databases, repositories, etc)
• What methods or software tools are needed to access the data? Should you
include documentation detailing how to obtain and use the software that is
needed for accessing the data? Is it possible to include this software with the
data (e.g. source code, Docker, etc.)?
• If there are any restrictions on accessibility, how will you provide access?
Making Data Interoperable
• What standards (metadata vocabularies, formats,
checklists) or methodologies will you use?
• How do you address data and model quality? What
validation steps do you foresee?
• Will you use standardised vocabulary for all data types
to allow inter-disciplinary interoperability?
• Where you cannot use a standardised vocabulary for all
types of data, can you map to more commonly used
ontologies?
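Mapping local labels to ontology terms, as the last question suggests, can start as a simple lookup table. The sketch below is illustrative only: the `EX:` identifiers are placeholders rather than real ontology accessions, and the label set is invented.

```python
# Hypothetical mapping from lab-local labels to ontology term identifiers.
# The "EX:" identifiers are placeholders, not real ontology accessions.
LOCAL_TO_ONTOLOGY = {
    "glc": "EX:0001",
    "glucose": "EX:0001",
    "pyr": "EX:0002",
    "lactate": "EX:0003",
}


def annotate(labels, mapping=LOCAL_TO_ONTOLOGY):
    """Split labels into (mapped, unmapped) after normalising case and whitespace."""
    mapped, unmapped = {}, []
    for label in labels:
        key = label.strip().lower()
        if key in mapping:
            mapped[label] = mapping[key]
        else:
            unmapped.append(label)
    return mapped, unmapped
```

The `unmapped` list is the useful output in practice: it shows which local terms still need a curator's decision.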
Making Data Re-usable
• How will you licence your data to permit the widest re-
use possible?
• When will the data be made available for re-use? Does
this include an embargo period? (if so, why?)
• Which data will be available for re-use during/after the
project? If not, why?
• What are your data quality assurance processes?
• How long do you expect your data to remain re-usable?
FAIRDOM Initiative
- develop a community
- establish an internationally sustained Data and Model Management service
- joint action of ERA-Net EraSysAPP and European Research Infrastructure ISBE
A bit of history: 11-Year Anniversary
Timeline: 2008, 2010, 2012, 2014, 2016, 2018, 2020
Standards based asset
management (data,
models, workflows,
SOPs…) for multi-party
projects
Sensitive sharing
Self-deposit / curation
Mixed stewardship skills
Legacy local systems
Community resources
Started in Systems
Biology. Now widened.
SEEK Software
- Open source web platform for sharing scientific research assets, processes
and outcomes
- Associations between data along with information about the people and
organisations (yellow pages)
- ISA (Investigation, Study, Assay) structure for describing how individual
experiments are aggregated into studies and investigations
- Flexible and detailed sharing permissions
- DOI can be generated for individual items, or entire aggregates
- Semantic technology, allowing sophisticated queries over the content
- Collection of metadata
https://seek4science.org/
Models
Data files
SOP Standard
Operating
Procedures
Documents
People
Programmes
Projects
Publications
SEEK Software
Presentations
Events
Samples
Catalogue of distributed data
Personal Data
Local Stores
External
Databases
Articles
Models
Standards
SOPs
Investigation
Study Analysis
Data
Model
SOP(Assay)
https://fairdomhub.org/investigations/56
Investigation - Study - Assay
https://fairdomhub.org/investigations/56
Investigation:
Glucose metabolism in P.
falciparum trophozoites
Study:
Model construction
Study:
Model validation
Assay: LDH
Assay: PK
Assay: ENO
Assay: PGM
Assay: PGK
Assay: GAPDH
Assay: TPI
Assay: ALD
Assay: PFK
Assay: PGI
Assay: HK
Assay: GLCtr
Assay: PYRtr
Assay: LACtr
Assay: G3PDH
Assay: GLYtr
Assay: ATPase
Data: GLCtr
Model: GLCtr
Data: HK
Model: HK
Steady state
Incubation
penkler1
Validation data
penkler2
Validation data
...
...
SOP: GLCtr
SOP: HK
...
SOP: Validation
Assay: Culturing
Assay: Lysate prep.
SOP: Culturing
SOP: Lysate prep.
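The Investigation - Study - Assay hierarchy in this example can be written down as a small tree of records. The sketch below is not SEEK's data model or API; the class and field names are assumptions, and the grouping of HK and GLCtr under "Model construction" is one reading of the flattened diagram that may not match the FAIRDOMHub record exactly.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    title: str
    data_files: List[str] = field(default_factory=list)
    models: List[str] = field(default_factory=list)
    sops: List[str] = field(default_factory=list)


@dataclass
class Study:
    title: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    title: str
    studies: List[Study] = field(default_factory=list)

    def assets(self):
        """Yield (study, assay, kind, name) for every asset linked in the tree."""
        for study in self.studies:
            for assay in study.assays:
                for kind in ("data_files", "models", "sops"):
                    for name in getattr(assay, kind):
                        yield study.title, assay.title, kind, name


# Titles are taken from the slide; the nesting is an assumption (see above).
investigation = Investigation(
    "Glucose metabolism in P. falciparum trophozoites",
    studies=[
        Study("Model construction", assays=[
            Assay("HK", ["Data: HK"], ["Model: HK"], ["SOP: HK"]),
            Assay("GLCtr", ["Data: GLCtr"], ["Model: GLCtr"], ["SOP: GLCtr"]),
        ]),
        Study("Model validation"),
    ],
)

if __name__ == "__main__":
    for row in investigation.assets():
        print(row)
```

Keeping data, model and SOP attached to the same assay is what preserves the experimental context when any one of them is found later.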
Design an ISA
Investigation - Study - Assay
People - Yellow Pages
Data Files, SOPs, Documents
- no file format restrictions
- some formats can be viewed directly in SEEK, e.g. Excel, Word, PDF, XML, PNG
Models
SBML Model simulation
Model comparison
Model versioning
Reproducing simulations
[Jacky Snoep, Dagmar Waltemath,
Martin Peters, Martin Scharm]
Tracking versions
Tracking model versions smartly
Scharm, M., Wolkenhauer, O., & Waltemath, D. (2015). An algorithm to detect and communicate the differences in
computational models describing biological systems. Bioinformatics
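To give a feel for what tracking model versions involves, the sketch below compares the parameter sets of two versions of a model. It is a much simpler stand-in and not the algorithm of Scharm et al.; the function name, the dictionary representation and the parameter names used with it are assumptions for illustration.

```python
def diff_parameters(old, new, tolerance=0.0):
    """Compare two {parameter_name: value} mappings from successive model versions."""
    added = {k: new[k] for k in new.keys() - old.keys()}
    removed = {k: old[k] for k in old.keys() - new.keys()}
    # A shared parameter counts as changed if its value moved by more than the tolerance.
    changed = {
        k: (old[k], new[k])
        for k in old.keys() & new.keys()
        if abs(old[k] - new[k]) > tolerance
    }
    return {"added": added, "removed": removed, "changed": changed}


if __name__ == "__main__":
    # Hypothetical parameter values, not taken from any published model.
    v1 = {"Vmax_HK": 1.0, "Km_glc": 0.5, "Km_atp": 0.2}
    v2 = {"Vmax_HK": 1.5, "Km_glc": 0.5, "Ki_g6p": 0.05}
    print(diff_parameters(v1, v2))
```

A real model diff also has to handle changes in reactions, species and structure, which is why a dedicated algorithm is needed.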
Spreadsheet Templates
Embed ontologies into
Excel templates
Excel spreadsheets enriched
with ontology annotations
Upload, extract metadata and register
http://www.rightfield.org.uk
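The idea behind ontology-enriched templates is that each column only accepts terms from a controlled list. The sketch below imitates that check on CSV text with the standard library. It does not use RightField or read Excel files, and the template columns and allowed terms are invented for illustration.

```python
import csv
import io

# Hypothetical template: column name -> allowed terms (None means free text).
TEMPLATE = {
    "sample_id": None,
    "organism": {"Plasmodium falciparum", "Homo sapiens"},
    "unit": {"mM", "uM", "nM"},
}


def validate(csv_text, template=TEMPLATE):
    """Return a list of (row_number, column, problem) for cells that break the template."""
    reader = csv.DictReader(io.StringIO(csv_text))
    problems = []
    missing = [c for c in template if c not in (reader.fieldnames or [])]
    for column in missing:
        problems.append((1, column, "missing column"))
    # Row 1 is the header, so data rows are numbered from 2.
    for row_number, row in enumerate(reader, start=2):
        for column, allowed in template.items():
            if column in missing or allowed is None:
                continue
            value = (row[column] or "").strip()
            if value not in allowed:
                problems.append((row_number, column, f"unexpected term {value!r}"))
    return problems
```

Running such a check at upload time is one way to catch free-text variants (for example an abbreviated organism name) before they reach the catalogue.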
Samples
Generation of templates for sample types
Sample extraction from spreadsheets
HTP sample referencing and
metadata migration
Data Sharing in SEEK
Publishing in SEEK
Publishing in SEEK - DOI
https://fairdomhub.org/investigations/56
Publishing in SEEK
Fix state with particular
versions
Active entry continues to
evolve
Assign a DOI
DOI in Publication
More than simple supplementary materials
16 datafiles (kinetic, flux inhibition, runout)
19 models (kinetics, validation)
13 SOPs
3 studies (model analysis, construction,
validation)
24 assays/analyses (simulations, model
characterisations)
Penkler, G., du Toit, F., Adams, W., Rautenbach, M.,
Palm, D. C., van Niekerk, D. D. and Snoep, J. L. (2015),
Construction and validation of a detailed kinetic model
of glycolysis in Plasmodium falciparum. FEBS J, 282:
1481–1511. doi:10.1111/febs.13237
Scharm M, Wendland F, Peters M, Wolfien M, Theile T, Waltemath D
SEMS, University of Rostock
zip-like file with a manifest & metadata
- Bundling files - Keeping provenance
- Exchanging data - Shipping results
Bergmann, F. T., Adams, R., Moodie, S., Cooper, J., Glont, M., Golebiewski, M., ... & Olivier, B. G. (2014). COMBINE archive and OMEX format:
one file to share all information to reproduce a modeling project. BMC Bioinformatics, 15(1), 1.
Packaging: COMBINE archive
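The "zip-like file with a manifest" idea can be sketched with Python's standard library. The code below writes a zip archive with a `manifest.xml` that lists each bundled file and its format. The namespace and format URIs follow the pattern I recall from the OMEX paper and should be checked against the current specification; the function name and input layout are assumptions, and the metadata file that a full archive would carry is left out.

```python
import io
import zipfile
from xml.sax.saxutils import quoteattr

# Assumed identifiers; verify against the COMBINE archive specification.
OMEX_MANIFEST = "http://identifiers.org/combine.specifications/omex-manifest"
OMEX_ARCHIVE = "http://identifiers.org/combine.specifications/omex"


def build_archive(files):
    """Bundle {name: (format_uri, payload_bytes)} into an in-memory zip with a manifest."""
    # The archive itself and the manifest are listed first, then each payload file.
    entries = [(".", OMEX_ARCHIVE), ("./manifest.xml", OMEX_MANIFEST)]
    entries += [(f"./{name}", fmt) for name, (fmt, _) in files.items()]

    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             f"<omexManifest xmlns={quoteattr(OMEX_MANIFEST)}>"]
    for location, fmt in entries:
        lines.append(f"  <content location={quoteattr(location)} format={quoteattr(fmt)}/>")
    lines.append("</omexManifest>")

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.xml", "\n".join(lines).encode("utf-8"))
        for name, (_, payload) in files.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


if __name__ == "__main__":
    sbml = "http://identifiers.org/combine.specifications/sbml"  # assumed format URI
    data = build_archive({"model.xml": (sbml, b"<sbml/>")})
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        print(archive.namelist())
```

Because the manifest travels inside the archive, a recipient can tell which file is the model and which is data without relying on file names alone.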
Standards-based metadata framework for
bundling (scattered) resources with context and citation
Packaging: Research Objects
http://researchobject.org
SEEK as project-specific local
instances or as central FAIRDOMHub
Service hosted at HITS
(Institutional Guarantee at least until 2029)
FAIRDOMHub Statistics
1st July 2019
Programmes 60
Projects 144
Institutions 274
People 1291
Data files 2280
Models 487
SOPs 301
Sample types 63
Presentations 729
Publications 370
Events 178
FAIRDOM Platform
Free and Open Source
Front end
Project(s) Hub
Back end
Onsite storage & analytics
On site
Tracking, data analytic pipelines,
Extract, Transform and Load direct from the
instruments, large data management
LIMS, auto-archiving
Web-based portal
Project controlled spaces
Metadata catalogue & Yellow pages
Results repository, dissemination and collaboration
Tool gateway
Back end
Instrument Data Management, LIMS, ELN
• samples
• protocols
• instruments
• data management
• experimental description
Norway’s national e-Infrastructure
for Life Science
https://nels.bioinfo.no/
Electronic Laboratory Notebook and Laboratory
Information Management System (ELN-LIMS)
https://csb.ethz.ch/tools/software/openbis-lims-eln.html
[Adapted from Ursula Klingmüller, Martin Böhm]
Excemplify
Antibody
Database
FAIR collaboration
from the ERANet ERASysAPP
Programme
Overarching research theme (The Digital Salmon)
Project
Research grant (DigiSal, GenoSysFat)
Investigation
A particular biological process, phenomenon or thing
(typically corresponds to [plans for] one or more closely related
papers)
Study
Experiment whose design reflects a specific biological research
question
Assay
Standardized measurement or diagnostic experiment using a
specific protocol
(applied to material from a study)
Jon Olav Vik,
Norwegian University of Life Sciences
Integration with Norway’s national
e-Infrastructure for Life Science (NeLS)
• Project controlled protected spaces
– Working space, show space for results
– Supp. materials space for publications
– Yellow pages and collaboration
– Upload or link to data
• One place catalogue
– Regardless of physical store
– ISA with shared metadata
– Standards-compliant
• Linked with other systems
– Project on-site (secure) repositories
– Public deposition archives
– Integrated with JWSOnline modelling tools
Front End
Find, Access and Organise assets
“Using FAIRDOMHub my own
lab colleagues saw what I was
doing and called to
collaborate!”
https://fairdomhub.org/
Catalogue across repositories regardless
of location
In House Stores
External Databases
Publishing services
Secure Stores
Model Resources
Upload or
Reference
Active and Published Data
Metadata Exchange along the Pipeline
ELNs
PALs - Project Area Liaisons
PALs
DM Team
Data management training
Requirements & Suggestions
• Training needs for users
• Suggestions to improve SEEK
• Requirements for new SEEK
features and DM services
PALs - Project Area Liaisons
- our user focus group
- post docs, postgrads and techs
- experimentalists, modellers and bioinformaticians
- advocates who communicate our progress back to their projects
Data Stewards
function, profession, cultural shift
• 500,000 needed in Europe*
• Specialist skills
• Career pathways
• Recognition
Curation and management
• Supported, Resourced
• Recognised, Rewarded
Sharing policy and practice embedded
* Realising the European Open Science Cloud (2016)
Stewardship Support
Independent
researchers
Facilities
Centres
Projects
Programmes
Infrastructures
Different Users, Different Use
LiSyM (Liver Systems Medicine)
German Research Network on
Systems Medicine for Liver Disease
Supported by
The German Federal Ministry of Education and Research 2016-2020
Multiple disciplines
• Medicine
• Biology, Biochemistry
• Pharmacology
• Physics
• Bioinformatics
• Data management
• Industry
38 independent research groups:
• Bayer AG
• Max Planck (Dresden and Berlin)
• MEVIS Fraunhofer (Bremen)
• Leibniz Institute IfaDo (Dortmund)
• Charité (Berlin)
• DKFZ (Heidelberg)
• Hospitals: Dresden, Kiel, Aachen, Homburg,
Berlin, Heidelberg, Munich
• + 18 Universities
http://www.lisym.org/
LiSyM Data Management
Clinical data sharing concept
Goal:
• Diffuse description of data
throughout consortium
Challenge:
• Some partners cannot share
Solution:
• Share table structure
• Create & share common code
• Create summaries in a distributed way (each partner locally)
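The solution on this slide (shared table structure, common code, locally created summaries) can be illustrated with a small sketch. It assumes every partner holds one numeric column in the agreed structure and runs the same function on it; only the aggregates leave the site. The function names and the choice of statistics are assumptions, not the LiSyM implementation.

```python
import math


def local_summary(values):
    """Runs at each site on its own data; only these aggregates are shared."""
    return {"n": len(values), "sum": sum(values), "sum_sq": sum(v * v for v in values)}


def pooled(summaries):
    """Combine per-site aggregates into a consortium-wide mean and standard deviation."""
    n = sum(s["n"] for s in summaries)
    total = sum(s["sum"] for s in summaries)
    total_sq = sum(s["sum_sq"] for s in summaries)
    mean = total / n
    # Sample variance from pooled sums; requires at least two values overall.
    variance = (total_sq - n * mean * mean) / (n - 1)
    return {"n": n, "mean": mean, "sd": math.sqrt(max(variance, 0.0))}


if __name__ == "__main__":
    # Hypothetical measurements held at two sites that cannot exchange raw data.
    site_a = local_summary([1.0, 2.0, 3.0])
    site_b = local_summary([4.0, 5.0])
    print(pooled([site_a, site_b]))
```

The pooled result is identical to what a central analysis of all raw values would give, yet no individual record is transferred. Whether aggregates from very small groups are safe to share is a separate question for the ethics and legal review.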
NMTrypI
Trypanosomatid parasites cause sleeping
sickness, leishmaniasis and Chagas
disease in Africa, South America and
India
EU-funded project 2014 – 2017
Goal: new candidate drugs against
Trypanosomatidic infections
Consortium: 12 partners (3 SMEs and 9
academics) in Europe and in disease-
endemic countries (Italy, Greece, Portugal,
Spain, Germany, UK, Sudan, Brazil)
https://fp7-nmtrypi.eu
NMTrypI specific challenges
• New visualizations of spreadsheet data
• Cross-references with external databases
• Chemical compound specific features
– show structure
– allow (sub)structure search
– create compound summary reports
Visualization of enzyme inhibition by different compounds (in %)
Heat map + Parallel coordinates plot
Automatic detection of UniProt IDs in Excel tables and
links to UniProtKB and STRING-DB
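Detecting accessions in a table comes down to pattern matching on cell values. The sketch below uses the accession pattern published in the UniProt help pages and assumes the spreadsheet has already been loaded into a list of rows. The URL template is an assumption about the current UniProt website and should be verified; links to STRING are left out because they also need the organism.

```python
import re

# Accession pattern as published in the UniProt help pages.
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)

# Assumed URL pattern; check it against the UniProt website before relying on it.
UNIPROT_URL = "https://www.uniprot.org/uniprotkb/{accession}"


def link_accessions(rows):
    """Return {(row, column): url} for every cell that is exactly a UniProt accession."""
    links = {}
    for r, row in enumerate(rows):
        for c, cell in enumerate(row):
            text = str(cell).strip()
            # fullmatch avoids false hits on longer strings that merely contain an ID.
            if UNIPROT_ACCESSION.fullmatch(text):
                links[(r, c)] = UNIPROT_URL.format(accession=text)
    return links


if __name__ == "__main__":
    # Example strings chosen only to exercise the pattern.
    table = [["protein", "inhibition"], ["P12345", 42.0], ["A0A024R161", 17.5], ["HK", 3]]
    for position, url in link_accessions(table).items():
        print(position, url)
```

Matching whole cells only keeps enzyme abbreviations such as HK or GAPDH from being mistaken for accessions.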
Chemical compound specific features
de.NBI - The German Network for
Bioinformatics Infrastructure
de.NBI consortium
• 39 project partners
• 30 institutions
• 8 service centers
https://www.denbi.de/
Mission
• Provide, expand and improve specialized
bioinformatics tools
• Provide access to computing and storage
capacities
• Provide regular training events and workshops
• Maintain and develop specific high-quality data
resources
Research and service topics of de.NBI service centers
HD-HuB
Bioinformatics Infrastructures in Biomedical Research
• Human genetics and genomics
• Metagenomics
• Systematic phenotyping of human cells
• Epigenetics
BiGi
Microbial Research for Biotechnology and Medicine
• High performance computing services
• Repository of reusable workflows
• Comparative genomics and meta-omics
• Post-genomics data integration
BioData
Reference Databases, Services and Tools
• Ribosomal RNAs (SILVA)
• Environmental data (PANGAEA)
• Taxon-associated metadata (BacDive)
• Enzymes & Ligands (BRENDA/EnzymeStructures)
CIBI
Tools for omics data and imaging
• Open-source libraries (OpenMS, SeqAn, FIJI)
• Tools for NGS, mass spec, and imaging
• Workflow engine (KNIME) for automation
• (Multi-)omics data analysis workflows
RBC
RNA Bioinformatics
• Analysis of RNA-related data
• Life science data analysis with Galaxy
• Meta-transcriptomics
• Epigenetic research
de.NBI-SysBio
Standards-based Systems Biology
• Data and model management tools
• SABIO-RK reaction kinetics data
• Methods and tools for modeling in Systems Biology
• Standards & tools for model search and management
GCBN
Crops and BioGreenFormatics
• Plant genetic resources and traits
• Bridging genotypes to phenotypes
• Plant gene and genome annotation
• Enabling technologies to improve crops
BioInfra.Prot
Bioinformatics for Proteomics
• Comprehensive proteomics workflow
• Data publication, analysis & tool services
• Quality standards for targeted proteomics
• Lipidomics
Current Actions in de.NBI
• Goal: Make Data FAIRness part of all de.NBI centers
• Idea: Have service centers collect more metadata. No metadata, no
service.
• Approach: Build use cases that involve data management and service
centers
Two example use cases: Medical proteomics center
• Statistical advice service
– tracking of advice given
– making reports FAIR
• From data to PRIDE
– Catalogue links to PRIDE in SEEK/FAIRDOMHub
– Store and standardise intermediate files
Summary FAIRDOM
FAIRDOM Software Platform+Tools
A Central Public Hub
for Projects
Customised Project
Installations
Project Stewardship
Consultancy Services
Community
Activities
144 Projects 30+ Installations
Summary FAIRDOM
Find & Access Central catalogue
Link to original files and external resources
Search
Metadata tagging and standards
Yellow pages of projects and people
Access control to spaces
Embedded tools
Interoperate Rich metadata, standards compliance
Consistent reporting – ISA
Curation support
Integration with other resources, archives, tools
Export packages
Reuse Secure sharing space
Long term retention
Reproducible publication
- Where do you store your experimental data?
- What happens to the data when a PhD student leaves the group?
- Are all data complete for a publication?
- Do you make regular backups of your local machine?
- Do you send emails to share data with your colleagues?
- Do you always store email attachments in your local directory?
- Do you store all different versions of a data file together in the same place?
- Which protocol was used for the experiment?
...
Why do you need data management?
What can you do? Be FAIR!
1. make a Data Management Plan
2. use standard identifiers
3. use metadata standards
4. catalogue / register data with metadata
5. define and share your SOPs
6. use data (assets) management platforms and tools that work
together
7. deposit into public archives
8. have a sustainability / end project plan
9. resource and support, and that means people too
10. embed data management into work practices and do some
training
11. give credit
12. check if you have sensitive data issues
13. educate your supervisors, institutions and peers
FAIRDOM Team
Thanks to our sponsors, partners and
collaborators
Thank you!
https://fair-dom.org/
Questions?
ulrike.wittig@h-its.org

DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSSLeenakshiTyagi
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoSérgio Sacani
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfrohankumarsinghrore1
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticssakshisoni2385
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...anilsa9823
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)Areesha Ahmad
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡anilsa9823
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencySheetal Arora
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxAArockiyaNisha
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsSumit Kumar yadav
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...Sérgio Sacani
 
fundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyfundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyDrAnita Sharma
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...RohitNehra6
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bSérgio Sacani
 

Último (20)

Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSS
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
Lucknow 💋 Russian Call Girls Lucknow Finest Escorts Service 8923113531 Availa...
 
The Philosophy of Science
The Philosophy of ScienceThe Philosophy of Science
The Philosophy of Science
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
 
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls AgencyHire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
Hire 💕 9907093804 Hooghly Call Girls Service Call Girls Agency
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questions
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
fundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomologyfundamental of entomology all in one topics of entomology
fundamental of entomology all in one topics of entomology
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Biopesticide (2).pptx .This slides helps to know the different types of biop...
Biopesticide (2).pptx  .This slides helps to know the different types of biop...Biopesticide (2).pptx  .This slides helps to know the different types of biop...
Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 

FAIR BioData Management

  • 1. FAIR BioData Management Ulrike Wittig Heidelberg Institute for Theoretical Studies, Germany
  • 2. - Where do you store your experimental data? - What happens with data when a PhD student leaves the group? - Are all data complete for a publication? - Do you make regular backups of your local machine? - Do you send emails to share data with your colleagues? - Do you always store email attachments in your local directory? - Do you store all different versions of a data file together in the same place? - Which protocol was used for the experiment? ... Why do you need data management?
  • 3. Vahan Simonyan, Center for Biologics Evaluation and Research, Food and Drug Administration, USA How well is your experiment documented?
  • 4. • Track collection of raw and processed (secondary) data, models & metadata • Maintain experimental context • Organise and link assets • Choose what to keep and what to ditch • Report consistently • Reproducible publications • Promote standardised metadata practices • Exchange among colleagues • How and when to share and publish • Get and give credit • Retain and find beyond project • Integrate with legacy, home grown, external systems • Reuse tools and community archives • Support automation and analytics workflows. Support curation CREATING DATA PROCESSING DATA ANALYSING DATA PRESERVING DATA ACCESS TO DATA RE-USING DATA Purpose of Project Data Management
  • 5. Purpose of Project Data Management Organisation Communication Dissemination Partners Funders Public
  • 6. The FAIR Guiding Principles for scientific data management and stewardship https://www.nature.com/articles/sdata201618 (2016)
  • 8. FAIR Checklists Making Data Findable (documentation and metadata management) • What documentation and metadata will accompany the data (assist its discoverability)? (Details on methodology, definitions, procedures, SOPs, vocabularies, units, dependencies, etc) • What information is needed for the data to be read and interpreted in the future? • What naming conventions will be used? • How will you approach versioning your data? • How will you capture / create this documentation and metadata? • How do you ensure the completeness of the captured data? Making Data Accessible Specify which data will be made openly available taking into consideration • What ethics and legal compliance issues do you have if any? Do you need consent for data preservation and sharing? Do you have to protect certain data? Is any data sensitive? • Do you think you might have Intellectual Property Rights issues? Have you considered ownership of the data, licensing, restrictions on use? • Do you think you will need to embargo any data? • How will you make the data available? (consider the platforms you will use: databases, repositories, etc) • What methods or software tools are needed to access the data? Should you include documentation detailing how to access and use the software that is needed for accessing the data? Is it possible to include this software with the data (e.g. source code, docker etc)? • If there are any restrictions on accessibility, how will you provide access? Making Data Interoperable • What standards (metadata vocabularies, formats, checklists) or methodologies will you use? • How do you address data and model quality? What validation steps do you foresee? • Will you use standardised vocabulary for all data types to allow inter-disciplinary interoperability? • Where you cannot use standardised vocabulary for all types of data, can you map to more commonly used ontologies?
Making Data Re-usable • How will you licence your data to permit the widest re-use possible? • When will the data be made available for re-use? Does this include an embargo period? (if so, why?) • Which data will be available for re-use during/after the project? If not, why? • What are your data quality assurance processes? • How long do you expect your data to remain re-usable?
  • 9. FAIRDOM Initiative - develop a community - establish an internationally sustained Data and Model Management service - joint action of ERA-Net EraSysAPP and European Research Infrastructure ISBE
  • 10. A bit of history: 11 Year Anniversary 2008 2010 2012 2014 2016 2018 2020 Standards based asset management (data, models, workflows, SOPs…) for multi-party projects Sensitive sharing Self-deposit / curation Mixed stewardship skills Legacy local systems Community resources Started in Systems Biology. Now widened.
  • 11. SEEK Software - Open source web platform for sharing scientific research assets, processes and outcomes - Associations between data along with information about the people and organisations (yellow pages) - ISA (Investigation, Study, Assay) structure for describing how individual experiments are aggregated into studies and investigations - Flexible and detailed sharing permissions - DOI can be generated for individual items, or entire aggregates - Semantic technology, allowing sophisticated queries over the content - Collection of metadata https://seek4science.org/
  • 13. Catalogue of distributed data Personal Data Local Stores External Databases Articles Models Standards SOPs
  • 15. Investigation - Study - Assay https://fairdomhub.org/investigations/56
  • 16. Investigation: Glucose metabolism in P. falciparum trophozoites Study: Model construction Study: Model validation Assay: LDH Assay: PK Assay: ENO Assay: PGM Assay: PGK Assay: GAPDH Assay: TPI Assay: ALD Assay: PFK Assay: PGI Assay: HK Assay: GLCtr Assay: PYRtr Assay: LACtr Assay: G3PDH Assay: GLYtr Assay: ATPase Data: GLCtr Model: GLCtr Data: HK Model: HK Steady state Incubation penkler1 Validation data penkler2 Validation data ... ... SOP: GLCtr SOP: HK ... SOP: Validation Assay: Culturing Assay: Lysate prep. SOP: Culturing SOP: Lysate prep. Design an ISA Investigation - Study - Assay
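The Investigation - Study - Assay nesting on this slide can be sketched as a small data structure. This is only an illustration, not SEEK's own data model: the class and field names are hypothetical, and the assignment of the GLCtr, HK and steady-state assays to the two studies is an assumption read off the flattened slide text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Assay:
    """One measurement or analysis, with the assets attached to it."""
    title: str
    sops: List[str] = field(default_factory=list)
    data_files: List[str] = field(default_factory=list)
    models: List[str] = field(default_factory=list)


@dataclass
class Study:
    """A group of assays that answer one research question."""
    title: str
    assays: List[Assay] = field(default_factory=list)


@dataclass
class Investigation:
    """Top-level container that aggregates studies."""
    title: str
    studies: List[Study] = field(default_factory=list)

    def assay_titles(self) -> List[str]:
        # Flatten the hierarchy: every assay of every study, in order.
        return [a.title for s in self.studies for a in s.assays]


# Assumed grouping, for illustration only.
investigation = Investigation(
    title="Glucose metabolism in P. falciparum trophozoites",
    studies=[
        Study("Model construction", assays=[
            Assay("GLCtr", sops=["SOP: GLCtr"],
                  data_files=["Data: GLCtr"], models=["Model: GLCtr"]),
            Assay("HK", sops=["SOP: HK"],
                  data_files=["Data: HK"], models=["Model: HK"]),
        ]),
        Study("Model validation", assays=[
            Assay("Steady state", sops=["SOP: Validation"]),
        ]),
    ],
)

print(investigation.assay_titles())
```

Keeping SOPs, data files and models attached to the assay that produced them is what preserves the experimental context when an investigation is later published as a whole.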
  • 18. Data Files, SOPs, Documents - no file format restrictions - some formats can be viewed directly in SEEK: e.g. Excel, Word, PDF, XML, PNG
  • 19. Models SBML Model simulation Model comparison Model versioning Reproducing simulations [Jacky Snoep, Dagmar Waltemath, Martin Peters, Martin Scharm]
  • 21. Tracking model versions smartly Scharm, M., Wolkenhauer, O., & Waltemath, D. (2015). An algorithm to detect and communicate the differences in computational models describing biological systems. Bioinformatics
  • 22. Spreadsheet Templates Embed ontologies into Excel templates Excel spreadsheets enriched with ontology annotations Upload, extract metadata and register http://www.rightfield.org.uk
  • 23. Samples Generation of templates for sample types Sample extraction from spreadsheets HTP sample referencing and metadata migration
  • 26. Publishing in SEEK - DOI https://fairdomhub.org/investigations/56
  • 27. Publishing in SEEK Fix state with particular versions Active entry continues to evolve Assign a DOI
  • 29. More than simple supplementary materials 16 datafiles (kinetic, flux inhibition, runout) 19 models (kinetics, validation) 13 SOPs 3 studies (model analysis, construction, validation) 24 assays/analyses (simulations, model characterisations) Penkler, G., du Toit, F., Adams, W., Rautenbach, M., Palm, D. C., van Niekerk, D. D. and Snoep, J. L. (2015), Construction and validation of a detailed kinetic model of glycolysis in Plasmodium falciparum. FEBS J, 282: 1481–1511. doi:10.1111/febs.13237
  • 30. Scharm M, Wendland F, Peters M, Wolfien M, Theile T, Waltemath D SEMS, University of Rostock zip-like file with a manifest & metadata - Bundling files - Keeping provenance - Exchanging data - Shipping results Bergmann, F.T., Adams, R., Moodie, S., Cooper, J., Glont, M., Golebiewski, M., ... & Olivier, B. G. (2014). COMBINE archive and OMEX format: one file to share all information to reproduce a modeling project. BMC Bioinformatics, 15(1), 1. Packaging: COMBINE archive
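The "zip-like file with a manifest" idea can be sketched with Python's standard library alone. This is a simplified illustration, not a validated OMEX writer: the manifest namespace and format URIs are written from memory of the COMBINE specification and should be treated as assumptions, the file names and contents are hypothetical, and a real archive would usually also describe itself in the manifest and may carry a metadata file.

```python
import io
import zipfile
from xml.sax.saxutils import quoteattr

# Assumed namespace of the OMEX manifest; check the COMBINE spec before relying on it.
OMEX_NS = "http://identifiers.org/combine.specifications/omex-manifest"


def build_manifest(files):
    """Return manifest XML listing each file's location and format URI."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             f'<omexManifest xmlns="{OMEX_NS}">']
    for location, (format_uri, _payload) in files.items():
        lines.append(
            f'  <content location={quoteattr("./" + location)} '
            f'format={quoteattr(format_uri)}/>'
        )
    lines.append('</omexManifest>')
    return "\n".join(lines)


def build_archive(files):
    """Bundle the manifest and all files into one in-memory zip archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.xml", build_manifest(files))
        for location, (_format_uri, payload) in files.items():
            archive.writestr(location, payload)
    return buffer.getvalue()


# Hypothetical contents: file name -> (format URI, bytes).
example_files = {
    "model.xml": ("http://identifiers.org/combine.specifications/sbml", b"<sbml/>"),
    "data.csv": ("http://purl.org/NET/mediatypes/text/csv", b"time,glucose\n0,5.0\n"),
}
archive_bytes = build_archive(example_files)
```

Writing `archive_bytes` to a file ending in `.omex` gives a single artefact that can be exchanged or deposited; the manifest is what lets a receiving tool know which entry is the model and which is data.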
  • 31. Standards-based metadata framework for bundling (scattered) resources with context and citation Packaging: Research Objects http://researchobject.org
  • 32. SEEK as project-specific local instances or as central FAIRDOMHub Service hosted at HITS (Institutional Guarantee at least until 2029)
  • 33. FAIRDOMHub Statistics 1st July 2019 Programmes 60 Projects 144 Institutions 274 People 1291 Data files 2280 Models 487 SOPs 301 Sample types 63 Presentations 729 Publications 370 Events 178
  • 34. FAIRDOM Platform Free and Open Source Front end Project(s) Hub Back end Onsite storage & analytics On site Tracking, data analytic pipelines, Extract, Transform and Load direct from the instruments, large data management LIMS, auto-archiving Web-based portal Project controlled spaces Metadata catalogue & Yellow pages Results repository, dissemination and collaboration Tool gateway
  • 35. Back end Instrument Data Management, LIMS, ELN • samples • protocols • instruments • data management • experimental description Norway’s national e-Infrastructure for Life Science https://nels.bioinfo.no/ Electronic Laboratory Notebook and Laboratory Information Management System (ELN-LIMS) https://csb.ethz.ch/tools/software/openbis-lims-eln.html
  • 36. [Adapted from Ursula Klingmüller, Martin Böhm] Excemplify Antibody Database FAIR collaboration from the ERANet ERASysAPP
  • 37. Programme Overarching research theme (The Digital Salmon) Project Research grant (DigiSal, GenoSysFat) Investigation A particular biological process, phenomenon or thing (typically corresponds to [plans for] one or more closely related papers) Study Experiment whose design reflects a specific biological research question Assay Standardized measurement or diagnostic experiment using a specific protocol (applied to material from a study) Jon Olav Vik, Norwegian University of Life Science Integration with Norway’s national e-Infrastructure for Life Science (NeLS)
  • 38. • Project controlled protected spaces – Working space, show space for results – Supp. materials space for publications – Yellow pages and collaboration – Upload or link to data • One place catalogue – Regardless of physical store – ISA with shared metadata – Standards-compliant • Linked with other systems – Project on-site (secure) repositories – Public deposition archives – Integrated with JWSOnline modelling tools Front End Find, Access and Organise assets “Using FAIRDOMHub my own lab colleagues saw what I was doing and called to collaborate!” https://fairdomhub.org/
  • 39. Catalogue across repositories regardless of location In House Stores External Databases Publishing services Secure Stores Model Resources Upload or Reference
  • 41. Metadata Exchange along the Pipeline ELNs
  • 42. PALs - Project Area Liaisons PALs DM Team Data management training Requirements & Suggestions • Training needs for users • Suggestions to improve SEEK • Requirements for new SEEK features and DM services
  • 43. PALs - Project Area Liaisons - our user focus group - post docs, postgrads and techs - experimentalists, modellers and bioinformaticians - advocates and communicate our progress back to their projects
  • 44. Data Stewards function, profession, cultural shift • 500,000 needed in Europe* • Specialist skills • Career pathways • Recognition Curation and management • Supported, Resourced • Recognised, Rewarded Sharing policy and practice embedded * Realising the European Open Science Cloud (2016)
  • 47. LiSyM (Liver Systems Medicine) German Research Network on Systems Medicine for Liver Disease Supported by The German Federal Ministry of Education and Research 2016-2020 Multiple disciplines • Medicine • Biology, Biochemistry • Pharmacology • Physics • Bioinformatics • Data management • Industry 38 independent research groups: • Bayer AG • Max Planck (Dresden and Berlin) • MEVIS Fraunhofer (Bremen) • Leibniz Institute IfaDo (Dortmund) • Charité (Berlin) • DKFZ (Heidelberg) • Hospitals: Dresden, Kiel, Aachen, Homburg, Berlin, Heidelberg, Munich • + 18 Universities http://www.lisym.org/
  • 49. Clinical data sharing concept Goal: • Diffuse description of data throughout consortium Challenge: • Some partners cannot share Solution: • Share table structure • Create & share common code • Distributedly create summaries
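The "share common code, distributedly create summaries" approach can be illustrated as follows. This is a minimal sketch under stated assumptions, not the LiSyM implementation: each partner is assumed to run the same `summarise` function on a numeric column it cannot share, and only the three aggregate numbers leave the site. All names and values are hypothetical.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Iterable, List, Tuple


@dataclass
class LocalSummary:
    """Aggregates a site is willing to share instead of row-level data."""
    n: int
    total: float
    total_sq: float


def summarise(values: Iterable[float]) -> LocalSummary:
    """Run at each partner site on its private values."""
    values = list(values)
    return LocalSummary(len(values), sum(values), sum(v * v for v in values))


def pool(summaries: List[LocalSummary]) -> Tuple[int, float, float]:
    """Run centrally: combine site summaries into n, mean and sample SD."""
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)
    return n, mean, sqrt(variance)


# Hypothetical measurements held privately at two sites.
site_a = summarise([1.0, 2.0, 3.0])
site_b = summarise([4.0, 5.0])
n, mean, sd = pool([site_a, site_b])
print(n, mean, sd)
```

The pooled mean and standard deviation are identical to what a single analysis over all rows would give, which is why agreeing on the table structure and the shared code up front matters more than moving the data.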
  • 50. NMTrypI Trypanosomiasis causes sleeping sickness, leishmaniasis and Chagas disease - in Africa, South America and India EU-funded project 2014 – 2017 Goal: new candidate drugs against Trypanosomatidic infections Consortium: 12 partners (3 SMEs and 9 academics) in Europe and in disease-endemic countries (Italy, Greece, Portugal, Spain, Germany, UK, Sudan, Brazil) https://fp7-nmtrypi.eu
  • 51. NMTrypI specific challenges • New visualizations of spreadsheet data • Cross-references with external databases • Chemical compound specific features – show structure – allow (sub)structure search – create compound summary reports
  • 52. Visualization of enzyme inhibition by different compounds (in %): heat map + parallel coordinates plot
  • 53. Automatic detection of UniprotIDs in Excel-table and link to UniprotKB and StringDB
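Detecting UniProt identifiers in spreadsheet cells could look roughly like the sketch below. It is not the SEEK implementation: the regular expression follows the accession pattern UniProt documents, the URL layout in `uniprot_url` is an assumption about the current website, and the cell values are invented. Linking to StringDB is left out because its URL scheme is not given in the slides.

```python
import re
from typing import Iterable, List

# Accession pattern as documented by UniProt (6 or 10 characters).
UNIPROT_ACCESSION = re.compile(
    r"\b(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})\b"
)


def find_accessions(cells: Iterable[object]) -> List[str]:
    """Return unique accession-like strings in order of first appearance."""
    found: List[str] = []
    for cell in cells:
        for accession in UNIPROT_ACCESSION.findall(str(cell)):
            if accession not in found:
                found.append(accession)
    return found


def uniprot_url(accession: str) -> str:
    """Build a link to the entry page (URL layout is an assumption)."""
    return f"https://www.uniprot.org/uniprotkb/{accession}/entry"


# Hypothetical cell values, e.g. read from one column of an Excel sheet.
cells = ["P12345", "inhibition 42%", "Q9Y6K9; A0A024R161", "compound-7"]
for accession in find_accessions(cells):
    print(accession, uniprot_url(accession))
```

A pattern match only shows that a string has the shape of an accession, so a production version would still want to confirm each hit against UniProtKB before creating the link.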
  • 55. de.NBI - The German Network for Bioinformatics Infrastructure de.NBI consortium • 39 project partners • 30 institutions • 8 service centers https://www.denbi.de/ Mission • Provide, expand and improve specialized bioinformatics tools • Provide access to computing and storage capacities • Provide regular training events and workshops • Maintain and develop specific high-quality data resources
  • 56. Research and service topics of de.NBI service centers HD-HuB Bioinformatics Infrastructures in Biomedical Research • Human genetics and genomics • Metagenomics • Systematic phenotyping of human cells • Epigenetics BiGi Microbial Research for Biotechnology and Medicine • High performance computing services • Repository of reusable workflows • Comparative genomics and meta-omics • Post-genomics data integration BioData Reference Databases, Services and Tools • Ribosomal RNAs (SILVA) • Environmental data (PANGAEA) • Taxon-associated metadata (BacDive) • Enzymes & Ligands (BRENDA/EnzymeStructures) CIBI Tools for omics data and imaging • Open-source libraries (OpenMS, SeqAn, FIJI) • Tools for NGS, mass spec, and imaging • Workflow engine (KNIME) for automation • (Multi-)omics data analysis workflows RBC RNA Bioinformatics • Analysis of RNA-related data • Life science data analysis with Galaxy • Meta-transcriptomics • Epigenetic research de.NBI-SysBio Standards-based Systems Biology • Data and model management tools • SABIO-RK reaction kinetics data • Methods and tools for modeling in Systems Biology • Standards & tools for model search and management GCBN Crops and BioGreenFormatics • Plant genetic resources and traits • Bridging genotypes to phenotypes • Plant gene and genome annotation • Enabling technologies to improve crops BioInfra.Prot Bioinformatics for Proteomics • Comprehensive proteomics workflow • Data publication, analysis & tool services • Quality standards for targeted proteomics • Lipidomics de.NBI - The German Network for Bioinformatics Infrastructure
  • 57. Current Actions in de.NBI • Goal: Make Data FAIRness part of all de.NBI centers • Idea: Have service centers collect more metadata. No metadata, no service. • Approach: Build use cases that involve data management and service centers Two example use cases: Medical proteomics center • Statistical advice service – tracking of advice given – making reports FAIR • From data to PRIDE – Catalogue links to PRIDE in SEEK/FAIRDOMHub – Store and standardise intermediate files
  • 58. Summary FAIRDOM FAIRDOM Software Platform+Tools A Central Public Hub for Projects Customised Project Installations Project Stewardship Consultancy Services Community Activities 144 Projects 30+ Installations
  • 59. Summary FAIRDOM Find & Access Central catalogue Link to original files and external resources Search Metadata tagging and standards Yellow pages of projects and people Access control to spaces Embedded tools Interoperate Rich metadata, standards compliance Consistent reporting – ISA Curation support Integration with other resources, archives, tools Export packages Reuse Secure sharing space Long term retention Reproducible publication
  • 60. - Where do you store your experimental data? - What happens with data when a PhD student leaves the group? - Are all data complete for a publication? - Do you make regular backups of your local machine? - Do you send emails to share data with your colleagues? - Do you always store email attachments in your local directory? - Do you store all different versions of a data file together in the same place? - Which protocol was used for the experiment? ... Why do you need data management?
  • 61. What can you do? Be FAIR! 1. make a Data Management Plan 2. use standard identifiers 3. use metadata standards 4. catalogue / register data with metadata 5. define and share your SOPs 6. use data (assets) management platforms and tools that work together 7. deposit into public archives 8. have a sustainability / end project plan 9. resource and support, and that means people too 10. embed data management into work practices and do some training 11. give credit 12. check if you have sensitive data issues 13. educate your supervisors, institutions and peers
  • 63. Thanks to our sponsors, partners and collaborators