2. Disclaimer
All comic-style graphics in this presentation were done either by Anna
Zhukova or by Martin Peters. Thank you very much!
3. Who I am and
what I do
Projects. SEMS | de.NBI data management for German
Bioinformatics network | SBGN-ED+
Community work. Standard development | COMBINE
coordinator | SBML editor
Research interests. Reproducibility of modeling
results | Sustainability of scientific outcomes
Other things. Education of young scientists | Open
Access & open data | Gender equality in science
SEMS@University of Rostock, Germany (2015)
4. Model management.
Or: How I got into this reproducibility topic...
Timeline (2008, 2012, 2014, 2016): Reproduce simulations | Ship & archive
modeling results | Detect differences | Understand model evolution |
Develop management strategies for models
5. Why many want data managed
"I need support in organising the data for my thesis."
"Funders say I must make all project data available for the next 10 years."
"I need to share parts of my data with collaborators and want to keep track."
These are only some examples.
6. ...and why they still don’t do it.
"This takes time."
"The software does not support the format I need for my data."
"I do not want to share my data. I want full control."
These are only some examples – there are many, many more.
7. But why they should …
50+ % of research studies are not reproducible*!
*Study performed by Bayer (2011) to check the replicability of 67 results in cancer studies.
More in: Waltemath & Wolkenhauer (2016) IEEE TBME
8. Problem: Many data items
Characteristics of the data
– Heterogeneous
– Big
– Distributed
– Complex
Requirements of the field
– Long-term availability
– Thorough documentation/trust
– High data quality
– Interoperability & reusability
11. Use & follow a data
management plan
Data management
●
Procedures and actions that help to store, preserve, organize and
control the data generated during a (research) project.
Examples & resources
●
Data management plans provided by funders, e.g. NIH
●
Checklist for a data management plan
12. Use & follow a data
management plan
Key principles
●
Avoid re-collection of data
●
Keep control of data at all steps of the data life cycle
●
Justify data collection
●
Specify the collected data
●
Perform data audit
●
Archive the data
Is the data archived properly? What are the planned destruction mechanisms?
What kind of data is collected? How was it processed?
Is the data fit for purpose and held securely?
Is the data useful and the data collection effective?
13. Use a dedicated model
management system
Benefits
– Your data is organised and documented.
– Your data is kept safe (backup) and secure.
– User and sharing management for small and large
projects, and for work groups.
– Management functionality comes for free, e.g.
interlinks to other databases, version control,
search!
14. Use a dedicated model
management system
Example: FAIRDOMHub
– Data & model management for Systems Biology
– Follows the FAIR principles (Wilkinson et al 2016)
– User support, PALs meetings, online tutorials
– Project-based instances, ISA-Tab, but flexible
More information at: https://www.fairdomhub.org/
17. Use a dedicated model
management system
Version 2 Version 4
More information at: https://www.fairdomhub.org/
18. Use standards for data
sharing and interoperability
Guidelines, ontologies and standards for modeling & simulation of biological systems.
Fig.: Mosaic of standards, adapted from Chelliah et al (2009) DILS
19. Use standards for data
sharing and interoperability
Help develop standards
Access to all specifications
Tutorials, forums, mailing lists
Events
Figure: Draeger and Palsson (2014). More on COMBINE at: http://co.mbine.org
20. Publish, share & archive your
study in a model repository
Curated
Open
Standard formats
Repositories: BiGG, BioModels, JWS Online Model Database, Physiome Model Repository
22. Care for your models’ quality
●
MIASE and MIRIAM Guidelines → read, understand, implement.
●
COMBINE annotations (RDF / OWL / Bio-ontologies)
– To annotate models: COPASI, libSBML
– To annotate simulations: SED-ML Web Tools, JWS Online Simulator
– Specifically: Add SBO terms wherever possible to improve later
conversion between standards*
*Format converters for COMBINE standards: Rodriguez et al (2016)
Quality enhancer: Semantic annotations to bio-ontologies
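The advice to add SBO terms wherever possible can be checked mechanically. The following is a minimal sketch using only the Python standard library, not the API of COPASI or libSBML. The toy model, its ids and the chosen SBO identifiers are illustrative assumptions, and the helper name `missing_sbo_terms` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Toy SBML fragment (hypothetical model, not taken from any repository).
# The SBO identifiers are illustrative values.
SBML = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy">
    <listOfSpecies>
      <species id="glc" compartment="c" sboTerm="SBO:0000247"/>
      <species id="atp" compartment="c"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="hk" sboTerm="SBO:0000176"/>
      <reaction id="pfk"/>
    </listOfReactions>
  </model>
</sbml>"""


def missing_sbo_terms(sbml_text):
    """Return ids of species and reactions that carry no sboTerm attribute."""
    root = ET.fromstring(sbml_text)
    missing = []
    for element in root.iter():
        tag = element.tag.split("}")[-1]  # strip the XML namespace prefix
        if tag in ("species", "reaction") and "sboTerm" not in element.attrib:
            missing.append(element.get("id"))
    return missing


print(missing_sbo_terms(SBML))
```

A check like this only reports whether a term is present. Whether the term is the right one still needs a human curator.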
23. Care for your models’ quality
●
Open publication in model repositories, e.g.: in BioModels,
JWS Online Model Database, Physiome Model Repository
●
Full documentation of provenance, e.g.: Research Object framework
●
Export and publish study as COMBINE Archive, e.g.: using
COMBINE Archive Web, JWS Online, SED-ML Web Tools
Quality enhancer: Documented, reproducible simulation study
Link: JWS Online Simulation Database. Peters et al (2016, under revision)
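To illustrate what exporting a study as a COMBINE Archive amounts to, here is a minimal sketch that packs a model and a simulation description into a ZIP container with a manifest. It is an assumption-laden illustration, not the export routine of any of the tools named above. The manifest layout and format URIs follow my reading of the OMEX specification and should be checked against it; the file names, the contents and the function `build_archive` are hypothetical.

```python
import io
import zipfile

# Namespace of the manifest, as I understand the OMEX specification.
OMEX_NS = "http://identifiers.org/combine.specifications/omex-manifest"


def build_archive(files):
    """Pack {location: (format_uri, bytes)} into an in-memory COMBINE archive."""
    entries = [
        '  <content location="." '
        'format="http://identifiers.org/combine.specifications/omex"/>',
        f'  <content location="./manifest.xml" format="{OMEX_NS}"/>',
    ]
    for location, (format_uri, _) in files.items():
        entries.append(f'  <content location="./{location}" format="{format_uri}"/>')
    manifest = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<omexManifest xmlns="{OMEX_NS}">\n' + "\n".join(entries) + "\n</omexManifest>\n"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.xml", manifest)
        for location, (_, payload) in files.items():
            archive.writestr(location, payload)
    return buffer.getvalue()


# Hypothetical study: file names and contents are placeholders.
study = {
    "model.xml": ("http://identifiers.org/combine.specifications/sbml", b"<sbml/>"),
    "simulation.sedml": ("http://identifiers.org/combine.specifications/sed-ml", b"<sedML/>"),
}
omex_bytes = build_archive(study)
print(zipfile.ZipFile(io.BytesIO(omex_bytes)).namelist())
```

Keeping model, simulation setup and manifest in one file is what makes the study shippable as a single unit.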
24. Care for your models’ quality
●
Functional curation (testing models under a range of perturbations), e.g.:
in the Cardiac Electrophysiology Web Lab
●
Documentation of origin for all parameter values
●
Linking model – simulation studies – experimental data – conditions –
simulation data – publication
Quality enhancer: Validation of model behavior
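The idea of functional curation, testing a model under a range of perturbations, can be sketched with a toy example. This is not the Cardiac Electrophysiology Web Lab or its protocol language. The one-variable model dx/dt = k_in - k_out * x, the parameter values, and the functions `simulate` and `run_protocol` are assumptions chosen for illustration.

```python
def simulate(k_in, k_out, x0=0.0, dt=0.01, t_end=50.0):
    """Integrate dx/dt = k_in - k_out * x with the explicit Euler method."""
    x = x0
    for _ in range(int(t_end / dt)):
        x += dt * (k_in - k_out * x)
    return x


def run_protocol(k_in, k_out_values, tolerance=1e-3):
    """Perturb k_out and check the end state against the expected k_in / k_out."""
    report = {}
    for k_out in k_out_values:
        final = simulate(k_in, k_out)
        report[k_out] = abs(final - k_in / k_out) < tolerance
    return report


results = run_protocol(k_in=2.0, k_out_values=[0.5, 1.0, 2.0, 4.0])
print(results)
```

Each entry of the report records whether the model still behaves as expected under one perturbation, so the check can be rerun after every change to the model.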
25. Care for your models’ quality
Figures: Electrophysiology Web Lab Cooper et al (2016)
26. In summary: Make your
study valuable & sustainable
Check reproducibility prior to publication!
Steps towards making a study reproducible: Henkel et al (2013), Springer – closed access :(
27. If your work is available, documented, and open,
we can index it, so it can be retrieved by others.
28. Collecting & integrating
modeling data
MASYMOS: Store models
Figure (left): Visualising database content for 6 BioModels & versions (courtesy M. Peters),
Figure (right): Henkel et al (2013) DATABASE
30. Provenance – who changed
what when where and why?
BiVeS: Keep track of changes in a model
More information in: Scharm et al (2015) BIOINFORMATICS, https://sems.uni-rostock.de/projects/bives/
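Keeping track of changes between two versions of a model can be illustrated with a deliberately simple comparison. This is not the BiVeS algorithm, which works on the XML structure of the model documents. The sketch assumes each version has already been reduced to a mapping from element id to attributes; the ids, values and the function `diff_models` are hypothetical.

```python
def diff_models(old, new):
    """Compare two {element_id: attributes} mappings and classify the changes."""
    inserted = sorted(set(new) - set(old))
    deleted = sorted(set(old) - set(new))
    updated = sorted(k for k in set(old) & set(new) if old[k] != new[k])
    return {"inserted": inserted, "deleted": deleted, "updated": updated}


# Two hypothetical versions of the same toy model.
version_a = {"glc": {"initial": 1.0}, "atp": {"initial": 2.0}, "hk": {"k": 0.1}}
version_b = {"glc": {"initial": 1.0}, "atp": {"initial": 2.5}, "pfk": {"k": 0.3}}

print(diff_models(version_a, version_b))
```

Storing such a change report next to each new version gives a first answer to the "what changed" part of the provenance question. The who, when, where and why still have to be recorded separately.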
31. Provenance – who changed
what when where and why?
MOST: Keep track of changes in public model repositories
Figure: model versions 3 (05-06-2006), 4 (03-10-2006) and 5 (05-01-2007), connected by BiVeS diffs 3-4 and 4-5, and versions 13 (26-01-2010), 14 (30-09-2010) and 15 (15-04-2011).
Figure courtesy V. Touré, Scharm et al (in preparation), http://most.sems.uni-rostock.de
32. Reusable models
Fully featured COMBINE archive: Example of a complete COMBINE archive (BIOM 144).
Recon 2: Reconstruction of human metabolism reuses existing networks.
Whole cell model: Based on >170 publications. All model-related data & code available.
These are only some examples. Much to explore on BioModels, FAIRDOMHub, biosharing, ...
33. Thank you!
Contact me if you want:
•
help with our tools
•
help with COMBINE standards
•
to set up a FAIRDOMHub project
•
to get involved in all the exciting efforts.
Ron Henkel
MASYMOS
Martin Peters
M2CAT, JWS, MASYMOS
Martin Scharm
BiVeS, Web Lab
Tom Gebhardt
MOST
Vasundra Touré
SBGN-ED
Mariam Nassar
Ranking, MASYMOS
@dagmarwaltemath
Orcid: 0000-0002-5886-5563