Presentation on how to enable model reuse in systems biology. Presented as part of the series "Führende Köpfe in der IT - Wissenschaftlerinnen im Dialog" (ZB Med, Bonn, Germany)
2019 07-04-model reuse-bonn
1. From paper-based model description to interactive simulation of disease progression
PROF. DR.-ING. DIPL.-INF. DAGMAR WALTEMATH
MEDICAL INFORMATICS | INSTITUTE FOR COMMUNITY MEDICINE
UNIVERSITY MEDICINE GREIFSWALD (GERMANY)
MODEL REUSE WITH JOY
2. About me
SEMS@University of Rostock, Germany (2015)
7/4/2019 DAGMAR WALTEMATH | MODEL REUSE WITH JOY 2
Projects. SEMS | de.NBI:SYSBIO | SBGN-ED+ |
INCOME | MIRACUM
Community work. Standard development | COMBINE
coordinator | SED-ML editor
Research interests. Data integration | Semantics |
Reproducibility of scientific results | Sustainability of
scientific outcomes
Further interests. Education of young scientists | Open
Access & open data | Gender equality in science
@dagmarwaltemath
0000-0002-5886-5563
3. How this talk is organised
THE HISTORY THE SCIENCE
Disclaimer: All comic-style graphics in this presentation
were done either by Anna Zhukova or by Martin Peters.
Thank you very much! Images downloaded from pixabay.
4. Systems Biology is…
Systems biology is the science that studies
how biological function emerges from the
interactions between the components of living
systems.
… and how these emergent properties enable
or constrain the behavior of these
components.
(Slide adapted from: Olaf Wolkenhauer)
5. Simulation models can take many forms.
MATHEMATICAL MODELS FURTHER APPROACHES
Fig.s: https://doi.org/10.1371/journal.pcbi.1002815, https://doi.org/10.1371/journal.pcbi.1004591
6. Simulation models can be complex.
First in silico Whole Cell Model
Genome (525 genes), transcriptome, proteome and metabolome
incorporated
Describes whole life cycle of a single cell on molecular level, and
predicts a wide range of cellular behaviors, and
accounts for the specific function of every annotated gene product
Based on 900 publications
Consists of 116 MATLAB files
Incorporates over 1,900 experimentally observed parameters
WHOLE-CELL MODEL KEY FIGURES
Fig.: Karr et al. (2012), https://doi.org/10.1016/j.cell.2012.05.044
8. Publishing the model
PAPER AVAILABLE INFORMATION
1) (textual) description of work and related
efforts (referencing other papers)
2) (textual and visual) description of
(biochemical) network
3) (printed) model parameters
4) (printed) mathematical equations
5) resulting plots
Fig.: http://doi.org/10.1073/pnas.88.16.7328
9. What can you do with this model?
STUDY THE PAPER, BELIEVE | RE-IMPLEMENT BASED ON THE PAPER
11. Publishing the model
PAPER AVAILABLE INFORMATION
1) (textual) description of work and related
efforts (referencing other papers)
2) (textual and visual) description of
(biochemical) network
3) (printed) model parameters
4) (printed) mathematical equations
5) resulting plots
Fig.: http://doi.org/10.1073/pnas.94.17.9147
12. Publishing the model code
SIMULATION MODEL AVAILABLE INFORMATION
1) Description of (biochemical) network in
computer-readable format (SBML)
2) Mathematical equations in computer-
readable format (MathML)
3) Model parameters inside model code
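To illustrate what "computer-readable" buys you, here is a minimal sketch that reads the network, the parameters and the MathML symbols out of an SBML document using only the Python standard library. The toy model (species A and B, parameter k1, reaction "conversion") is invented for illustration and is not taken from the talk; real projects would more likely use a dedicated SBML library.

```python
import xml.etree.ElementTree as ET

SBML_NS = "http://www.sbml.org/sbml/level2"        # namespace of SBML Level 2 Version 1
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical toy model: A -> B with rate k1 * A (illustration only).
TOY_SBML = f"""<sbml xmlns="{SBML_NS}" level="2" version="1">
  <model id="toy_model">
    <listOfCompartments>
      <compartment id="cell" size="1"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="A" compartment="cell" initialConcentration="10"/>
      <species id="B" compartment="cell" initialConcentration="0"/>
    </listOfSpecies>
    <listOfParameters>
      <parameter id="k1" value="0.5"/>
    </listOfParameters>
    <listOfReactions>
      <reaction id="conversion" reversible="false">
        <listOfReactants><speciesReference species="A"/></listOfReactants>
        <listOfProducts><speciesReference species="B"/></listOfProducts>
        <kineticLaw>
          <math xmlns="{MATHML_NS}">
            <apply><times/><ci>k1</ci><ci>A</ci></apply>
          </math>
        </kineticLaw>
      </reaction>
    </listOfReactions>
  </model>
</sbml>"""


def summarise(sbml_text):
    """Return species ids, parameter values and a per-reaction summary."""
    ns = {"sbml": SBML_NS, "m": MATHML_NS}
    model = ET.fromstring(sbml_text).find("sbml:model", ns)
    species = [s.get("id") for s in model.findall("sbml:listOfSpecies/sbml:species", ns)]
    parameters = {
        p.get("id"): float(p.get("value"))
        for p in model.findall("sbml:listOfParameters/sbml:parameter", ns)
    }
    reactions = {}
    for reaction in model.findall("sbml:listOfReactions/sbml:reaction", ns):
        reactants = [r.get("species") for r in
                     reaction.findall("sbml:listOfReactants/sbml:speciesReference", ns)]
        products = [p.get("species") for p in
                    reaction.findall("sbml:listOfProducts/sbml:speciesReference", ns)]
        # Symbols referenced by the MathML kinetic law, in document order.
        symbols = [ci.text.strip() for ci in reaction.findall(".//m:ci", ns)]
        reactions[reaction.get("id")] = (reactants, products, symbols)
    return species, parameters, reactions


species, parameters, reactions = summarise(TOY_SBML)
print(species, parameters, reactions)
```

Because network, equations and parameters are all addressable elements, a tool can check or reuse them without anyone retyping values from a printed paper.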
13. What can you do with this data?
CHECK THE MODEL (REPRODUCIBILITY)
RE-USE THE CODE IN ANOTHER SOFTWARE
(INTEROPERABILITY)
Fig. (left) JWS Online, http://jjj.mib.ac.uk/models. Fig. (right) courtesy M.Hucka (2016),
https://www.slideshare.net/thehuck/recent-software-and-services-to-support-the-sbml-community
15. Publishing the model & code
PAPER | SIMULATION MODEL
Fig.: https://doi.org/10.1038/msb4100171
16. Publishing the meta-data
on repository – model – and entity level
Harmonised meta-data for simulation models in computational biology: Neal et al. (2018), Briefings in Bioinformatics (https://doi.org/10.1093/bib/bby087)
17. Publishing the simulation setups
COMBINE ARCHIVE
File | Format | Description
manifest.xml | Omex | Skeleton, automatically generated by WebCAT
metadata.rdf | Omex | Skeleton, automatically generated by WebCAT
README.md | Markdown | Human-readable information for users stumbling upon the archive
model/
  BIOMD0000000144.xml | SBML L2V1 | origin: www.ebi.ac.uk/biomodels-main/download?mid=BIOMD0000000144
  calzone_2007.svg | SVG | origin: models.cellml.org/workspace/calzone_thieffry_tyson_novak_2007
  calzone_2007.ai | Illustrator | origin: models.cellml.org/workspace/calzone_thieffry_tyson_novak_2007
  calzone_2007.png | PNG | origin: models.cellml.org/workspace/calzone_thieffry_tyson_novak_2007
  calzone_thieffry_tyson_novak_2007.cellml | CellML 1.0 | origin: models.cellml.org/workspace/calzone_thieffry_tyson_novak_2007
  sbgn/Calzone2007.gml | GML | SBGN compliant figure generated using SBGN-ED
  sbgn/Calzone2007.graphml | GraphML | SBGN compliant figure generated using SBGN-ED
  sbgn/Calzone2007.pdf | PDF | SBGN compliant figure generated using SBGN-ED
  sbgn/Calzone2007.png | PNG | SBGN compliant figure generated using SBGN-ED
  sbgn/Calzone2007.sbgn | SBGN-ML | SBGN-ML encoded figure generated using SBGN-ED
experiment/
  Calzone2007-default-simulation.xml | SED-ML L1V1 | Simulation description generated using SED-ML Web Tools
  Calzone2007-simulation-figure-1B.xml | SED-ML L1V1 | Simulation description generated using SED-ML Web Tools based on Calzone2007-default-simulation.xml
documentation/
  Calzone2007.pdf | PDF | Scientific publication “Dynamical modeling of syncytial mitotic cycles in Drosophila embryos” obtained from msb.embopress.org/content/3/1/131
  Calzone2007-supplementary-material.pdf | PDF | Supplementary information for the publication obtained from msb.embopress.org/content/3/1/131
result/
  Fig1B-bottom-COPASI.svg | SVG | Image generated by executing Calzone2007-simulation-figure-1B.xml on BIOMD0000000144.xml in COPASI
  Fig1B-top-COPASI.svg | SVG | Image generated by executing Calzone2007-simulation-figure-1B.xml on BIOMD0000000144.xml in COPASI
  Fig1B-bottom-webtools.png | PNG | Image generated by executing Calzone2007-simulation-figure-1B.xml on BIOMD0000000144.xml in SED-ML Web Tools
  Fig1B-top-webtools.png | PNG | Image generated by executing Calzone2007-simulation-figure-1B.xml on BIOMD0000000144.xml in SED-ML Web Tools
AVAILABLE INFORMATION
1) Paper and additional information
2) Meta-data
3) Graphical representation of model (SBGN)
4) Alternative parametrisations (SED-ML)
5) Model versions
6) Simulation experiments (SED-ML)
Example archive available from: https://github.com/SemsProject/CombineArchiveShowCase/
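A COMBINE archive is a ZIP file whose manifest.xml lists every entry together with its format. The following sketch, using only the Python standard library, builds a tiny stand-in archive in memory and reads its manifest back. The file names echo the showcase archive above, but the file contents are placeholders and the choice of master file is an assumption made for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

OMEX_NS = "http://identifiers.org/combine.specifications/omex-manifest"
SPEC = "http://identifiers.org/combine.specifications"


def build_toy_archive():
    """Create a minimal in-memory archive (placeholder contents, not the real files)."""
    manifest = f"""<omexManifest xmlns="{OMEX_NS}">
  <content location="." format="{SPEC}/omex"/>
  <content location="./manifest.xml" format="{SPEC}/omex-manifest"/>
  <content location="./model/BIOMD0000000144.xml" format="{SPEC}/sbml"/>
  <content location="./experiment/Calzone2007-default-simulation.xml"
           format="{SPEC}/sed-ml" master="true"/>
</omexManifest>"""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("manifest.xml", manifest)
        archive.writestr("model/BIOMD0000000144.xml", "<sbml/>")
        archive.writestr("experiment/Calzone2007-default-simulation.xml", "<sedML/>")
    buffer.seek(0)
    return buffer


def list_contents(archive_file):
    """Return (location, short format name, is_master) for each manifest entry."""
    with zipfile.ZipFile(archive_file) as archive:
        root = ET.fromstring(archive.read("manifest.xml"))
    entries = []
    for content in root.findall(f"{{{OMEX_NS}}}content"):
        short_format = content.get("format").rsplit("/", 1)[-1]
        entries.append((content.get("location"), short_format,
                        content.get("master") == "true"))
    return entries


entries = list_contents(build_toy_archive())
for location, short_format, is_master in entries:
    print(location, short_format, "(master)" if is_master else "")
```

The same `list_contents` function should work on a downloaded archive file, since `zipfile.ZipFile` accepts a path as well as a file object.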
18. What can you do with an archive?
Download archive
Explore data and meta-data
Identify data set of interest
Run model online/offline
Modify, merge, extend, combine...
Save new versions and documentation in archive
Re-publish
19. What can you do with an archive?
Example: Download archive from Github and run it in JWS Online
20. What does the (near) future bring?
21. Linking models and data simplifies verification
of models, and experimental data sets.
Integrating Disease maps and Biomedical
data (e.g., https://pdmap.uni.lu/minerva/)
Linking models and experimental data sets
(e.g., JWS Online)
22. Connecting pathways, ontologies and datasets
leads to new means of data exploration.
Comprehensive knowledge of cancer signaling networks and linked data,
working with interactive Pathway Maps, https://acsn.curie.fr/ACSN2/ACSN2.html
23. Easy access to patient-specific liver disease
progression helps doctors choose a therapy.
Fig.: Koenig et al. (2016), ODLS, Halle (Saale), http://livermetabolism.com
24. The pillars of success
WHAT‘S THE SECRET?
25. The research field develops and adheres to
FAIR standards for modeling and simulation.
Data formats | Recommendations | Semantics / Ontologies
26. Data formats are interoperable and are
being developed collaboratively.
Editorial Boards
Specifications
Software tool support
http://co.mbine.org/standards
Standard development Meetings
Annual special issue with
list of latest specifications
and errata
27. The community builds, feeds & uses
open repositories for simulation studies.
28. The community actively develops open,
standard-compliant libraries & tools.
MODELING AND SIMULATION SOFTWARE REPOSITORIES & MANAGEMENT TOOLS
…
Full list available at: http://sbml.org/SBML_Software_Guide/
29. The (data) Science
DEVELOPMENT OF MODEL MANAGEMENT STRATEGIES
BY SEMS & FRIENDS (2011-2019)
30. Characteristics of the data
Heterogeneous
Big
Distributed
Complex
Highly connected
But
Good standards available to represent the
data
Agreed-upon semantic annotation schemes &
ontologies to enrich the data
Open data movement
Community spirit
31. Issues that SEMS investigated 2012-17
Handling the steadily increasing size & numbers of models and studies (database performance)
Increasing the quality of published models (semantic annotations, reproducibility of results)
Keeping track of model changes and relations (comprehensibility)
Identifying and handling similarities in model representations (reuse)
~300,000 models in BioModels Database, on average 5 versions per model.
XML, RDF, OWL
32. A graph-based approach keeps storage
and retrieval efficient.
[Figure: graph linking a SED-ML document (model reference, simulation, task, data generators, variables, output) to the SBML model document Tyson1991 - Cell Cycle 6 var (species C2, CP, pM; compartment Cell; Reaction3) and to its annotations (Uniprot:P04551, GO:0005623, Interpro:IPR006670, Pubmed:1831270, Kegg Pathway sce04111, EC-Code:3.1.3.16). Edge labels include isDescribedBy, isVersionOf, hasPart, is, asProduct, asReactant, isContainedIn, is_connected and is_mapped_to.]
Example: Tyson 1991 (BIOM5), Source: Waltemath & Henkel, Neo4j Life & Health Sciences Day - Berlin, 21st June, 2017,
adapted from Henkel et al. (2015) DATABASE (https://doi.org/10.1093/database/bau130)
[Figure panels: Models, Simulation, Annotation, Ontology (SBO terms connected by isA relations).]
33. The linking of data sets on graph-level
allows for complex queries.
2 experiments,
3 model versions,
changes,
meta-data
Fig.: Martin Peters, SEMS
Fig (right): Henkel et al. (2015) DATABASE, https://doi.org/10.1093/database/bau130
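To make the idea of graph-level linking concrete, here is a minimal sketch of a query that walks from an annotation to the models and simulation descriptions connected to it. It uses a plain in-memory Python graph as a stand-in, not the graph database used by the authors. The node names come from the Tyson 1991 example, while the relation names hasSpecies and modelReference and the exact wiring are assumptions made for illustration.

```python
from collections import defaultdict


class TinyGraph:
    """A toy labelled graph: nodes carry a label, edges carry a relation name."""

    def __init__(self):
        self.labels = {}
        self.out_edges = defaultdict(list)

    def add_node(self, node_id, label):
        self.labels[node_id] = label

    def add_edge(self, source, relation, target):
        self.out_edges[source].append((relation, target))

    def sources(self, relation, target):
        """All nodes that point at `target` via `relation`."""
        return [source
                for source, edges in self.out_edges.items()
                for rel, tgt in edges
                if rel == relation and tgt == target]


graph = TinyGraph()
graph.add_node("Tyson1991", "Model")
graph.add_node("C2", "Species")
graph.add_node("CP", "Species")
graph.add_node("Uniprot:P04551", "Annotation")
graph.add_node("Pubmed:1831270", "Publication")
graph.add_node("Tyson_1991_sedml", "Simulation")

graph.add_edge("Tyson1991", "hasSpecies", "C2")           # hypothetical relation name
graph.add_edge("Tyson1991", "hasSpecies", "CP")
graph.add_edge("C2", "isVersionOf", "Uniprot:P04551")     # illustrative wiring
graph.add_edge("Tyson1991", "isDescribedBy", "Pubmed:1831270")
graph.add_edge("Tyson_1991_sedml", "isDescribedBy", "Pubmed:1831270")
graph.add_edge("Tyson_1991_sedml", "modelReference", "Tyson1991")  # hypothetical


def simulations_for_annotation(graph, annotation):
    """Annotation -> annotated species -> containing model -> simulations of it."""
    results = []
    for species in graph.sources("isVersionOf", annotation):
        for model in graph.sources("hasSpecies", species):
            for simulation in graph.sources("modelReference", model):
                results.append((model, species, simulation))
    return sorted(results)


hits = simulations_for_annotation(graph, "Uniprot:P04551")
print(hits)
```

The query crosses three kinds of data (annotation, model, simulation) in a single traversal, which is the kind of question that is awkward to answer when each data set lives in its own file.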
34. Lucene-based indices incorporate all relevant
information for later search & comparison.
Indexed entities: Model, Publication, Annotation, Person, Simulation
Publication fields: Id, Name, Title, Journal, Abstract, Authors, …
Model fields: Id, Name, Component, Variable, Species, Reaction, Compartment
Person fields: First name, Last name, Organization, Email
Annotation fields: URI, Description
[Figure: the graph of the Tyson1991 - Cell Cycle 6 var model, its SED-ML document, annotations and SBO terms, as on slide 32.]
Fig.: Henkel et al. (2015) DATABASE
35. A weighted ranked-retrieval method
returns only the most relevant models.
[Figure: graph of the Tyson1991 - Cell Cycle 6 var model with its annotations, as on slide 32; index labels: Annotation, Person.]
Example queries:
Show me models by Tyson describing the cell cycle and having cdc2.
  1. (0.859) Tyson1991 - Cell Cycle 6 var
  2. (0.854) Tyson2001_Cell_Cycle_Regulation
  3. (0.477) Chen2004 - Cell Cycle Regulation
Which are the most frequently used GO annotations in my model set?
Which models contain reactions with 'ATP' as reactant and 'ADP' as product?
Find good candidates for features describing my model set.
Which models are annotated with 'Ubiquitin'?
Give me all the files I need to run this simulation study.
Fig.: Henkel et al. (2015) DATABASE
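The idea of weighting index fields can be sketched in a few lines. The example below is a deliberately simplified scoring scheme, not the Lucene-based scoring of the actual system: each field contributes its weight times the fraction of query terms it contains. The field weights, the threshold, the field contents and the ToyGlycolysis record are all hypothetical, so the resulting scores are not comparable to those on the slide.

```python
import re

# Hypothetical field weights: an author match counts more than a species match.
FIELD_WEIGHTS = {"name": 2.0, "authors": 3.0, "species": 1.0}

# Toy records; the names follow the slide, the field contents are illustrative.
RECORDS = [
    {"name": "Tyson1991 - Cell Cycle 6 var", "authors": "Tyson", "species": "cdc2 cyclin"},
    {"name": "Tyson2001_Cell_Cycle_Regulation", "authors": "Tyson Novak", "species": "cyclin cdh1"},
    {"name": "Chen2004 - Cell Cycle Regulation", "authors": "Chen", "species": "cdc28 clb2"},
    {"name": "ToyGlycolysis", "authors": "Nobody", "species": "ATP ADP"},
]


def tokenize(text):
    """Lower-case the text and split it into alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score(query, record):
    """Weighted fraction of query terms found per field, normalised to [0, 1]."""
    terms = tokenize(query)
    weighted = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        matched = terms & tokenize(record.get(field, ""))
        weighted += weight * len(matched) / len(terms)
    return weighted / sum(FIELD_WEIGHTS.values())


def rank(query, records, threshold=0.1):
    """Return (score, name) pairs above the threshold, best match first."""
    scored = [(round(score(query, record), 3), record["name"]) for record in records]
    return sorted((pair for pair in scored if pair[0] >= threshold), reverse=True)


ranking = rank("Tyson cell cycle cdc2", RECORDS)
for value, name in ranking:
    print(value, name)
```

Records that match in no field fall below the threshold and are not returned, which is the "only the most relevant models" behaviour described above.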
36. A method to detect and track differences
in model versions ensures transparency.
How did my model change between version x and x+1?
"Sophisticated" XYDIFF & change ontology
How often did this model change, when and why?
Give me all versions of this model.
Figs.: Waltemath et al. (2013) Oxford Bioinformatics (https://doi.org/10.1093/bioinformatics/btt018);
Implementation: M. Scharm, https://github.com/SemsProject/BiVeS
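As a rough illustration of difference detection between two model versions, the sketch below matches XML elements by their id attribute and reports insertions, deletions and updates. This is a much simpler technique than the BiVeS algorithm referenced above, and the two toy model versions are invented for the example.

```python
import xml.etree.ElementTree as ET

# Two hypothetical versions of a toy model (no namespaces, for brevity).
VERSION_1 = """<model id="toy">
  <species id="A" initialConcentration="10"/>
  <species id="B" initialConcentration="0"/>
  <parameter id="k1" value="0.5"/>
</model>"""

VERSION_2 = """<model id="toy">
  <species id="A" initialConcentration="10"/>
  <species id="C" initialConcentration="0"/>
  <parameter id="k1" value="0.7"/>
</model>"""


def index_by_id(xml_text):
    """Map every element that has an id to its tag and attributes."""
    return {
        element.get("id"): (element.tag, dict(element.attrib))
        for element in ET.fromstring(xml_text).iter()
        if element.get("id") is not None
    }


def diff_versions(old_xml, new_xml):
    """Classify ids as deleted, inserted or updated between two versions."""
    old, new = index_by_id(old_xml), index_by_id(new_xml)
    return {
        "deleted": sorted(old.keys() - new.keys()),
        "inserted": sorted(new.keys() - old.keys()),
        "updated": sorted(key for key in old.keys() & new.keys() if old[key] != new[key]),
    }


changes = diff_versions(VERSION_1, VERSION_2)
print(changes)
```

Matching by id is an assumption that breaks down when identifiers are renamed between versions, which is one reason a dedicated, structure-aware diff is needed in practice.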
37. Identification of frequent patterns in network
graphs helps determine structural similarity.
Fig.: Size and number of reactions and participating species (left), and identified frequent patterns (right).
Implementation: Fabienne Lambusch. Figure: Lambusch et al. (2018) DATABASE (https://doi.org/10.1093/database/bay051)
38. Identification of frequent patterns in network
graphs helps determine structural similarity.
Fig.: Tyson BIOM5 (left), and identified patterns based on the (right).
Implementation: Fabienne Lambusch. Figure: Lambusch et al. (2018) DATABASE (https://doi.org/10.1093/database/bay051)
How similar are these two models
with respect to structure?
Give me all models with
this particular sub-structure.
40. Implementing model version control in the FAIRDOMHub
Internal use of BIVES difference detection for SBML models
41. Change statistics for model versions
Internal use of BIVES difference detection for SBML and CellML models, Change ontology COMODI, SBGN Visualisation tool DiViL;
https://most.bio.informatik.uni-rostock.de, Scharm et al (2018), BMC SysBio (https://doi.org/10.1186/s12918-018-0553-2)
BIOM7
44. Ranked retrieval of reproducible simulation studies
Internal use of the COMBINEArchive-library, MORRE, MASYMOS, http://cellml.org/models
Internal use of the COMBINEArchive library, SEDMLlibrary, https://jjj.biochem.sun.ac.za/
45. If your work is
standardised,
documented,
and open
… we can help
you manage it,
so it can be
retrieved and
reused by others.
47. Standardisation and integration of data
improved model accessibility and reusability.
COPPIC FOREST (DECORTICATED)
Matlab logo: By Jarekt (Own work) [Public domain], via Wikimedia Commons; Python logo: By www.python.org [GPL, via Wikimedia Commons];
Java logo: By Cguevara94 (Own work) [CC BY-SA 4.0], via Wikimedia Commons, modified.
PATH (ACCESSIBLE)
48. Biological data is well integrated with simulation
models, but biomedical/clinical data lags behind.
49. Thank you for
your attention
Dagmar Waltemath
University Medicine Greifswald
@dagmarwaltemath
0000-0002-5886-5563
Contact me to adopt a SEMS –
work in Greifswald or clone a github repository!