Data and model management in Systems Biology
Dagmar Waltemath
University of Rostock, Germany
Kinetics on the move – Happy 10th anniversary to SABIO-RK!
Heidelberg, 31 May 2016
http://www.slideshare.net/dagwa/data-and-model-management-in-systems-biology
Junior research group: Management of simulation studies in systems biology
Tool development: SBGN-ED for the graphical representation of networks
Infrastructure: Data management for systems biology in Germany
Standards and tools for model management
www.sems.uni-rostock.de
© 2009 UNIVERSITÄT ROSTOCK 3
NBI-SysBio: Data management for systems biology in Germany
● Sustainable infrastructure for data management
● Access to documented and reproducible results
● Systems Biology Standards
● Tool Development
● Education
www.denbi.de (training – services – jobs)
Photo: NY - http://nyphotographic.com (CC BY-SA 3.0)
Photo: janneke staaks on flickr
Fig. courtesy doi:10.1371/journal.pbio.1001779
Data management is …
● Data management describes procedures and actions that help to store, preserve, organize and control the data generated during a (research) project.
● Aspects of data management include:
– Data Ownership;
– Metadata Compilation;
– Data Lifecycle Control;
– Data Quality;
– Data Access and Dissemination
Photo: NY - http://nyphotographic.com (CC BY-SA 3.0)
Metadata
● Data about data
● Improved understanding of encoded data items
● Descriptive details
● Discovery and search for existing data, online browsing of data
● Standardized and structured information
– Purpose, origin, time references, geographic location, creator, access conditions, and terms of use of your data collection
● Often encoded in ontologies
https://www.libraries.psu.edu/psul/pubcur/what_is_dm.html#data-management
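As an illustration of "standardized and structured information", the descriptive fields listed on the metadata slide could be captured in a small typed record and serialised for exchange. This is only a sketch: the class name, the field names, and every value are invented placeholders, not a schema proposed in the talk.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class DatasetMetadata:
    """Descriptive details for one data collection (field names are illustrative)."""
    purpose: str
    origin: str
    time_reference: str
    geographic_location: str
    creator: str
    access_conditions: str
    terms_of_use: str


# All values below are invented placeholders, not taken from the talk.
record = DatasetMetadata(
    purpose="Time-course measurements for model calibration",
    origin="Hypothetical wet-lab experiment",
    time_reference="2016-05-31",
    geographic_location="Rostock, Germany",
    creator="Jane Doe",
    access_conditions="open",
    terms_of_use="CC BY 4.0",
)

# Serialise to JSON so that people and tools can read the same record.
metadata_json = json.dumps(asdict(record), indent=2, sort_keys=True)
print(metadata_json)
```

In practice the free-text values would be replaced by ontology terms where suitable ones exist, which is what makes such records searchable across collections.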
Ontologies
● Well-structured, controlled vocabularies
● Capture and convey commonly agreed definitions and concepts in a domain
● Communication across people and software tools
● Enable reuse of domain knowledge
● Make implicit domain knowledge explicit and queryable
● Bio-ontologies
– Gene Ontology, ChEBI, UniProt
– Systems Biology Ontology (concepts and terminology for modeling)
Example: Definition of "cell growth" in the Gene Ontology
id: GO:0016049
name: cell growth
namespace: biological_process
def: "The process in which a cell irreversibly increases in size over time by accretion and biosynthetic production of matter similar to that already present."
synonym: "cell expansion" RELATED []
synonym: "cellular growth" EXACT []
synonym: "growth of cell" EXACT []
is_a: GO:0009987 ! cellular process
is_a: GO:0040007 ! growth
relationship: part_of GO:0008361 ! regulation of cell size
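The "explicit and queryable" point can be made concrete with the stanza above. The following is a minimal sketch that reads the tag-value lines into a dictionary and extracts the parent terms. The helper `parse_stanza` is hypothetical and handles only this simple line layout (the `def:` line is left out for brevity); a real project would more likely rely on a dedicated OBO parser.

```python
from collections import defaultdict

# Tag-value lines copied from the GO:0016049 example (def: omitted for brevity).
STANZA = '''\
id: GO:0016049
name: cell growth
namespace: biological_process
synonym: "cell expansion" RELATED []
synonym: "cellular growth" EXACT []
synonym: "growth of cell" EXACT []
is_a: GO:0009987 ! cellular process
is_a: GO:0040007 ! growth
relationship: part_of GO:0008361 ! regulation of cell size
'''


def parse_stanza(text):
    """Collect 'tag: value' lines into a dict of lists (tags may repeat)."""
    term = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        tag, _, value = line.partition(": ")
        # Drop the trailing '! human-readable name' comment, if any.
        value = value.split(" ! ")[0].strip()
        term[tag].append(value)
    return dict(term)


term = parse_stanza(STANZA)
parents = term["is_a"]
print(term["name"][0], "is_a", parents)
```

Keeping every tag as a list is a deliberate choice here, because tags such as `synonym` and `is_a` legitimately occur more than once per term.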
Advantages of careful & planned data management
● Increased confidence and trust in the data
● Better understanding of how to use the data, and of the data itself
● Better data quality
● Coherent data when standards are used
● Improved business processes (saving time, guaranteeing high quality)
● Improved access to data and improved reproducibility
● Better exploitation of data through easier data exchange and integration
Advantages of standardised data
● Reusable
● Exchangeable
● Interoperable
● Long-term available (in open repositories)
● Curatable
● Shareable
Photo: janneke staaks on flickr
Research data in the modeling life cycle
Models: equations, parameters, data tables
Ideas: text, drawings
Experimental results: text, data tables
Publications: text, figures
Analyses: configuration files, data tables
Fig. courtesy Martin Scharm (adapted)
Research data in the modeling life cycle
● Mathematical formulae
● Networks, diagrams
● Image data
● Publications
● Experiment descriptions
● Experimental results (both lab and simulation)
● Definitions of things (e.g., gene functions, chemical structures...)
Figures top to bottom: (1) By Noah A. Rosenberg et al. Slightly modified by User:Wobble. - Public Library of Science, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=2839383; (2) By http://rsb.info.nih.gov/ij/images/, Public Domain, https://commons.wikimedia.org/w/index.php?curid=655748; (3) BIOM005, generated using CellDesigner 4; (4,5) PMID:18669651
Research data in the modeling life cycle
● Heterogeneous
● Highly connected
● Context-dependent
● Distributed
● Big
The model
● Mathematical equations
● Biological entities
● Kinetic information
● Encoding: SBML & semantic annotations
<bqmodel:isDescribedBy>
  <rdf:Bag>
    <rdf:li rdf:resource="http://identifiers.org/pubmed/18669651"/>
  </rdf:Bag>
</bqmodel:isDescribedBy>
<parameter id="parameter_49" name="L" metaid="metaid_0000078" value="20670"/>
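To show what a tool can do with such an annotation, here is a minimal sketch that reads the two fragments above with Python's standard library. The `wrapper` element and the two `xmlns` declarations are assumptions added so that the fragments parse on their own; in a full SBML file the namespaces are declared on enclosing elements.

```python
import xml.etree.ElementTree as ET

# The slide shows fragments only; the wrapper element and the xmlns
# declarations are added here so the snippet parses on its own.
FRAGMENT = """
<wrapper xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:bqmodel="http://biomodels.net/model-qualifiers/">
  <bqmodel:isDescribedBy>
    <rdf:Bag>
      <rdf:li rdf:resource="http://identifiers.org/pubmed/18669651"/>
    </rdf:Bag>
  </bqmodel:isDescribedBy>
  <parameter id="parameter_49" name="L" metaid="metaid_0000078" value="20670"/>
</wrapper>
"""

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
BQMODEL = "http://biomodels.net/model-qualifiers/"

root = ET.fromstring(FRAGMENT)

# Collect the resource URI of every rdf:li nested in a bqmodel:isDescribedBy.
resources = [
    li.get(f"{{{RDF}}}resource")
    for described_by in root.iter(f"{{{BQMODEL}}}isDescribedBy")
    for li in described_by.iter(f"{{{RDF}}}li")
]

# The parameter element carries no namespace inside this wrapper.
param = root.find("parameter")
value = float(param.get("value"))
print(resources, param.get("name"), value)
```

The resolvable identifiers.org URI is what links the encoded model back to its describing publication.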
SBML – Standard for model encoding
● Systems Biology Markup Language
● Community-driven de facto standard
● Free & open source: www.sbml.org
● Supported by many organizations and tools
● Encodes computational models of biological processes (compartments – species – reactions – parameters)
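The four building blocks named in the last bullet can be seen in a toy document. The sketch below assumes SBML Level 3 Version 1 and uses an invented model (one compartment, two species, one reaction without a kinetic law). It has not been checked with an SBML validator and only illustrates the overall layout.

```python
import xml.etree.ElementTree as ET

NS = "http://www.sbml.org/sbml/level3/version1/core"

# Invented toy model: conversion of A into B inside one compartment.
SBML = f"""
<sbml xmlns="{NS}" level="3" version="1">
  <model id="toy">
    <listOfCompartments>
      <compartment id="cell" size="1" constant="true"/>
    </listOfCompartments>
    <listOfSpecies>
      <species id="A" compartment="cell" initialAmount="10"
               hasOnlySubstanceUnits="true" boundaryCondition="false" constant="false"/>
      <species id="B" compartment="cell" initialAmount="0"
               hasOnlySubstanceUnits="true" boundaryCondition="false" constant="false"/>
    </listOfSpecies>
    <listOfParameters>
      <parameter id="k" value="0.1" constant="true"/>
    </listOfParameters>
    <listOfReactions>
      <reaction id="r1" reversible="false" fast="false">
        <listOfReactants>
          <speciesReference species="A" stoichiometry="1" constant="true"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="B" stoichiometry="1" constant="true"/>
        </listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>
"""

model = ET.fromstring(SBML).find(f"{{{NS}}}model")


def ids(list_name, item_name):
    """Return the id attributes of all items in one listOf... container."""
    path = f"{{{NS}}}{list_name}/{{{NS}}}{item_name}"
    return [element.get("id") for element in model.findall(path)]


components = {
    "compartments": ids("listOfCompartments", "compartment"),
    "species": ids("listOfSpecies", "species"),
    "parameters": ids("listOfParameters", "parameter"),
    "reactions": ids("listOfReactions", "reaction"),
}
print(components)
```

For real models a dedicated SBML library is the safer route, since it also checks consistency rules that plain XML parsing ignores.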
SBGN – Standard for visual representation
● Systems Biology Graphical Notation
● Standardised glyphs for biological entities
● Three languages
– SBGN-AF (Activity Flow) | SBGN-ER (Entity Relationship) | SBGN-PD (Process Description)
● Free & open source: www.sbgn.org
● Tool support
● Machine-readable format: SBGN-ML
Fig.: http://sbgn.org
SBGN – Standard for visual representation
Fig.: SBGN map for BIOM183, CellDesigner
Fig.: SBGN map for BIOM005, CellDesigner
The analysis
● Reproduce behaviour of the model
● Publish and share virtual experiments
– Simulation setup / conditions
– Pre- and post-processing
– Observations
● Encoding: [logo] & [logo] & result data in Excel, CSV files
<listOfSimulations>
  <uniformTimeCourse id="sim1" initialTime="0" outputStartTime="0" outputEndTime="100" numberOfPoints="100">
    <algorithm kisaoID="KISAO:0000019"/>
  </uniformTimeCourse>
</listOfSimulations>
Fig. M. Stefan et al, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2596252/
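The simulation setup above can be turned into a concrete output grid. This is a sketch under one stated assumption: that `numberOfPoints` counts the intervals between `outputStartTime` and `outputEndTime`, so the grid has one more entry than that number. The fragment is parsed without a namespace, exactly as shown on the slide. KISAO:0000019 is the KiSAO term for the CVODE solver.

```python
import xml.etree.ElementTree as ET

SEDML_FRAGMENT = """
<listOfSimulations>
  <uniformTimeCourse id="sim1" initialTime="0" outputStartTime="0"
                     outputEndTime="100" numberOfPoints="100">
    <algorithm kisaoID="KISAO:0000019"/>
  </uniformTimeCourse>
</listOfSimulations>
"""

sim = ET.fromstring(SEDML_FRAGMENT).find("uniformTimeCourse")
start = float(sim.get("outputStartTime"))
end = float(sim.get("outputEndTime"))
n = int(sim.get("numberOfPoints"))

# Assumption: numberOfPoints counts the intervals, so the grid has n + 1
# entries including the start point. Check the SED-ML version you target.
step = (end - start) / n
time_points = [start + i * step for i in range(n + 1)]

algorithm = sim.find("algorithm").get("kisaoID")
print(sim.get("id"), algorithm, len(time_points), time_points[:3], time_points[-1])
```

Recording the algorithm as an ontology term instead of a tool-specific solver name is what lets a different simulator pick an equivalent method.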
SED-ML – Standard for model analysis
● Links to models used in an analysis
● Pre- and post-processing of models
● Type of simulation
● Definition of output
● Free and open source: www.sed-ml.org
● Tool support
→ Showcase your tool support online ←
SED-ML – Standard for model analysis
Fig. M. Stefan et al, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2596252/
Simulation of BIOM183 in SED-ML Web Tools without simulation description
COMBINE
Coordinate annual meetings
- Next HARMONY: Auckland, June 7-11, 2016
- Next COMBINE: Newcastle, Sep 19-23, 2016
Coordinate standards development
- Common procedures
- Interoperable software tools
- Discussion forums, mailing lists...
Represent community
- Funders
- Other communities
Provide standards resources
- Single entry point
- Resolvable URI
- Web infrastructure
Figure labels: Simulation, Guidelines, Ontologies
Standard-compliant software tools for modeling
The path2models project integrated data from different databases into more than 140,000 SBML models.
Fig.: Büchel et al., BMC Sys Biol (2013), http://www.ebi.ac.uk/biomodels-main/path2models
Standard-compliant software tools for modeling
The Systems Biology Workbench is a software framework to help heterogeneous application components communicate with each other.
Modeling
Editing
Simulating
Analysing
http://sbw.sourceforge.net
The decision whether and how to share data often rests with researchers.
Roche DG, Lanfear R, Binning SA, Haff TM, Schwanz LE, et al. (2014) Troubleshooting Public Data Archiving: Suggestions to Increase Participation. PLoS Biol 12(1): e1001779. doi:10.1371/journal.pbio.1001779
COMBINE Archive
● Bundling files
● Shipping results
● Exchanging data
● Keeping provenance
● Encoding: ZIP file with a manifest (metadata)
● Generate, modify & share through WebCAT
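Because the archive is a ZIP container with a manifest, a rough version can be assembled with standard tooling. The sketch below builds one in memory. The file names and payloads are placeholders, and the format URIs follow the identifiers.org pattern as I understand the OMEX specification, so they should be verified before use.

```python
import io
import zipfile

# Format URIs follow the identifiers.org pattern used by COMBINE archives;
# verify them against the OMEX specification before relying on them.
MANIFEST = """<?xml version="1.0" encoding="UTF-8"?>
<omexManifest xmlns="http://identifiers.org/combine.specifications/omex-manifest">
  <content location="." format="http://identifiers.org/combine.specifications/omex"/>
  <content location="./manifest.xml" format="http://identifiers.org/combine.specifications/omex-manifest"/>
  <content location="./model.xml" format="http://identifiers.org/combine.specifications/sbml"/>
  <content location="./simulation.sedml" format="http://identifiers.org/combine.specifications/sed-ml"/>
</omexManifest>
"""

# Placeholder payloads; a real archive would carry complete documents.
FILES = {
    "manifest.xml": MANIFEST,
    "model.xml": "<sbml/>",
    "simulation.sedml": "<sedML/>",
}

# Write the archive into memory instead of onto disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    for name, payload in FILES.items():
        archive.writestr(name, payload)

# Re-open the bytes to confirm what a recipient would see.
with zipfile.ZipFile(io.BytesIO(buffer.getvalue())) as archive:
    names = archive.namelist()
print(names)
```

Writing `buffer.getvalue()` to a file with the `.omex` extension would give a file that archive-aware tools could attempt to open, subject to the caveats above.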
COMBINE Archive
Original publication
SBGN map
SBML model versions
SED-ML files
Open in WebCAT
Open in SEEK
Model curation & publication
Model curation, simulation & publication
Introduction to SEEK & FAIRDOM by Olga Krebs.
Thank you for your attention.
http://www.denbi.de/
@SemsProject
http://co.mbine.org

Más contenido relacionado

La actualidad más candente

bio data
bio databio data
bio data
007dcp
 
FAIR Data and Model Management for Systems Biology (and SOPs too!)
FAIR Data and Model Management for Systems Biology(and SOPs too!)FAIR Data and Model Management for Systems Biology(and SOPs too!)
FAIR Data and Model Management for Systems Biology (and SOPs too!)
Carole Goble
 
An experimental comparison of globally-optimal data de-identification algorithms
An experimental comparison of globally-optimal data de-identification algorithmsAn experimental comparison of globally-optimal data de-identification algorithms
An experimental comparison of globally-optimal data de-identification algorithms
arx-deidentifier
 
Engineering data privacy - The ARX data anonymization tool
Engineering data privacy - The ARX data anonymization toolEngineering data privacy - The ARX data anonymization tool
Engineering data privacy - The ARX data anonymization tool
arx-deidentifier
 
Let’s go on a FAIR safari!
Let’s go on a FAIR safari!Let’s go on a FAIR safari!
Let’s go on a FAIR safari!
Carole Goble
 

La actualidad más candente (20)

Reproducible and citable data and models: an introduction.
Reproducible and citable data and models: an introduction.Reproducible and citable data and models: an introduction.
Reproducible and citable data and models: an introduction.
 
Model repositories and standard formats for model reusability
Model repositories and standard formats for model reusabilityModel repositories and standard formats for model reusability
Model repositories and standard formats for model reusability
 
FuGE Update
FuGE UpdateFuGE Update
FuGE Update
 
A global integrative ecosystem for digital pathology: how can we get there?
A global integrative ecosystem for digital pathology: how can we get there?A global integrative ecosystem for digital pathology: how can we get there?
A global integrative ecosystem for digital pathology: how can we get there?
 
FAIR data and model management for systems biology.
FAIR data and model management for systems biology.FAIR data and model management for systems biology.
FAIR data and model management for systems biology.
 
Making your data good enough for sharing.
Making your data good enough for sharing.Making your data good enough for sharing.
Making your data good enough for sharing.
 
Investigating plant systems using data integration and network analysis
Investigating plant systems using data integration and network analysisInvestigating plant systems using data integration and network analysis
Investigating plant systems using data integration and network analysis
 
bio data
bio databio data
bio data
 
Citing data in research articles: principles, implementation, challenges - an...
Citing data in research articles: principles, implementation, challenges - an...Citing data in research articles: principles, implementation, challenges - an...
Citing data in research articles: principles, implementation, challenges - an...
 
FAIR Data and Model Management for Systems Biology (and SOPs too!)
FAIR Data and Model Management for Systems Biology(and SOPs too!)FAIR Data and Model Management for Systems Biology(and SOPs too!)
FAIR Data and Model Management for Systems Biology (and SOPs too!)
 
Report of the second FAIRDOM foundry
Report of the second FAIRDOM foundryReport of the second FAIRDOM foundry
Report of the second FAIRDOM foundry
 
An experimental comparison of globally-optimal data de-identification algorithms
An experimental comparison of globally-optimal data de-identification algorithmsAn experimental comparison of globally-optimal data de-identification algorithms
An experimental comparison of globally-optimal data de-identification algorithms
 
Introduction to FAIRDOM
Introduction to FAIRDOMIntroduction to FAIRDOM
Introduction to FAIRDOM
 
Opportunities in chemical structure standardization
Opportunities in chemical structure standardizationOpportunities in chemical structure standardization
Opportunities in chemical structure standardization
 
Entrez databases
Entrez databasesEntrez databases
Entrez databases
 
Engineering data privacy - The ARX data anonymization tool
Engineering data privacy - The ARX data anonymization toolEngineering data privacy - The ARX data anonymization tool
Engineering data privacy - The ARX data anonymization tool
 
An overview of methods for data anonymization
An overview of methods for data anonymizationAn overview of methods for data anonymization
An overview of methods for data anonymization
 
Let’s go on a FAIR safari!
Let’s go on a FAIR safari!Let’s go on a FAIR safari!
Let’s go on a FAIR safari!
 
ACS 248th Paper 136 JSmol/JSpecView Eureka Integration
ACS 248th Paper 136 JSmol/JSpecView Eureka IntegrationACS 248th Paper 136 JSmol/JSpecView Eureka Integration
ACS 248th Paper 136 JSmol/JSpecView Eureka Integration
 
2011 03-provenance-workshop-edingurgh
2011 03-provenance-workshop-edingurgh2011 03-provenance-workshop-edingurgh
2011 03-provenance-workshop-edingurgh
 

Destacado

Evalucion excrecion 7 grado. ITInFra
Evalucion excrecion 7 grado. ITInFraEvalucion excrecion 7 grado. ITInFra
Evalucion excrecion 7 grado. ITInFra
Wilson Montana
 

Destacado (14)

Drijfveer — Floriade 2022 | Almere Amsterdam
Drijfveer — Floriade 2022 | Almere AmsterdamDrijfveer — Floriade 2022 | Almere Amsterdam
Drijfveer — Floriade 2022 | Almere Amsterdam
 
SPICE MODEL of PV-MA2100KK , PSpice Model in SPICE PARK
SPICE MODEL of PV-MA2100KK , PSpice Model in SPICE PARKSPICE MODEL of PV-MA2100KK , PSpice Model in SPICE PARK
SPICE MODEL of PV-MA2100KK , PSpice Model in SPICE PARK
 
Bota feliz-navidad
Bota feliz-navidadBota feliz-navidad
Bota feliz-navidad
 
Building SADI Services Tutorial - SIB Workshop, Geneva, December 2015
Building SADI Services Tutorial - SIB Workshop, Geneva, December 2015Building SADI Services Tutorial - SIB Workshop, Geneva, December 2015
Building SADI Services Tutorial - SIB Workshop, Geneva, December 2015
 
Evalucion excrecion 7 grado. ITInFra
Evalucion excrecion 7 grado. ITInFraEvalucion excrecion 7 grado. ITInFra
Evalucion excrecion 7 grado. ITInFra
 
4 periodo 2015 8° ciencias naturales grado
4 periodo 2015 8° ciencias naturales grado4 periodo 2015 8° ciencias naturales grado
4 periodo 2015 8° ciencias naturales grado
 
Model management for systems biology projects
Model management for systems biology projectsModel management for systems biology projects
Model management for systems biology projects
 
Espaço Programação e Eletrónica - Sessão 2
Espaço Programação e Eletrónica - Sessão 2Espaço Programação e Eletrónica - Sessão 2
Espaço Programação e Eletrónica - Sessão 2
 
Bumi Semakin Panas
Bumi Semakin PanasBumi Semakin Panas
Bumi Semakin Panas
 
EV3#4: Exercicios com o sensor de toque
EV3#4: Exercicios com o sensor de toqueEV3#4: Exercicios com o sensor de toque
EV3#4: Exercicios com o sensor de toque
 
Systems biology & Approaches of genomics and proteomics
 Systems biology & Approaches of genomics and proteomics Systems biology & Approaches of genomics and proteomics
Systems biology & Approaches of genomics and proteomics
 
Espaço Programação e Eletrónica - Sessão5
Espaço Programação e Eletrónica - Sessão5Espaço Programação e Eletrónica - Sessão5
Espaço Programação e Eletrónica - Sessão5
 
System biology and its tools
System biology and its toolsSystem biology and its tools
System biology and its tools
 
Pemanfaatan Lingkungan Hidup Geografi
Pemanfaatan Lingkungan Hidup GeografiPemanfaatan Lingkungan Hidup Geografi
Pemanfaatan Lingkungan Hidup Geografi
 

Similar a Data and model management in Systems Biology

Kuchinsky_Cytoscape_BOSC2009
Kuchinsky_Cytoscape_BOSC2009Kuchinsky_Cytoscape_BOSC2009
Kuchinsky_Cytoscape_BOSC2009
bosc
 

Similar a Data and model management in Systems Biology (20)

Reproducibility of model-based results: standards, infrastructure, and recogn...
Reproducibility of model-based results: standards, infrastructure, and recogn...Reproducibility of model-based results: standards, infrastructure, and recogn...
Reproducibility of model-based results: standards, infrastructure, and recogn...
 
MADICES Mungall 2022.pptx
MADICES Mungall 2022.pptxMADICES Mungall 2022.pptx
MADICES Mungall 2022.pptx
 
A Global Commons for Scientific Data: Molecules and Wikidata
A Global Commons for Scientific Data: Molecules and WikidataA Global Commons for Scientific Data: Molecules and Wikidata
A Global Commons for Scientific Data: Molecules and Wikidata
 
Software Sustainability Institute
Software Sustainability InstituteSoftware Sustainability Institute
Software Sustainability Institute
 
FAIR data management in biomedicine
FAIR data management  in biomedicineFAIR data management  in biomedicine
FAIR data management in biomedicine
 
M2CAT: Extracting reproducible simulation studies from model repositories usi...
M2CAT: Extracting reproducible simulation studies from model repositories usi...M2CAT: Extracting reproducible simulation studies from model repositories usi...
M2CAT: Extracting reproducible simulation studies from model repositories usi...
 
M2CAT: Extracting reproducible simulation studies from model repositories usi...
M2CAT: Extracting reproducible simulation studies from model repositories usi...M2CAT: Extracting reproducible simulation studies from model repositories usi...
M2CAT: Extracting reproducible simulation studies from model repositories usi...
 
Pine education-platform
Pine education-platformPine education-platform
Pine education-platform
 
Kuchinsky_Cytoscape_BOSC2009
Kuchinsky_Cytoscape_BOSC2009Kuchinsky_Cytoscape_BOSC2009
Kuchinsky_Cytoscape_BOSC2009
 
Model Management in Systems Biology: Challenges – Approaches – Solutions
Model Management in Systems Biology: Challenges – Approaches – SolutionsModel Management in Systems Biology: Challenges – Approaches – Solutions
Model Management in Systems Biology: Challenges – Approaches – Solutions
 
FAIR data and model management for systems biology (and SOPs too!)
FAIR data and model management for systems biology (and SOPs too!)FAIR data and model management for systems biology (and SOPs too!)
FAIR data and model management for systems biology (and SOPs too!)
 
A consistent and efficient graphical User Interface Design and Querying Organ...
A consistent and efficient graphical User Interface Design and Querying Organ...A consistent and efficient graphical User Interface Design and Querying Organ...
A consistent and efficient graphical User Interface Design and Querying Organ...
 
INSTRUCT - Integrated Structural Biology Infrastructure
INSTRUCT - Integrated Structural Biology InfrastructureINSTRUCT - Integrated Structural Biology Infrastructure
INSTRUCT - Integrated Structural Biology Infrastructure
 
Saint: A Lightweight Model Annotation and Data Integration Tool
Saint: A Lightweight Model Annotation and Data Integration ToolSaint: A Lightweight Model Annotation and Data Integration Tool
Saint: A Lightweight Model Annotation and Data Integration Tool
 
Management of simulation studies in computational biology
Management of simulation studies in computational biologyManagement of simulation studies in computational biology
Management of simulation studies in computational biology
 
Omics Logic - Bioinformatics 2.0
Omics Logic - Bioinformatics 2.0Omics Logic - Bioinformatics 2.0
Omics Logic - Bioinformatics 2.0
 
HKU Data Curation MLIM7350 Class 8
HKU Data Curation MLIM7350 Class 8HKU Data Curation MLIM7350 Class 8
HKU Data Curation MLIM7350 Class 8
 
OpenTox Europe 2013
OpenTox Europe 2013OpenTox Europe 2013
OpenTox Europe 2013
 
MOST: exploring differences between versions of models in BioModels and in th...
MOST: exploring differences between versions of models in BioModels and in th...MOST: exploring differences between versions of models in BioModels and in th...
MOST: exploring differences between versions of models in BioModels and in th...
 
The eCrystals Federation
The eCrystals FederationThe eCrystals Federation
The eCrystals Federation
 

Más de University Medicine Greifswald

Possibilities for integrating model-related data in computational biology (DI...
Possibilities for integrating model-related data in computational biology (DI...Possibilities for integrating model-related data in computational biology (DI...
Possibilities for integrating model-related data in computational biology (DI...
University Medicine Greifswald
 

Más de University Medicine Greifswald (18)

A guide to the COMBINE: Navigating through specifications, mailing lists and ...
A guide to the COMBINE: Navigating through specifications, mailing lists and ...A guide to the COMBINE: Navigating through specifications, mailing lists and ...
A guide to the COMBINE: Navigating through specifications, mailing lists and ...
 
2019 07-04-model reuse-bonn
2019 07-04-model reuse-bonn2019 07-04-model reuse-bonn
2019 07-04-model reuse-bonn
 
Mehr Medizininformatik am Meer
Mehr Medizininformatik am MeerMehr Medizininformatik am Meer
Mehr Medizininformatik am Meer
 
Implementierung Graph-basierter Ansätze für das Management systembiologischer...
Implementierung Graph-basierter Ansätze für das Management systembiologischer...Implementierung Graph-basierter Ansätze für das Management systembiologischer...
Implementierung Graph-basierter Ansätze für das Management systembiologischer...
 
Using Neo4j technologies for the management of systems biology models
Using Neo4j technologies for the management of systems biology modelsUsing Neo4j technologies for the management of systems biology models
Using Neo4j technologies for the management of systems biology models
 
Identifying pattern in reaction networks of computational models
Identifying pattern in reaction networks of computational modelsIdentifying pattern in reaction networks of computational models
Identifying pattern in reaction networks of computational models
 
Extended support for standard graphical notations of biological networks in s...
Extended support for standard graphical notations of biological networks in s...Extended support for standard graphical notations of biological networks in s...
Extended support for standard graphical notations of biological networks in s...
 
Modelling sample at SEMS from a graph perspective
Modelling sample at SEMS from a graph perspectiveModelling sample at SEMS from a graph perspective
Modelling sample at SEMS from a graph perspective
 
Coming Soon: de.NBI and SBGN-ED @ SEMS
Coming Soon: de.NBI and SBGN-ED @ SEMSComing Soon: de.NBI and SBGN-ED @ SEMS
Coming Soon: de.NBI and SBGN-ED @ SEMS
 
Masymos: Finding hidden treasures in model repositories
Masymos: Finding hidden treasures in model repositoriesMasymos: Finding hidden treasures in model repositories
Masymos: Finding hidden treasures in model repositories
 
Reproducibility, dissemination, and management of modeling results
Reproducibility, dissemination,  and management of modeling resultsReproducibility, dissemination,  and management of modeling results
Reproducibility, dissemination, and management of modeling results
 
e:Bio Kick-Off Meeting, SEMS
e:Bio Kick-Off Meeting, SEMSe:Bio Kick-Off Meeting, SEMS
e:Bio Kick-Off Meeting, SEMS
 
Possibilities for integrating model-related data in computational biology (DI...
Possibilities for integrating model-related data in computational biology (DI...Possibilities for integrating model-related data in computational biology (DI...
Possibilities for integrating model-related data in computational biology (DI...
 
SEMS: Model search and ranked Retrieval (Ron Henkel)
SEMS: Model search and ranked Retrieval (Ron Henkel)SEMS: Model search and ranked Retrieval (Ron Henkel)
SEMS: Model search and ranked Retrieval (Ron Henkel)
 
Simulation experiment descriptions and management
Simulation experiment descriptions and managementSimulation experiment descriptions and management
Simulation experiment descriptions and management
 
Sems project overview
Sems project overviewSems project overview
Sems project overview
 
Bio-Model Meta-Information and SED-ML
Bio-Model Meta-Information and SED-MLBio-Model Meta-Information and SED-ML
Bio-Model Meta-Information and SED-ML
 
Meta-Information for Bio-Models
Meta-Information for Bio-ModelsMeta-Information for Bio-Models
Meta-Information for Bio-Models
 

Último

Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
AlMamun560346
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
PirithiRaju
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
PirithiRaju
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
PirithiRaju
 
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
ssuser79fe74
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
Areesha Ahmad
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
Sérgio Sacani
 

Último (20)

Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Seismic Method Estimate velocity from seismic data.pptx
Seismic Method Estimate velocity from seismic  data.pptxSeismic Method Estimate velocity from seismic  data.pptx
Seismic Method Estimate velocity from seismic data.pptx
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
TEST BANK For Radiologic Science for Technologists, 12th Edition by Stewart C...
 
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
Vip profile Call Girls In Lonavala 9748763073 For Genuine Sex Service At Just...
 
Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)GBSN - Microbiology (Unit 2)
GBSN - Microbiology (Unit 2)
 
Isotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on IoIsotopic evidence of long-lived volcanism on Io
Isotopic evidence of long-lived volcanism on Io
 
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptxCOST ESTIMATION FOR A RESEARCH PROJECT.pptx
COST ESTIMATION FOR A RESEARCH PROJECT.pptx
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdfPests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
Pests of cotton_Borer_Pests_Binomics_Dr.UPR.pdf
 
Green chemistry and Sustainable development.pptx
Green chemistry  and Sustainable development.pptxGreen chemistry  and Sustainable development.pptx
Green chemistry and Sustainable development.pptx
 
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
 
Pests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdfPests of mustard_Identification_Management_Dr.UPR.pdf
Pests of mustard_Identification_Management_Dr.UPR.pdf
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
Chemical Tests; flame test, positive and negative ions test Edexcel Internati...
 
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
PossibleEoarcheanRecordsoftheGeomagneticFieldPreservedintheIsuaSupracrustalBe...
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 
Animal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptxAnimal Communication- Auditory and Visual.pptx
Animal Communication- Auditory and Visual.pptx
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 

Data and model management in Systems Biology

  • 1. Data and model management in Systems Biology Dagmar Waltemath University of Rostock, Germany Kinetics on the move – Happy 10th anniversary to SABIO-RK! Heidelberg, 31st May, 2016 http://www.slideshare.net/dagwa/data-and-model-management-in-systems-biology
  • 2. Junior research group: Management of simulation studies in systems biology Tool development: SBGN-ED for the graphical representation of networks Infrastructure: Data management for systems biology in Germany Standards and tools for model management www.sems.uni-rostock.de
  • 3. © 2009 UNIVERSITÄT ROSTOCK 3 NBI-SysBio: Data management for systems biology in Germany ● Sustainable infrastructure for data management ● Access to documented and reproducible results ● Systems Biology Standards ● Tool Development ● Education www.denbi.de (training – services – jobs)
  • 4. Photo: NY - http://nyphotographic.com (CC BY-SA 3.0) Photo: janneke staaks on flickr Fig. courtesy doi:10.1371/journal.pbio.1001779
  • 5. Data management is … ● Data management describes procedures and actions that help to store, preserve, organize and control the data generated during a (research) project. ● Aspects of data management include: – Data Ownership; – Metadata Compilation; – Data Lifecycle Control; – Data Quality; – Data Access and Dissemination Photo: NY - http://nyphotographic.com (CC BY-SA 3.0)
  • 6. Metadata ● Data about data ● Improved understanding of encoded data items ● Descriptive details ● Discovery and search for existing data, online browsing of data ● Standardized and structured information – Purpose, origin, time references, geographic location, creator, access conditions, and terms of use of your data collection ● Often encoded in ontologies https://www.libraries.psu.edu/psul/pubcur/what_is_dm.html#data-management
  • 7. Ontologies ● Well-structured, controlled vocabularies ● Capture and convey commonly agreed definitions and concepts in a domain ● Communication across people and software tools ● Enable reuse of domain knowledge ● Make implicit domain knowledge explicit and queryable ● Bio-ontologies – Gene Ontology, ChEBI, UniProt – Systems Biology Ontology (concepts and terminology for modeling)
  • 8. Example: Definition of "cell growth" in the Gene Ontology id: GO:0016049 name: cell growth namespace: biological_process def: "The process in which a cell irreversibly increases in size over time by accretion and biosynthetic production of matter similar to that already present." synonym: "cell expansion" RELATED [] synonym: "cellular growth" EXACT [] synonym: "growth of cell" EXACT [] is_a: GO:0009987 ! cellular process is_a: GO:0040007 ! growth relationship: part_of GO:0008361 ! regulation of cell size
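
The Gene Ontology entry on slide 8 uses a simple "tag: value" layout, which makes it easy to read programmatically. The following is a minimal sketch, not part of the original slides, that parses the stanza into a dictionary using only the Python standard library. The function name `parse_stanza` is hypothetical, and the sketch assumes one tag per line with anything after " ! " being a human-readable comment.

```python
from collections import defaultdict

# Stanza text as shown on the slide, written with one tag per line.
STANZA = """\
id: GO:0016049
name: cell growth
namespace: biological_process
def: "The process in which a cell irreversibly increases in size over time by accretion and biosynthetic production of matter similar to that already present."
synonym: "cell expansion" RELATED []
synonym: "cellular growth" EXACT []
synonym: "growth of cell" EXACT []
is_a: GO:0009987 ! cellular process
is_a: GO:0040007 ! growth
relationship: part_of GO:0008361 ! regulation of cell size
"""


def parse_stanza(text):
    """Map each tag to the list of its values (hypothetical helper).

    Text after ' ! ' is treated as a comment and dropped.
    """
    term = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        tag, _, value = line.partition(": ")
        term[tag].append(value.split(" ! ")[0].strip())
    return dict(term)


term = parse_stanza(STANZA)
parents = term["is_a"]  # identifiers of the direct parent terms
print(term["id"][0], term["name"][0], parents)
```

Keeping every tag as a list is a deliberate choice here: tags such as `synonym` and `is_a` occur several times in one stanza.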
  • 9. Advantages of careful & planned data management ● Increased confidence and trust in the data ● Better understanding of how to use the data, and of the data itself ● Better data quality ● Coherent data when standards are used ● Improved business processes (saving time, guaranteeing high quality) ● Improved access to data and improved reproducibility ● Better exploitation of data through easier data exchange and integration
  • 10. Advantages of standardised data ● Reusable ● Exchangeable ● Interoperable ● Long-term available (in open repositories) ● Curateable ● Shareable
  • 11. Photo: janneke staaks on flickr
  • 12. Research data in the modeling life cycle ● Models: equations, parameters, data tables ● Ideas: text, drawings ● Experimental results: text, data tables ● Publications: text, figures ● Analyses: configuration files, data tables Fig. courtesy Martin Scharm (adapted)
  • 13. Research data in the modeling life cycle ● Mathematical formulae ● Networks, diagrams ● Image data ● Publications ● Experiment descriptions ● Experimental results (both lab and simulation) ● Definitions of things (e.g., gene functions, chemical structures...) Figures top to bottom: (1) By Noah A. Rosenberg et al. Slightly modified by User:Wobble. - Public Library of Science, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=2839383; (2) By http://rsb.info.nih.gov/ij/images/, Public Domain, https://commons.wikimedia.org/w/index.php?curid=655748; (3) BIOM005, generated using CellDesigner 4, (4,5) PMID:18669651
  • 14. Research data in the modeling life cycle ● Heterogeneous ● Highly connected ● Context-dependent ● Distributed ● Big Figures top to bottom: (1) By Noah A. Rosenberg et al. Slightly modified by User:Wobble. - Public Library of Science, CC BY 3.0, https://commons.wikimedia.org/w/index.php?curid=2839383; (2) By http://rsb.info.nih.gov/ij/images/, Public Domain, https://commons.wikimedia.org/w/index.php?curid=655748; (3) BIOM005, generated using CellDesigner 4, (4,5) PMID:18669651
  • 15. The model ● Mathematical equations ● Biological entities ● Kinetic information ● Encoding: SBML & semantic annotations <bqmodel:isDescribedBy> <rdf:Bag> <rdf:li rdf:resource="http://identifiers.org/pubmed/18669651"/> </rdf:Bag> </bqmodel:isDescribedBy> <parameter id="parameter_49" name="L" metaid="metaid_0000078" value="20670"/>
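
To illustrate how the two fragments on slide 15 can be read by software, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The slide shows only fragments, so the surrounding `<sbml>`, `<model>`, `<annotation>` and `<rdf:Description>` elements, the model id, and the SBML Level 2 Version 4 namespace are assumptions added to obtain a well-formed document. A real project would more likely use a dedicated SBML library.

```python
import xml.etree.ElementTree as ET

# Namespace URIs; the SBML level/version is an assumption for this sketch.
NS = {
    "sbml": "http://www.sbml.org/sbml/level2/version4",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "bqmodel": "http://biomodels.net/model-qualifiers/",
}

# The slide's fragments, wrapped in an assumed minimal document.
DOC = """
<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="example" metaid="metaid_model">
    <annotation>
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
               xmlns:bqmodel="http://biomodels.net/model-qualifiers/">
        <rdf:Description rdf:about="#metaid_model">
          <bqmodel:isDescribedBy>
            <rdf:Bag>
              <rdf:li rdf:resource="http://identifiers.org/pubmed/18669651"/>
            </rdf:Bag>
          </bqmodel:isDescribedBy>
        </rdf:Description>
      </rdf:RDF>
    </annotation>
    <listOfParameters>
      <parameter id="parameter_49" name="L" metaid="metaid_0000078" value="20670"/>
    </listOfParameters>
  </model>
</sbml>
"""

root = ET.fromstring(DOC.strip())

# Attributes in a namespace are addressed as "{uri}localname".
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Publications that describe the model (the semantic annotation).
references = [
    li.get(RDF_RESOURCE)
    for li in root.iterfind(".//bqmodel:isDescribedBy/rdf:Bag/rdf:li", NS)
]

# Kinetic parameters, keyed by their human-readable name.
parameters = {
    p.get("name"): float(p.get("value"))
    for p in root.iterfind(".//sbml:parameter", NS)
}

print(references, parameters)
```

The annotation links the model to its publication through a resolvable identifiers.org URI, while the parameter carries the numeric value used in the equations.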
  • 16. SBML – Standard for model encoding ● Systems Biology Markup Language ● Community-driven de facto standard ● Free & open source: www.sbml.org ● Supported by many organizations and tools ● Encodes computational models of biological processes (compartments – species – reactions – parameters)
  • 17. SBGN – Standard for visual representation ● Systems Biology Graphical Notation ● Standardised glyphs for biological entities ● Three levels – SBGN-AF | SBGN-ER | SBGN-PD ● Free & open source: www.sbgn.org ● Tool support ● Interpretable Format: SBGN-ML Fig.: http://sbgn.org
  • 18. SBGN – Standard for visual representation Fig.: SBGN map for BIOM183, CellDesigner Fig.: SBGN map for BIOM005, CellDesigner
  • 19. The analysis ● Reproduce behaviour of the model ● Publish and share virtual experiments – Simulation setup / conditions – Pre- and post-processing – Observations ● Encoding: [logo] & [logo] & result data in Excel, CSV files <listOfSimulations> <uniformTimeCourse id="sim1" initialTime="0" outputStartTime="0" outputEndTime="100" numberOfPoints="100"> <algorithm kisaoID="KISAO:0000019"/> </uniformTimeCourse> </listOfSimulations> Fig. M. Stefan et al, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2596252/
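
The simulation setup on slide 19 can be turned into a concrete output time grid. The sketch below, not part of the original slides, parses the fragment with Python's standard library. The enclosing `<sedML>` element and its namespace are assumptions, and `numberOfPoints` is read here as the number of equal intervals between `outputStartTime` and `outputEndTime`; check the SED-ML specification for the version you target before relying on that reading.

```python
import xml.etree.ElementTree as ET

# Assumed SED-ML namespace; the slide does not show the root element.
NS = {"sed": "http://sed-ml.org/"}

# The slide's fragment, wrapped in an assumed root element.
FRAGMENT = """
<sedML xmlns="http://sed-ml.org/" level="1" version="1">
  <listOfSimulations>
    <uniformTimeCourse id="sim1" initialTime="0" outputStartTime="0"
                       outputEndTime="100" numberOfPoints="100">
      <algorithm kisaoID="KISAO:0000019"/>
    </uniformTimeCourse>
  </listOfSimulations>
</sedML>
"""

root = ET.fromstring(FRAGMENT.strip())
sim = root.find("sed:listOfSimulations/sed:uniformTimeCourse", NS)

start = float(sim.get("outputStartTime"))
end = float(sim.get("outputEndTime"))
intervals = int(sim.get("numberOfPoints"))  # assumption: counts intervals

# Uniform grid including both end points.
step = (end - start) / intervals
times = [start + i * step for i in range(intervals + 1)]

# The KiSAO term identifies the simulation algorithm to use.
algorithm = sim.find("sed:algorithm", NS).get("kisaoID")

print(sim.get("id"), algorithm, len(times), times[0], times[-1])
```

Referring to the algorithm by an ontology term rather than a tool-specific name is what lets different simulators interpret the same description.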
  • 20. SED-ML – Standard for model analysis ● Links to models used in an analysis ● Pre- and post-processing of models ● Type of simulation ● Definition of output ● Free and open source: www.sed-ml.org ● Tool support → Showcase your tool support online ←
  • 21. SED-ML – Standard for model analysis Fig. M. Stefan et al, http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2596252/ Simulation of BIOM183 in SED-ML Web Tools without simulation description
  • 22. ● Coordinate annual meetings – Next HARMONY: Auckland, June 7-11, 2016 – Next COMBINE: Newcastle, Sep 19-23, 2016 ● Coordinate standards development – Common procedures – Interoperable software tools – Discussion forums, mailing lists... ● Represent community – Funders – Other communities ● Provide standards resources – Single entry point – Resolvable URI – Web infrastructure Figure labels: Simulation Guidelines, Ontologies
  • 23. Standard-compliant software tools for modeling ● The path2models project integrated data from different databases into more than 140,000 SBML models. Fig.: Büchel et al BMC Sys Biol (2013) http://www.ebi.ac.uk/biomodels-main/path2models
  • 24. Standard-compliant software tools for modeling ● The Systems Biology Workbench is a software framework to help heterogeneous application components communicate with each other. ● Modeling, editing, simulating, analysing http://sbw.sourceforge.net
  • 25. The decision whether and how to share data often rests with researchers. Roche DG, Lanfear R, Binning SA, Haff TM, Schwanz LE, et al. (2014) Troubleshooting Public Data Archiving: Suggestions to Increase Participation. PLoS Biol 12(1): e1001779. doi:10.1371/journal.pbio.1001779
  • 26. COMBINE Archive ● Bundling files ● Shipping results ● Exchanging data ● Keeping provenance ● Encoding: zip-like file with a manifest (meta-data) ● Generate, modify & share through WebCAT
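
As an illustration of the "zip-like file with a manifest" idea on slide 26, the following sketch builds such a bundle in memory with Python's standard `zipfile` module. It is not taken from the slides: the file names, the placeholder file contents, the helper `build_archive`, and the exact manifest layout and format URIs are assumptions modelled on the COMBINE Archive (OMEX) conventions, and should be checked against the specification before use.

```python
import io
import zipfile

# Assumed manifest: lists every file in the bundle together with its format.
MANIFEST = """<?xml version="1.0" encoding="UTF-8"?>
<omexManifest xmlns="http://identifiers.org/combine.specifications/omex-manifest">
  <content location="." format="http://identifiers.org/combine.specifications/omex"/>
  <content location="./manifest.xml" format="http://identifiers.org/combine.specifications/omex-manifest"/>
  <content location="./model.xml" format="http://identifiers.org/combine.specifications/sbml"/>
  <content location="./simulation.sedml" format="http://identifiers.org/combine.specifications/sed-ml"/>
</omexManifest>
"""


def build_archive(files):
    """Return the bytes of a zip bundle holding the manifest plus `files`.

    `files` maps archive-relative names to text content (hypothetical helper).
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.xml", MANIFEST)
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


# Placeholder contents stand in for a real SBML model and SED-ML description.
payload = build_archive({"model.xml": "<sbml/>", "simulation.sedml": "<sedML/>"})

# Reading the bundle back shows what a receiving tool would find inside.
with zipfile.ZipFile(io.BytesIO(payload)) as archive:
    names = archive.namelist()

print(names)
```

Because the manifest names the format of each entry, a receiving tool can decide which file is the model and which is the simulation description without guessing from file extensions.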
  • 27. COMBINE Archive ● Original publication ● SBGN map ● SBML model versions ● SED-ML files ● Open in WebCAT ● Open in SEEK
  • 28. Model curation & publication
  • 29. Model curation & publication
  • 30. Model curation, simulation & publication
  • 31. Introduction to SEEK & FAIRDOM by Olga Krebs.
  • 32. Thank you for your attention. http://www.denbi.de/ @SemsProject http://co.mbine.org