Using Neo4j technologies for the management of systems biology models
Ron Henkel (HITS gGmbH, Heidelberg)
Dagmar Waltemath (Rostock)
Neo4j Life & Health Sciences Day - Berlin, 21st June, 2017
Computational Systems Biology
Biological scales DE Systems Further approaches
Images: https://doi.org/10.1002/wsbm.33, https://doi.org/10.1371/journal.pcbi.1002815, https://doi.org/10.1371/journal.pcbi.1004591
Data
Forest (decorticated) Path (accessible)
Matlab logo: By Jarekt (Own work) [Public domain], via Wikimedia Commons; Python logo: By www.python.org [GPL], via Wikimedia Commons; Java logo: By Cguevara94 (Own work) [CC BY-SA 4.0], via Wikimedia Commons, modified.
Images: https://pixabay.com/de/urwald-lianen-dschungel-b%C3%A4ume-406780/, https://pixabay.com/de/buchenwald-st%C3%A4mme-buchenst%C3%A4mme-318347/, https://pixabay.com/de/herbst-bl%C3%A4ttern-spur-laub-1432252/
Coppic
Challenges
Storage & retrieval
Storing simulation studies and networks
• Large data items
• Heterogeneous
• Highly connected
• Context-dependent
• Distributed
Provenance
Following the evolution of models
• Error correction
• Computational power
• Evolution of biological knowledge
• Contradicting hypotheses
Integration
Integrating models; or models and data
• Size of models
• Incorporation of health data
• Security and access rights
SEMS
@SemsProject
Tools and methods for the management of simulation studies in systems biology
2011-2017 BMBF e:Bio
2015-2017 BMBF de.NBI
SEMS
Selected projects
1. Integrated storage of models and simulation studies
2. Ranked retrieval
3. Identification of frequent patterns
Let’s move from relational databases to graph databases and see if we can improve model retrieval, simulation analysis and model integration.
Integrated storage of models and simulation studies
Figures: Rateitschak et al. (2012) https://doi.org/10.1371/journal.pcbi.1002815
A closer look at the data
Original figure: Martin Scharm, Martin Peters (SEMS)
Models
<species id="C_p" sboTerm="SBO:0000247">
  <annotation>
    <rdf:Description rdf:about="C_p">
      <bqbiol:is>
        <rdf:Bag>
          <rdf:li rdf:resource="urn:miriam:obo.chebi:CHEBI%3A27732"/>
        </rdf:Bag>
      </bqbiol:is>
      <bqbiol:is>
        <rdf:Bag>
          <rdf:li rdf:resource="urn:miriam:kegg.compound:C07481"/>
        </rdf:Bag>
      </bqbiol:is>
    </rdf:Description>
  </annotation>
</species>
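To show how such an annotation could be turned into graph-ready triples, here is a minimal Python sketch. It is an illustration only, not the MASYMOS parser: the namespace declarations on the sample element and the function name `extract_annotations` are assumptions added so that the snippet parses on its own.

```python
import xml.etree.ElementTree as ET
from urllib.parse import unquote

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The species element from the slide; the xmlns declarations are added here
# (assumption) because the slide shows the fragment without its document header.
SPECIES_XML = """
<species xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:bqbiol="http://biomodels.net/biology-qualifiers/"
         id="C_p" sboTerm="SBO:0000247">
  <annotation>
    <rdf:Description rdf:about="C_p">
      <bqbiol:is><rdf:Bag>
        <rdf:li rdf:resource="urn:miriam:obo.chebi:CHEBI%3A27732"/>
      </rdf:Bag></bqbiol:is>
      <bqbiol:is><rdf:Bag>
        <rdf:li rdf:resource="urn:miriam:kegg.compound:C07481"/>
      </rdf:Bag></bqbiol:is>
    </rdf:Description>
  </annotation>
</species>
"""


def extract_annotations(species_xml):
    """Return (species id, qualifier, resource URN) triples for one species."""
    root = ET.fromstring(species_xml)
    species_id = root.get("id")
    triples = []
    for description in root.iter("{%s}Description" % RDF_NS):
        for qualifier in description:
            # Tags look like "{namespace}is"; keep only the local name.
            name = qualifier.tag.split("}", 1)[1]
            for item in qualifier.iter("{%s}li" % RDF_NS):
                urn = unquote(item.get("{%s}resource" % RDF_NS))
                triples.append((species_id, name, urn))
    return triples


if __name__ == "__main__":
    for triple in extract_annotations(SPECIES_XML):
        print(triple)
```

Each triple corresponds to one candidate edge from a model entity to an annotation node; the percent-encoded colon in the ChEBI URN is decoded so that identical terms compare equal.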
Visual representation
Annotation
[Figure: reaction diagram annotated with SBO terms. Labels: enzyme (twice), product, substrate, enzymatic rate law, catalytic rate constant. Terms: urn:miriam:SBO:0000011, urn:miriam:SBO:0000014 (twice), urn:miriam:SBO:0000025, urn:miriam:SBO:0000015]
Annotation
[Figure: reaction diagram annotated with UniProt identifiers. Labels: Tyrosine, Phenylalanine-4-hydroxylase (twice), Tetrahydrobiopterin. Identifiers: urn:miriam:uniprot:P00439 (twice), urn:miriam:uniprot:Q03393, urn:miriam:uniprot:P07101]
Publication
<dataGenerator id="gCP" name="gCP">
  <listOfVariables>
    <variable id="CP" name="CP"
              taskReference="task1"
              target="/[..]/sbml:species[@id='CP']" />
  </listOfVariables>
</dataGenerator>
[…]
<listOfOutputs>
  <plot2D id="plot1">
    <listOfCurves>
      <curve id="curve_0" logX="false"
             logY="false" xDataReference="time"
             yDataReference="CP" />
    </listOfCurves>
  </plot2D>
</listOfOutputs>
Simulation
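The link between a simulation variable and the model entity it observes is carried by the `target` XPath. A minimal Python sketch of recovering that link follows; the function name `variable_links` and the regular expression are illustrative assumptions, the sample omits namespaces for brevity, and the elided part of the path is kept exactly as on the slide.

```python
import re
import xml.etree.ElementTree as ET

# Data generator from the slide, without namespace declarations (assumption).
DATA_GENERATOR_XML = """
<dataGenerator id="gCP" name="gCP">
  <listOfVariables>
    <variable id="CP" name="CP" taskReference="task1"
              target="/[..]/sbml:species[@id='CP']" />
  </listOfVariables>
</dataGenerator>
"""

# Matches the trailing [@id='...'] predicate of a target XPath.
TARGET_ID = re.compile(r"\[@id='([^']+)'\]\s*$")


def variable_links(xml_text):
    """Return (data generator id, variable id, model entity id) triples."""
    root = ET.fromstring(xml_text)
    links = []
    for variable in root.iter("variable"):
        match = TARGET_ID.search(variable.get("target", ""))
        if match:  # variables without a target (e.g. time symbols) are skipped
            links.append((root.get("id"), variable.get("id"), match.group(1)))
    return links


if __name__ == "__main__":
    print(variable_links(DATA_GENERATOR_XML))
```

Only the final `[@id='…']` predicate is read, so the sketch does not need to resolve the full path against the model file.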
MASYMOS
Example: Tyson 1991, BIOMD0000000005
[Figure: the example stored as one graph spanning three domains: Models, Simulation, Annotation.
Simulation: a SEDML Document node with Modelreference, Simulation, Task, Datagenerator, Variable and Output nodes; variables time, C2 and CP; relationships is_connected and is_mapped_to.
Models: Document nodes Tyson_1991 and "Tyson1991 Cell Cycle 6 var" with nodes C2, CP, pM, Cell, Reaction3 and environment; relationships hasProduct, hasReactant, isContainedIn.
Annotation: Uniprot:P04551, GO:0005623, Interpro:IPR006670, Pubmed:1831270, Kegg Pathway sce04111, EC-Code:3.1.3.16; qualifiers is, isVersionOf, hasPart, isDescribedBy.
Ontology: SBO terms (SBO:0000, SBO:544, SBO:236, SBO:231, SBO:064, SBO:545, SBO:004, SBO:003) linked by isA.]
MASYMOS
• Mapping on graph structure
• Linking
  - Annotation terms to ontology terms
  - Simulation variables to model entities
  - Publication to model
  - Model entities across model files
• Advantage
  - Structure can be queried across domains
  - Aggregation and analysis are possible
Example: Tyson 1991, BIOMD0000000005
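The idea of querying across domains can be illustrated with a small in-memory graph. This is a sketch under stated assumptions, not the MASYMOS schema and not Neo4j: the class `TinyGraph`, the node id prefixes and the relationship name `HAS_SPECIES` are hypothetical, and the pairing of species with annotations is chosen for illustration. The names `isDescribedBy`, `isVersionOf` and `is_mapped_to` are taken from the figure.

```python
from collections import defaultdict


class TinyGraph:
    """Dictionary-backed stand-in for a property graph (not Neo4j itself)."""

    def __init__(self):
        self.labels = {}                # node id -> node label
        self.edges = defaultdict(list)  # node id -> [(relationship, target id)]

    def add_node(self, node_id, label):
        self.labels[node_id] = label

    def add_edge(self, source, relationship, target):
        self.edges[source].append((relationship, target))

    def targets(self, source, relationship):
        return [t for rel, t in self.edges[source] if rel == relationship]


def build_example():
    """Build a toy graph with model, annotation, publication and simulation nodes."""
    g = TinyGraph()
    g.add_node("model:Tyson1991", "Model")
    g.add_node("species:C2", "Species")
    g.add_node("species:CP", "Species")
    g.add_node("Uniprot:P04551", "Annotation")
    g.add_node("Pubmed:1831270", "Publication")
    g.add_node("var:CP", "Variable")
    g.add_edge("model:Tyson1991", "HAS_SPECIES", "species:C2")  # hypothetical name
    g.add_edge("model:Tyson1991", "HAS_SPECIES", "species:CP")
    g.add_edge("model:Tyson1991", "isDescribedBy", "Pubmed:1831270")
    g.add_edge("species:C2", "isVersionOf", "Uniprot:P04551")   # illustrative pairing
    g.add_edge("var:CP", "is_mapped_to", "species:CP")
    return g


def models_with_annotation(g, annotation_id):
    """Walk model -> species -> annotation and report the model's publications."""
    hits = []
    for node_id, label in g.labels.items():
        if label != "Model":
            continue
        for species in g.targets(node_id, "HAS_SPECIES"):
            if annotation_id in g.targets(species, "isVersionOf"):
                hits.append((node_id, species, g.targets(node_id, "isDescribedBy")))
    return hits


def simulated_species(g):
    """Return the model species that some simulation variable is mapped to."""
    return sorted(
        target
        for node_id, label in g.labels.items()
        if label == "Variable"
        for target in g.targets(node_id, "is_mapped_to")
    )


if __name__ == "__main__":
    graph = build_example()
    print(models_with_annotation(graph, "Uniprot:P04551"))
    print(simulated_species(graph))
```

A single traversal touches the annotation, model and publication domains; in a graph database the same walk would be expressed as one path pattern rather than several joins.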
MASYMOS
[Figure: the example graph of Tyson1991 - Cell Cycle 6 var (model, simulation, annotation and ontology nodes as in the previous figure), overlaid with the node types Model, Publication, Annotation, Person and Simulation.]
Property lists shown on the slide:
• Id, Name, Title, Journal, Abstract, Authors, …
• Id, Name, Component, Variable, Species, Reaction, Compartment
• First name, Last name, Organization, Email
• URI, Description
STON: SBGN to Neo4j
Implementation: Vasundra Touré, https://sourceforge.net/projects/ston. Image: Touré et al. (2016) https://doi.org/10.1186/s12859-016-1394-x
STON: Features
• Identification of submodules
• Model linking
Ranked retrieval
MORRE
Implementation: Ron Henkel, https://github.com/ronhenkel/masymos-morre. Image: Henkel et al. (2010) https://doi.org/10.1186/1471-2105-11-423
MORRE
[Figure: the Tyson1991 - Cell Cycle 6 var graph with its annotations (Uniprot:P04551, GO:0005623, Interpro:IPR006670, Pubmed:1831270, Kegg Pathway sce04111, EC-Code:3.1.3.16); panel labels: Annotation, Person.]
Show me models by Tyson describing the cell cycle and having cdc2
1. (0.859) Tyson1991 - Cell Cycle 6 var
2. (0.854) Tyson2001_Cell_Cycle_Regulation
3. (0.477) Chen2004 - Cell Cycle Regulation
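The slide does not state how MORRE computes these scores. As a purely illustrative sketch of ranked retrieval over several fields, the Python below scores hypothetical records by weighted term overlap; the field weights, record contents and function names are all assumptions, and the resulting numbers are unrelated to the scores listed above.

```python
# Hypothetical field weights: a match on an annotation counts more than on a title.
FIELD_WEIGHTS = {"author": 1.0, "title": 2.0, "annotation": 3.0}

# Hypothetical index records; they do not describe real repository entries.
RECORDS = {
    "model_a": {"author": ["Tyson"], "title": ["cell", "cycle"], "annotation": ["cdc2"]},
    "model_b": {"author": ["Tyson"], "title": ["cell", "cycle", "regulation"], "annotation": []},
    "model_c": {"author": ["Doe"], "title": ["cell", "growth"], "annotation": []},
}

QUERY = {"author": ["tyson"], "title": ["cell", "cycle"], "annotation": ["cdc2"]}


def field_overlap(wanted, present):
    """Fraction of the wanted terms that occur in the record field."""
    wanted = {term.lower() for term in wanted}
    present = {term.lower() for term in present}
    return len(wanted & present) / len(wanted) if wanted else 0.0


def score(record, query):
    """Weighted mean of the per-field overlaps, in the range 0..1."""
    total_weight = sum(FIELD_WEIGHTS[field] for field in query)
    weighted = sum(
        FIELD_WEIGHTS[field] * field_overlap(terms, record.get(field, []))
        for field, terms in query.items()
    )
    return weighted / total_weight if total_weight else 0.0


def rank(records, query):
    """Return (score, name) pairs, best match first, ties broken by name."""
    scored = [(score(record, query), name) for name, record in records.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored


if __name__ == "__main__":
    for value, name in rank(RECORDS, QUERY):
        print(f"({value:.3f}) {name}")
```

Splitting the query by field mirrors the example question, which constrains the author, the described process and a contained entity separately.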
Applications
Implementation: Martin Peters, Martin Scharm, Mariam Nassar. M2cat: http://m2CAT.sems.uni-rostock.de, CombineArchiveWeb: http://webcat.sems.uni-rostock.de.
Applications
Implementation: Ron Henkel, Tommy Yu; CellML Model Repository: https://models.cellml.org/cellml, Seek: http://seek4science.org/ (work in progress)
Identification of frequent patterns using graph mining
Workflow
Implementation: Fabienne Lambusch. Figure: Lambusch et al. (in preparation). Preprint: https://peerj.com/preprints/1479
Reaction types found in BioModels
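The slide does not define how reaction types are distinguished. As a minimal sketch of the support counting that underlies frequent-pattern detection, the Python below assumes that a reaction type is characterised by its numbers of reactants, products and modifiers; this signature, the toy models and the function names are hypothetical and are not the published workflow.

```python
from collections import Counter

# Toy models (hypothetical): each model is a list of reactions.
TOY_MODELS = {
    "model_a": [
        {"reactants": ["A"], "products": ["B"], "modifiers": []},
        {"reactants": ["A", "B"], "products": ["C"], "modifiers": ["E"]},
    ],
    "model_b": [
        {"reactants": ["X"], "products": ["Y"], "modifiers": []},
        {"reactants": ["X"], "products": ["Y"], "modifiers": []},
    ],
    "model_c": [
        {"reactants": [], "products": ["Z"], "modifiers": []},
        {"reactants": ["P"], "products": ["Q"], "modifiers": []},
    ],
}


def reaction_signature(reaction):
    """Assumed reaction type: (number of reactants, products, modifiers)."""
    return (
        len(reaction["reactants"]),
        len(reaction["products"]),
        len(reaction["modifiers"]),
    )


def frequent_signatures(models, min_support):
    """Return the signatures that occur in at least `min_support` models."""
    support = Counter()
    for reactions in models.values():
        # A set ensures each signature is counted once per model.
        support.update({reaction_signature(r) for r in reactions})
    return {sig: count for sig, count in support.items() if count >= min_support}


if __name__ == "__main__":
    print(frequent_signatures(TOY_MODELS, min_support=2))
```

Counting support per model rather than per reaction keeps one large model from dominating the statistics.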
Identified motifs
Summary
All code under public licenses:
MASYMOS
MORRE
STON
Pattern detection
MOST (change statistics)
M2CAT
COMBINE Archive Web
• Java-based tools
• Neo4j graph database
• Parser for each format
• Reuse of existing libraries / tools
  - jLibSBML
  - jSedML
  - Miriam Web Services (EBI)
  - Apache Commons
  - GSON
  - OWL API
  - BiVeS-CellML
Future work
Future work Partners?
- Incorporating health-related data to explore the behavior of models under varying health conditions
- More applications for MASYMOS
- Incorporating more ontologies and finding better similarity scores.
- Reducing the conglomeration of tools.
The team
More @ https://sems.uni-rostock.de
Left to right: Fabienne Lambusch, Martin Scharm, Dagmar Waltemath,
Mariam Nassar, Tom Gebhardt, Martin Peters, Vasundra Touré, Ron Henkel
Impact
SEMS is part of a large systems biology community.
Join us. It's fun.
http://www.denbi.de
http://co.mbine.org

TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 

Using Neo4j technologies for the management of systems biology models

  • 1. Using Neo4j technologies for the management of systems biology models Ron Henkel (HITS gGmbH, Heidelberg) Dagmar Waltemath (Rostock) Neo4j Life & Health Sciences Day - Berlin, 21st June, 2017
  • 2. Computational Systems Biology Biological scales DE Systems Further approaches Images: https://doi.org/10.1002/wsbm.33, https://doi.org/10.1371/journal.pcbi.1002815, https://doi.org/10.1371/journal.pcbi.1004591
  • 3. Data Forest (decorticated) Path (accessible) Matlab logo: By Jarekt (Own work) [Public domain], via Wikimedia Commons; Python logo: By www.python.org [GPL], via Wikimedia Commons; Java logo: By Cguevara94 (Own work) [CC BY-SA 4.0], via Wikimedia Commons, modified. Images: https://pixabay.com/de/urwald-lianen-dschungel-b%C3%A4ume-406780/, https://pixabay.com/de/buchenwald-st%C3%A4mme-buchenst%C3%A4mme-318347/, https://pixabay.com/de/herbst-bl%C3%A4ttern-spur-laub-1432252/ Coppic
  • 4. Challenges • Storage & retrieval (storing simulation studies and networks): large data items, heterogeneous, highly connected, context-dependent, distributed • Provenance (following the evolution of models): error correction, computational power, evolution of biological knowledge, contradicting hypotheses • Integration (integrating models; or models and data): size of models, incorporation of health data, security and access rights
  • 5. SEMS @SemsProject Tools and methods for the management of simulation studies in systems biology 2011-2017 BMBF e:Bio 2015-2017 BMBF de.NBI
  • 6. SEMS Selected projects: 1. Integrated storage of models and simulation studies 2. Ranked retrieval 3. Identification of frequent patterns. Let’s move from relational databases to graph databases and see if we can improve model retrieval, simulation analysis and model integration. 2011-2017 BMBF e:Bio 2015-2017 BMBF de.NBI
  • 7. Integrated storage of models and simulation studies Figures: Rateitschak et al. (2012) https://doi.org/10.1371/journal.pcbi.1002815
  • 8. A closer look at the data Original figure: Martin Scharm, Martin Peters (SEMS)
  • 9. Models <species id="C_p" sboTerm="SBO:0000247"> <annotation> <rdf:Description rdf:about="C_p"> <bqbiol:is> <rdf:Bag> <rdf:li rdf:resource= "urn:miriam:obo.chebi:CHEBI%3A27732"/> </rdf:Bag> </bqbiol:is> <bqbiol:is> <rdf:Bag> <rdf:li rdf:resource= "urn:miriam:kegg.compound:C07481"/> </rdf:Bag> </bqbiol:is> </rdf:Description> </annotation> </species> Original figure: Martin Scharm, Martin Peters (SEMS)
  • 10. Visual representation Original figure: Martin Scharm, Martin Peters (SEMS)
  • 11. Annotation enzyme enzyme product substrate enzymatic rate law catalytic rate constant urn:miriam:SBO:0000011 urn:miriam:SBO:0000014 urn:miriam:SBO:0000014 urn:miriam:SBO:0000025 urn:miriam:SBO:0000015 Original figure: Martin Scharm, Martin Peters (SEMS)
  • 13. Publication Original figure: Martin Scharm, Martin Peters (SEMS)
  • 14. Simulation <dataGenerator id="gCP" name="gCP"> <listOfVariables> <variable id="CP" name="CP" taskReference="task1" target="/[..]/sbml:species[@id='CP']" /> </listOfVariables> </dataGenerator> […] <listOfOutputs> <plot2D id="plot1"> <listOfCurves> <curve id="curve_0" logX="false" logY="false" xDataReference="time" yDataReference="CP" /> </listOfCurves> </plot2D> </listOfOutputs> Original figure: Martin Scharm, Martin Peters (SEMS)
  • 15. MASYMOS Example: Tyson 1991, BIOMD0000000005 [Figure: one graph in three linked parts. Simulation: Document Tyson_1991, SEDML, Modelreference, Output, Datagenerator, Simulation, Task, Variable (time, CP, C2), environment. Models: Document Tyson1991 Cell Cycle 6 var with C2, pM, Cell, Reaction3, CP. Annotation: Uniprot:P04551, GO:0005623, Interpro:IPR006670, Pubmed:1831270, Kegg Pathway sce04111, EC-Code:3.1.3.16, and the SBO ontology (SBO:0000, SBO:544, SBO:236, SBO:231, SBO:064, SBO:545, SBO:004, SBO:003). Relationship labels: isVersionOf, hasPart, is, asProduct, asReactant, isContainedIn, isDescribedBy, is_connected, is_mapped_to, isA.]
  • 16. MASYMOS • Mapping on graph structure • Linking: annotation terms to ontology terms; simulation variables to model entities; publication to model; model entities across model files • Advantage: structure can be queried across domains; aggregation and analysis are possible. Example: Tyson 1991, BIOMD0000000005
  • 17. MASYMOS Node types: Model, Publication, Annotation, Person, Simulation. Property lists shown: Id, Name, Title, Journal, Abstract, Authors, …; Id, Name, Component, Variable, Species, Reaction, Compartment; First name, Last name, Organization, Email; URI, Description. [Figure: the Tyson 1991 graph from slide 15, with its model, simulation, annotation and SBO ontology parts.]
  • 18. STON: SBGN to Neo4j Implementation: Vasundra Touré, https://sourceforge.net/projects/ston. Image: Touré et al. (2016) https://doi.org/10.1186/s12859-016-1394-x
  • 19. STON: Features Identification of submodules Model linking Implementation: Vasundra Touré, https://sourceforge.net/projects/ston. Image: Touré et al. (2016) https://doi.org/10.1186/s12859-016-1394-x
  • 21. MORRE Implementation: Ron Henkel, https://github.com/ronhenkel/masymos-morre. Image: Henkel et al. (2010) https://doi.org/10.1186/1471-2105-11-423
  • 22. MORRE Query: "Show me models by Tyson describing the cell cycle and having cdc2". Ranked results: 1. (0.859) Tyson1991 - Cell Cycle 6 var 2. (0.854) Tyson2001_Cell_Cycle_Regulation 3. (0.477) Chen2004 - Cell Cycle Regulation. [Figure: the Tyson1991 Cell Cycle 6 var model graph with its annotations (Uniprot:P04551, GO:0005623, Interpro:IPR006670, Pubmed:1831270, Kegg Pathway sce04111, EC-Code:3.1.3.16), plus Annotation and Person nodes.]
  • 23. Applications Implementation: Martin Peters, Martin Scharm, Mariam Nassar. M2cat: http://m2CAT.sems.uni-rostock.de, CombineArchiveWeb: http://webcat.sems.uni-rostock.de.
  • 24. Applications Implementation: Ron Henkel, Tommy Yu; CellML Model Repository: https://models.cellml.org/cellml, Seek: http://seek4science.org/ (work in progress)
  • 25. Identification of frequent patterns using graph mining
  • 26. Workflow Implementation: Fabienne Lambusch. Figure: Lambusch et al. (in preparation). Preprint: https://peerj.com/preprints/1479
  • 27. Reaction types found in BioModels Implementation: Fabienne Lambusch. Figure: Lambusch et al. (in preparation). Preprint: https://peerj.com/preprints/1479
  • 28. Identified motifs Implementation: Fabienne Lambusch. Figure: Lambusch et al. (in preparation). Preprint: https://peerj.com/preprints/1479
  • 29. Summary All code under public licenses: MASYMOS, MORRE, STON, Pattern detection, MOST (change statistics), M2CAT, COMBINE Archive Web • Java-based tools • Neo4j graph database • Parser for each format • Reuse of existing libraries / tools: jLibSBML, jSedML, Miriam Web Services (EBI), Apache Commons, GSON, Owl-api, BiVeS-CellML
  • 30. Future work Partners? - Incorporating health-related data to explore the behavior of models under varying health conditions - More applications for MASYMOS - Incorporating more ontologies and finding better similarity scores - Reducing the conglomeration of tools
  • 31. The team More @ https://sems.uni-rostock.de Left to right: Fabienne Lambusch, Martin Scharm, Dagmar Waltemath, Mariam Nassar, Tom Gebhardt, Martin Peters, Vasundra Touré, Ron Henkel
  • 32. Impact SEMS is part of a large systems biology community. Join us. It's fun. http://www.denbi.de http://co.mbine.org
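
The linking idea from slides 9, 14 and 16 (annotation terms and simulation variables stored as connected graph entries) can be illustrated with a small Python sketch. It is not the MASYMOS implementation, which is Java on Neo4j. It only turns the two XML fragments shown on the slides into (source, relationship, target) triples. The relationship names `is` and `is_mapped_to` appear in the slide 15 figure; the namespace URIs, the wrapping `<root>` element and the function names `annotation_edges` and `mapping_edges` are assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: the slide fragments omit namespace declarations, so the
# standard RDF and BioModels qualifier URIs are supplied here.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "bqbiol": "http://biomodels.net/biology-qualifiers/",
}

# Slide 9 fragment, wrapped in a hypothetical <root> that declares the prefixes.
SBML_FRAGMENT = """
<root xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
      xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
  <species id="C_p" sboTerm="SBO:0000247">
    <annotation>
      <rdf:Description rdf:about="C_p">
        <bqbiol:is><rdf:Bag>
          <rdf:li rdf:resource="urn:miriam:obo.chebi:CHEBI%3A27732"/>
        </rdf:Bag></bqbiol:is>
        <bqbiol:is><rdf:Bag>
          <rdf:li rdf:resource="urn:miriam:kegg.compound:C07481"/>
        </rdf:Bag></bqbiol:is>
      </rdf:Description>
    </annotation>
  </species>
</root>
"""

# Slide 14 fragment (data generator only); the target path is elided on the slide.
SEDML_FRAGMENT = """
<dataGenerator id="gCP" name="gCP">
  <listOfVariables>
    <variable id="CP" name="CP" taskReference="task1"
              target="/[..]/sbml:species[@id='CP']" />
  </listOfVariables>
</dataGenerator>
"""


def annotation_edges(xml_text):
    """Return (species_id, qualifier, resource) triples from SBML annotations."""
    root = ET.fromstring(xml_text)
    resource_attr = f"{{{NS['rdf']}}}resource"
    edges = []
    for species in root.iter("species"):
        for description in species.iterfind(".//rdf:Description", NS):
            for qualifier in description:
                # Tags look like "{namespace}is"; keep only the local name.
                name = qualifier.tag.split("}")[1]
                for item in qualifier.iterfind(".//rdf:li", NS):
                    edges.append((species.get("id"), name, item.get(resource_attr)))
    return edges


def mapping_edges(xml_text):
    """Return (variable_id, 'is_mapped_to', species_id) triples from SED-ML variables."""
    root = ET.fromstring(xml_text)
    edges = []
    for variable in root.iter("variable"):
        # Pull the species id out of the XPath-like target attribute.
        match = re.search(r"species\[@id='([^']+)'\]", variable.get("target", ""))
        if match:
            edges.append((variable.get("id"), "is_mapped_to", match.group(1)))
    return edges


if __name__ == "__main__":
    for edge in annotation_edges(SBML_FRAGMENT) + mapping_edges(SEDML_FRAGMENT):
        print(edge)
```

Each triple corresponds to one relationship between two nodes in a graph database. Note that the two slide fragments use different species ids (`C_p` and `CP`), so they are not joined here; with matching ids, a simulation variable could be followed through the model species to its annotation terms, which is the cross-domain query that slide 16 describes.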