@monarchinit @ontowonka
“Not everyone can become a great
artist, but a great artist can come from
anywhere”
Anton Ego, Ratatouille, 2007, Disney/Pixar
Envisioning a world where everyone helps
solve disease
Melissa Haendel
SWAT4LS 2015
Cambridge, England
Faith-based research
“I believe that my work
on some obscure cell
type in some obscure
organism will matter to
mankind one day”
Well, it can, and it does.
Four things it takes to solve an undiagnosed disease
1. Deep phenotyping the human organism
2. Crossing the language barrier
3. A lot of data from a lot of places
4. Very many people (who have faith)
1. DEEP PHENOTYPING THE
HUMAN ORGANISM
[Figure: patient genome/exome is filtered using genomic data to reach diagnosis and treatment]
ATCTTAGCACGTTAC
ATCTTAGCACGTGAC
ATCTTATCACGTTAC
ATCTTAGCACGTTAC
What do all those variations do?
We only know the phenotypic consequences of mutation of
<20% of the human coding genome
[Figure: patient genome/exome is filtered using genomic data, phenotype, gene-phenotype data, and environment to reach diagnosis and treatment]
We have a common language
for sequence data….
ATCTTAGCACGTTAC…
….not so much for phenotypes
CC2.0 European Southern Observatory
https://www.flickr.com/photos/esoastronomy/6923443595
Can we help machines understand
phenotypes?
“Palmoplantar
hyperkeratosis”
Human phenotype
I have
absolutely no
idea what that
means
Image credits:
"HandsEBS" by James Heilman, MD - Own work. Licensed under CC BY-SA 3.0 via Commons –
https://commons.wikimedia.org/wiki/File:HandsEBS.JPG#/media/File:HandsEBS.JPG
Marcin Wichary [CC BY 2.0 (http://creativecommons.org/licenses/by/2.0)], via Wikimedia Commons
A disease is a collection of
phenotypes
Differential diagnosis with similar but non-matching phenotypes is difficult
Patient: Flat back of head; Hypotonia
Disease X: Abnormal skull morphology; Decreased muscle mass
Do we *really* need yet another clinical
vocabulary?
Winnenburg and Bodenreider, ISMB PhenoDay, 2014
UMLS
SNOMED CT
CHV
MedDRA
MeSH
NCIT
ICD10-C
ICD9-CM
ICD-10
OMIM
MedlinePlus
Existing clinical vocabularies don’t adequately cover phenotype descriptions
Disease-phenotype associations using an
ontology
Hyposmia
Abnormality of
globe location
eyeball of
camera-type eye
sensory
perception of smell
Abnormal eye
morphology
Motor neuron atrophy
Deeply set eyes
motor neuron
CL
34571 annotations in
22 species
157534 phenotype
annotations
2150 phenotype
annotations
Once OMIM is rendered
computable, are we done yet?
Free text -> HPO
enables phenotype semantic
similarity matching
Mendelian disease integration
Merges sources together using:
• equivalence and subclass axioms derived from xrefs
• string matching
• manual efforts to fill gaps based on phenotypes and anatomical axioms
Parkinson’s disease
subtypes
Different colors =
different disease
sources
https://github.com/monarch-initiative/monarch-disease-ontology
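The xref-based merge can be pictured as grouping identifiers into equivalence classes. The sketch below is only an illustration under a strong assumption: every xref is treated as an equivalence, whereas the slide also mentions subclass axioms, string matching, and manual curation. The identifiers are hypothetical placeholders, not real disease IDs.

```python
from collections import defaultdict


def merge_by_xrefs(xrefs):
    """Group identifiers into equivalence classes from pairwise xrefs (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    for a, b in xrefs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(set)
    for node in list(parent):
        groups[find(node)].add(node)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical identifiers from three made-up sources, for illustration only.
example_xrefs = [
    ("SRC_A:1", "SRC_B:7"),
    ("SRC_B:7", "SRC_C:3"),
    ("SRC_A:2", "SRC_C:9"),
]
merged = merge_by_xrefs(example_xrefs)
print(merged)
```

Treating all xrefs as equivalences tends to over-merge in practice, which is one reason a real integration would distinguish equivalence from subclass relationships.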
Why we need all the organisms
Model data can provide up to 80% phenotypic coverage
of the human coding genome
We learn different things from different organisms
2. CROSSING THE LANGUAGE
BARRIER
Ulcerated
paws
Palmoplantar
hyperkeratosis
Thick hand skin
Image credits:
"HandsEBS" by James Heilman, MD - Own work. Licensed under CC BY-SA 3.0 via Commons –
https://commons.wikimedia.org/wiki/File:HandsEBS.JPG#/media/File:HandsEBS.JPG
http://www.guinealynx.info/pododermatitis.html
Challenge: Each database uses
its own vocabulary/ontology
MP
HP
MGI
HPOA
Challenge: Each database uses
its own vocabulary/ontology
ZFA
MP
DPO
WPO
HP
OMIA
VT
FYPO
APO
SNOMED
…
…
…
WB
PB
FB
OMIA
MGI
RGD
ZFIN
SGD
HPOA
IMPC
OMIM
ICD
QTLdb
EHR
Decomposition of complex
concepts allows interoperability
Mungall, C. J., Gkoutos, G., Smith, C., Haendel, M., Lewis, S., & Ashburner, M.
(2010). Integrating phenotype ontologies across multiple species. Genome
Biology, 11(1), R2. doi:10.1186/gb-2010-11-1-r2
“Palmoplantar hyperkeratosis” (Human phenotype) =
increased (PATO)
+ Stratum corneum layer of skin (Uberon)
+ Autopod (Uberon)
+ keratinization (GO)
Species neutral ontologies, homologous concepts
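One way to picture the decomposition is as a small record holding a quality, a process, and anatomical terms, so that phenotypes from different species can be compared component by component. This is a minimal sketch, not the representation used in the cited paper. The class name, field names, and the model-organism entry are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecomposedPhenotype:
    label: str
    quality: str        # PATO-style quality
    process: str        # GO-style biological process
    anatomy: frozenset  # Uberon-style anatomical terms


def shared_components(a, b):
    """Return the species-neutral components two phenotypes have in common."""
    shared = set(a.anatomy & b.anatomy)
    if a.quality == b.quality:
        shared.add(a.quality)
    if a.process == b.process:
        shared.add(a.process)
    return shared


human = DecomposedPhenotype(
    label="Palmoplantar hyperkeratosis",
    quality="increased",
    process="keratinization",
    anatomy=frozenset({"stratum corneum layer of skin", "autopod"}),
)
# Hypothetical decomposition of a model-organism term, for illustration only.
model = DecomposedPhenotype(
    label="thick paw skin (hypothetical)",
    quality="increased",
    process="keratinization",
    anatomy=frozenset({"autopod"}),
)
overlap = shared_components(human, model)
print(sorted(overlap))
```

The two labels share no words, yet the decomposed forms overlap on three components, which is the interoperability the slide points to.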
Cross-species ontology integration
3. A LOT OF DATA FROM A LOT
OF PLACES
Putting it Together: Data + Ontologies
[Figure: pipeline diagram with stages Input, Pipeline, Output. Labels: Diverse G2P/D source data; Source Ontologies; .ttl; Owl Loader; Graph; Graph Views; Monarch App; Faceted Browsing; Phenotype Matching]
https://github.com/SciGraph/SciGraph
Data Integrated in SciGraph
>25 sources
>100 species
51M triples
4M curated
associations
2.2M G-P / G-D
associations
Genotype-phenotype integration
[Pie chart: one source 9%; two sources or 3 or more sources 91%]
91% of our 2.2 Million G2P associations required
integrating 2 or more data sources
(this number does not even include orthology (Panther))
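As a rough check on scale, 91% of 2.2 million is about 2.0 million associations. The sketch below shows how such a breakdown could be computed from per-association source lists. The association keys and their source lists are invented toy data, and the function name is an assumption, not part of any Monarch tool.

```python
from collections import Counter


def source_breakdown(associations):
    """Fraction of associations built from one, two, or three-plus distinct sources."""
    keys = ("one", "two", "three_or_more")
    if not associations:
        return {k: 0.0 for k in keys}
    buckets = Counter()
    for assoc_id, sources in associations.items():
        n = len(set(sources))
        if n == 0:
            raise ValueError(f"{assoc_id} lists no source")
        buckets["one" if n == 1 else "two" if n == 2 else "three_or_more"] += 1
    total = sum(buckets.values())
    return {k: buckets[k] / total for k in keys}


# Toy data: hypothetical association keys, with source names taken from the slides.
toy = {
    "assoc-1": ["MGI"],
    "assoc-2": ["MGI", "IMPC"],
    "assoc-3": ["OMIM", "HPOA"],
    "assoc-4": ["ZFIN", "OMIM", "HPOA"],
}
breakdown = source_breakdown(toy)
multi_source_fraction = breakdown["two"] + breakdown["three_or_more"]

# Slide-level estimate using integer arithmetic: 91% of 2.2 million associations.
slide_estimate = 2_200_000 * 91 // 100
print(breakdown, multi_source_fraction, slide_estimate)
```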
Ontology-based phenotype matching
www.owlsim.org
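To see why an ontology helps when a patient and a disease share no identical terms, here is a minimal sketch that scores two phenotype profiles by Jaccard overlap of their ancestor closures. It is not OWLSim's actual scoring. The tiny hierarchy (including the "Muscle abnormality" and "Abnormality" parents) is a hypothetical toy built around the slide 12 example terms.

```python
def ancestors(term, parents):
    """Return the term together with everything above it in the is-a hierarchy."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def jaccard_similarity(profile_a, profile_b, parents):
    """Jaccard overlap of the ancestor closures of two phenotype profiles."""
    closure_a = set().union(*(ancestors(t, parents) for t in profile_a))
    closure_b = set().union(*(ancestors(t, parents) for t in profile_b))
    union = closure_a | closure_b
    return len(closure_a & closure_b) / len(union) if union else 0.0


# Hypothetical toy hierarchy, for illustration only.
toy_parents = {
    "Flat back of head": ["Abnormal skull morphology"],
    "Abnormal skull morphology": ["Abnormality"],
    "Hypotonia": ["Muscle abnormality"],
    "Decreased muscle mass": ["Muscle abnormality"],
    "Muscle abnormality": ["Abnormality"],
}
patient = {"Flat back of head", "Hypotonia"}
disease_x = {"Abnormal skull morphology", "Decreased muscle mass"}

exact_matches = patient & disease_x
score = jaccard_similarity(patient, disease_x, toy_parents)
print(exact_matches, score)
```

Exact string matching finds nothing in common here, while the closure-based score is 0.5 because the two profiles meet at shared parent terms.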
Combining genotype and phenotype
data for variant prioritization
Whole exome
Remove off-target and
common variants
Variant score from allele
freq and pathogenicity
Phenotype score from phenotypic similarity
PHIVE score to give final candidates
Mendelian filters
https://www.sanger.ac.uk/resources/software/exomiser/
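The steps on this slide can be sketched as a filter followed by a combined ranking. This is an illustration only and not Exomiser's implementation: the frequency cutoff, the way allele frequency and pathogenicity are multiplied, and the plain average used for the final score are all assumptions, and the gene names are placeholders.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    on_target: bool
    allele_freq: float      # population allele frequency, 0..1
    pathogenicity: float    # predicted pathogenicity, 0..1
    phenotype_score: float  # phenotypic similarity to the patient, 0..1


def prioritize(variants, max_allele_freq=0.01):
    """Filter out off-target and common variants, then rank by a combined score."""
    kept = [v for v in variants if v.on_target and v.allele_freq <= max_allele_freq]
    scored = []
    for v in kept:
        # Assumed variant score: rarer and more pathogenic scores higher.
        variant_score = v.pathogenicity * (1.0 - v.allele_freq)
        # Assumed combination: a plain average of the two scores.
        combined = (variant_score + v.phenotype_score) / 2.0
        scored.append((combined, v.gene))
    return [gene for _, gene in sorted(scored, reverse=True)]


# Placeholder candidates, for illustration only.
candidates = [
    Variant("GENE_A", True, 0.0, 0.90, 0.80),
    Variant("GENE_B", True, 0.0, 0.95, 0.10),
    Variant("GENE_C", True, 0.20, 0.99, 0.90),   # common, so filtered out
    Variant("GENE_D", False, 0.0, 0.99, 0.90),   # off-target, so filtered out
]
ranking = prioritize(candidates)
print(ranking)
```

GENE_B has the higher pathogenicity prediction, but GENE_A ranks first once phenotypic similarity is taken into account, which is the point of combining the two kinds of data.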
York platelet syndrome and STIM1
Markello T et al. Molecular Genetics and Metabolism 2015, 114: 474
Grosse J, J Clin Invest 2007 117: 3540-50
Impaired platelet aggregation
(HP:0003540)
Thrombocytopenia (HP:0001873)
Abnormal platelet activation
(MP:0006298)
Thrombocytopenia (MP:0003179)
UDP_2542 Stim1Sax/Sax
http://www.nature.com/gim/journal/vaop/ncurrent/full/gim2015137a.html
4. VERY MANY PEOPLE
(WHO HAVE FAITH)
Who helped solve the STIM1
UDP_2542 case?
Credit extends beyond the publication
• Johannes creates stim1 mouse
• Melissa annotates patient UDP_2542 with HPO
• Will performs analysis of UDP_2542 that includes stim1 mouse to generate a dataset of prioritized variants
• Tom writes publication pmid:25577287 about the STIM1 diagnosis
• Tom explicitly credits Will as an author but not Melissa.
Credit is connected
Credit to Will is asserted, but credit to Melissa can be inferred
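The inference can be sketched as a walk over a small graph of who created what and which artifacts were used to make others. This is a toy model under stated assumptions: the artifact labels are paraphrased from the slide, and the edge from the publication to the prioritized variant dataset is assumed rather than stated.

```python
def contributors(artifact, created_by, used):
    """Collect everyone who created the artifact or anything it was built from."""
    people, seen, stack = set(), set(), [artifact]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        people |= created_by.get(current, set())
        stack.extend(used.get(current, ()))
    return people


created_by = {
    "stim1 mouse": {"Johannes"},
    "HPO annotation of UDP_2542": {"Melissa"},
    "prioritized variant dataset": {"Will"},
    "publication pmid:25577287": {"Tom"},
}
used = {
    "prioritized variant dataset": {"stim1 mouse", "HPO annotation of UDP_2542"},
    # Assumed edge: the publication builds on the prioritized variant dataset.
    "publication pmid:25577287": {"prioritized variant dataset"},
}
asserted_credit = {"Tom", "Will"}

all_credit = contributors("publication pmid:25577287", created_by, used)
inferred_only = all_credit - asserted_credit
print(sorted(inferred_only))
```

Under this toy model the traversal surfaces Melissa, and also Johannes, as contributors whose credit is not asserted by authorship but follows from the artifacts the publication depends on.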
Who is in the graph?
Melissa Haendel
Peter Robinson
Chris Mungall
Sebastian Kohler
Cindy Smith
Nicole Vasilevsky
Sandra Dolken
Johannes Grosse
Attila Braun
David Varga-Szabo
Niklas Beyersdorf
Boris Schneider
Lutz Zeitlmann
Petra Hanke
Patricia Schropp
Silke Mühlstedt
Carolin Zorn
Michael Huber
Carolin Schmittwolf
Wolfgang Jagla
Philipp Yu
Thomas Kerkau
Harald Schulze
Michael Nehls
Bernhard Nieswandt
Thomas Markello
Dong Chen
Justin Y. Kwan
Iren Horkayne-Szakaly
Alan Morrison
Olga Simakova
Irina Maric
Jay Lozier
Andrew R. Cullinane
Tatjana Kilo
Lynn Meister
Kourosh Pakzad
Sanjay Chainani
Roxanne Fischer
Camilo Toro
James G. White
David Adams
Cornelius Boerkoel
William A. Gahl
Cynthia J. Tifft
Meral Gunay-Aygun
Melissa Haendel
David Adams
David Draper
Bailey Gallinger
Joie Davis
Nicole Vasilevsky
Heather Trang
Rena Godfrey
Gretchen Golas
Catherine Groden
Michele Nehrebecky
Ariane Soldatos
Elise Valkanas
Colleen Wahl
Lynne Wolfe
Elizabeth Lee
Amanda Links
Will Bone
Murat Sincan
Damian Smedley
Jules Jacobson
Nicole Washington
Elise Flynn
Sebastian Kohler
Orion Buske
Marta Girdea
Michael Brudno
Jeremy Band
Hans Goeble
Karen Balbach
Nadine Pfeifer
Sandra Werner
Christian Linden
Legend: Clinical/care; Pathology; Ontologist; CS/informatics; Curator; Basic research
Tracking Evidence and Provenance
of G2P Associations
Evidence is a collection of information that is used
to support a scientific claim or association
Provenance is a history of what processes led to
the claim being made and what entities participated in
these processes
Value of Evidence and Provenance Metadata
• context to evaluate credibility/confidence
• support filtering and analysis of data
• detailed history for attribution
Evidence and Provenance for a
Variant-Phenotype Association
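A minimal sketch of how evidence and provenance might be attached to one association, so that it can be filtered by evidence and its contributors listed for attribution. The class names, field names, evidence labels, and all example values are assumptions for illustration and do not reflect Monarch's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class ProvenanceStep:
    process: str  # what was done
    agent: str    # who or what did it


@dataclass
class Association:
    subject: str
    phenotype: str
    evidence: list = field(default_factory=list)    # information supporting the claim
    provenance: list = field(default_factory=list)  # ordered ProvenanceStep history


def with_evidence(associations, required):
    """Keep only associations supported by the required kind of evidence."""
    return [a for a in associations if required in a.evidence]


def attribution(association):
    """List agents in the order they first appear in the provenance history."""
    agents = []
    for step in association.provenance:
        if step.agent not in agents:
            agents.append(step.agent)
    return agents


# Hypothetical records, for illustration only.
curated = Association(
    subject="variant-1 (hypothetical)",
    phenotype="Thrombocytopenia (HP:0001873)",
    evidence=["experimental", "literature"],
    provenance=[
        ProvenanceStep("phenotype annotation", "curator A"),
        ProvenanceStep("data import", "pipeline B"),
        ProvenanceStep("annotation review", "curator A"),
    ],
)
predicted = Association(
    subject="variant-2 (hypothetical)",
    phenotype="Thrombocytopenia (HP:0001873)",
    evidence=["computational prediction"],
)
experimental_only = with_evidence([curated, predicted], "experimental")
credited = attribution(curated)
print([a.subject for a in experimental_only], credited)
```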
Who is missing?
http://haluzz.deviantart.com/art/Waldo-at-the-hipster-party-273602450
What about patients?
Can they help too?
HP:0000252
Pref Label: Microcephaly
Synonyms: Decreased Head Circumference;
Reduced Head Circumference; Small head
circumference
Suggested Synonyms : Small Head; Little Head;
Small Skull; Little Skull; Small Cranium…
Small head
Microcephaly
https://commons.wikimedia.org/wiki/File:Microcephaly.png#/media/File:Microcephaly.png
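Lay synonyms make it possible to map a patient's own words onto an ontology term. The sketch below builds a case-insensitive lookup from the labels and synonyms listed on the slide. The function names and the dictionary layout are assumptions, and exact matching is only the simplest possible strategy.

```python
def build_label_index(terms):
    """Map every label and synonym (case-insensitive) to its term identifier."""
    index = {}
    for term_id, record in terms.items():
        labels = [record["label"], *record["synonyms"], *record.get("suggested", [])]
        for label in labels:
            index[label.casefold()] = term_id
    return index


def lookup(text, index):
    """Normalize whitespace and case, then look the phrase up in the index."""
    return index.get(" ".join(text.split()).casefold())


# Labels and synonyms as listed on the slide for HP:0000252.
terms = {
    "HP:0000252": {
        "label": "Microcephaly",
        "synonyms": [
            "Decreased Head Circumference",
            "Reduced Head Circumference",
            "Small head circumference",
        ],
        "suggested": ["Small Head", "Little Head", "Small Skull", "Little Skull", "Small Cranium"],
    }
}
label_index = build_label_index(terms)
print(lookup("small  head", label_index), lookup("big head", label_index))
```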
Job opening
https://goo.gl/MlcnR5
Focusing on building ontologies and
semantic web technologies to
represent research, attribution,
provenance, and scholarly
communication
@ontowonka haendel@ohsu.edu
Funding: NIH Office of Director: 1R24OD011883; NIH-UDP:
HHSN268201300036C, HHSN268201400093P; NCI/Leidos #15X143,
BD2K U54HG007990-S2 (Haussler) & BD2K PA-15-144-U01 (Kesselman)
PIs: Chris Mungall, Peter Robinson, Damian Smedley, Tudor Groza, Harry Hochheiser
www.monarchinitiative.org/page/team

Envisioning a world where everyone helps solve disease

  • 1. @monarchinit @ontowonka “Not everyone can become a great artist, but a great artist can come from anywhere” Anton Ego, Ratatouille, 2007, Dixsney/Pixar Envisioning a world where everyone helps solve disease Melissa Haendel SWAT4LS 2015 Cambridge, England
  • 2. Faith-based research “I believe that my work on some obscure cell type in some obscure organism will matter to mankind one day” Well, it can, and it does.
  • 3.
  • 4. Four things it takes to solve an undiagnosed disease 1. Deep phenotyping the human organism 1. Crossing the language barrier 1. A lot of data from a lot of places 1. Very many people (who have faith)
  • 5. 1. DEEP PHENOTYPING THE HUMAN ORGANISM
  • 6. Patient Genome /Exome Filter **** ** ***** **** Genomic data Diagnosis, treatment ATCTTAGCACGTTAC ATCTTAGCACGTGAC ATCTTATCACGTTAC ATCTTAGCACGTTAC
  • 7. What do all those variations do? We only know the phenotypic consequences of mutation of <20% of the human coding genome
  • 9. We have a common language for sequence data…. ATCTTAGCACGTTAC… ….not so much for phenotypes
  • 10. CC2.0 European Southern Observatory https://www.flickr.com/photos/esoastronomy/6923443595
  • 11. Can we help machines understand phenotypes? “Palmoplantar hyperkeratosis” Human phenotype I have absolutely no idea what that means Image credits: "HandsEBS" by James Heilman, MD - Own work. Licensed under CC BY-SA 3.0 via Commons – https://commons.wikimedia.org/wiki/File:HandsEBS.JPG#/media/File:HandsEBS.JPG Marcin Wichary [CC BY 2.0 (http://creativecommons.org/licenses/by/2.0)], via Wikimedia Commons
  • 12. A disease is a collection of phenotypes Patient Disease X Differential diagnosis with similar but non-matching phenotypes is difficult Flat back of head Hypotonia Abnormal skull morphology Decreased muscle mass
  • 13. Do we *really* need yet another clinical vocabulary? Winnenburg and Bodenreider, ISMB PhenoDay, 2014 UMLS SNOMED CT CHV MedDRA MeSH NCIT ICD10-C ICD9-CM ICD-10 OMIM MedlinePlus Existing clinical vocabularies don’t adequately cover phenotype descriptions
  • 14. Disease-phenotype associations using an ontology Hyposmia Abnormality of globe location eyeball of camera-type eye sensory perception of smell Abnormal eye morphology Motor neuron atrophyDeeply set eyes motor neuronCL 34571 annotations in 22 species 157534 phenotype annotations 2150 phenotype annotations
  • 15. Once OMIM is rendered computable, are we done yet? Free text -> HPO enables phenotype semantic similarity matching
  • 16. Mendelian disease integration Merges sources together using:  equivalence and subclass axioms derived from xrefs  string matching  manual efforts to fill gaps based on phenotypes and anatomical axioms Parkinson’s disease subtypes Different colors = different disease sources https://github.com/monarch-initiative/monarch-disease-ontology
  • 17. Why we need all the organisms Model data can provide up to 80% phenotypic coverage of the human coding genome
  • 18. We learn different things from different organisms
  • 19. 2. CROSSING THE LANGUAGE BARRIER
  • 20. Ulcerated paws Palmoplantar hyperkeratosis Thick hand skin Image credits: "HandsEBS" by James Heilman, MD - Own work. Licensed under CC BY-SA 3.0 via Commons – https://commons.wikimedia.org/wiki/File:HandsEBS.JPG#/media/File:HandsEBS.JPG http://www.guinealynx.info/pododermatitis.html
  • 21. Challenge: Each database uses their own vocabulary/ontology MP HP MGI HPOA Image credits: "HandsEBS" by James Heilman, MD - Own work. Licensed under CC BY-SA 3.0 via Commons – https://commons.wikimedia.org/wiki/File:HandsEBS.JPG#/media/File:HandsEBS.JPG http://www.guinealynx.info/pododermatitis.html
  • 22. Challenge: Each database uses its own vocabulary/ontology ZFA MP DPO WPO HP OMIA VT FYPO APO SNOMED … … … WB PB FB OMIA MGI RGD ZFIN SGD HPOA IMPC OMIM ICD QTLdb EHR
  • 23. Decomposition of complex concepts allows interoperability “Palmoplantar hyperkeratosis” (human phenotype) = increased (PATO) + stratum corneum layer of skin (Uberon) + autopod (Uberon) + keratinization (GO). Species-neutral ontologies, homologous concepts. Mungall, C. J., Gkoutos, G., Smith, C., Haendel, M., Lewis, S., & Ashburner, M. (2010). Integrating phenotype ontologies across multiple species. Genome Biology, 11(1), R2. doi:10.1186/gb-2010-11-1-r2
  • 25. 3. A LOT OF DATA FROM A LOT OF PLACES
  • 26. Putting it Together: Data + Ontologies Input: diverse G2P/D source data, source ontologies. Pipeline: .ttl files, OWL Loader. Output: graph views, Monarch App (faceted browsing, phenotype matching). https://github.com/SciGraph/SciGraph
  • 27. Data Integrated in SciGraph >25 sources >100 species 51M triples 4M curated associations 2.2M G-P / G-D associations
  • 28. Genotype-phenotype integration 91% of our 2.2 million G2P associations required integrating 2 or more data sources (this number does not even include orthology (Panther)); the remaining 9% came from one source. Chart categories: one source, two sources, 3 or more.
  • 30. Combining genotype and phenotype data for variant prioritization Whole exome → Remove off-target and common variants → Variant score from allele freq and pathogenicity → Phenotype score from phenotypic similarity → PHIVE score to give final candidates. Mendelian filters. https://www.sanger.ac.uk/resources/software/exomiser/
  • 31. York platelet syndrome and STIM1 Markello T et al. Molecular Genetics and Metabolism 2015, 114: 474 Grosse J, J Clin Invest 2007 117: 3540-50 Impaired platelet aggregation (HP:0003540) Thrombocytopenia (HP:0001873) Abnormal platelet activation (MP:0006298) Thrombocytopenia (MP:0003179) UDP_2542 Stim1Sax/Sax http://www.nature.com/gim/journal/vaop/ncurrent/full/gim2015137a.html
  • 32. 4. VERY MANY PEOPLE (WHO HAVE FAITH)
  • 33. Who helped solve the STIM1 UDP_2542 case?
  • 34. Credit extends beyond the publication: Johannes creates stim1 mouse; Melissa annotates patient UDP_2542 with HPO; Will performs analysis of UDP_2542 that includes stim1 mouse to generate a dataset of prioritized variants; Tom writes publication pmid:25577287 about the STIM1 diagnosis; Tom explicitly credits Will as an author but not Melissa.
  • 35. Credit is connected Credit to Will is asserted, but credit to Melissa can be inferred
  • 40. Who is in the graph? Melissa Haendel Peter Robinson Chris Mungall Sebastian Kohler Cindy Smith Nicole Vasilevsky Sandra Dolken Johannes Grosse Attila Braun David Varga-Szabo Niklas Beyersdorf Boris Schneider Lutz Zeitlmann Petra Hanke Patricia Schropp Silke Mühlstedt Carolin Zorn Michael Huber Carolin Schmittwolf Wolfgang Jagla Philipp Yu Thomas Kerkau Harald Schulze Michael Nehls Bernhard Nieswandt Thomas Markello Dong Chen Justin Y. Kwan Iren Horkayne-Szakaly Alan Morrison Olga Simakova Irina Maric Jay Lozier Andrew R. Cullinane Tatjana Kilo Lynn Meister Kourosh Pakzad Sanjay Chainani Roxanne Fischer Camilo Toro James G. White David Adams Cornelius Boerkoel William A. Gahl Cynthia J. Tifft Meral Gunay-Aygun Melissa Haendel David Adams David Draper Bailey Gallinger Joie Davis Nicole Vasilevsky Heather Trang Rena Godfrey Gretchen Golas Catherine Groden Michele Nehrebecky Ariane Soldatos Elise Valkanas Colleen Wahl Lynne Wolfe Elizabeth Lee Amanda Links Will Bone Murat Sincan Damian Smedley Jules Jacobson Nicole Washington Elise Flynn Sebastian Kohler Orion Buske Marta Girdea Michael Brudno Jeremy Band Hans Goeble Karen Balbach Nadine Pfeifer Sandra Werner Christian Linden. Roles shown: Clinical/care, Pathology, Ontologist, CS/informatics, Curator, Basic research
  • 41. Tracking Evidence and Provenance of G2P Associations Evidence is a collection of information that is used to support a scientific claim or association. Provenance is a history of what processes led to the claim being made and what entities participated in these processes. Value of Evidence and Provenance Metadata: context to evaluate credibility/confidence; support filtering and analysis of data; detailed history for attribution.
  • 42. Evidence and Provenance for a Variant-Phenotype Association
  • 44. What about patients? Can they help too? HP:0000252 Pref Label: Microcephaly Synonyms: Decreased Head Circumference; Reduced Head Circumference; Small head circumference Suggested Synonyms: Small Head; Little Head; Small Skull; Little Skull; Small Cranium… Small head Microcephaly https://commons.wikimedia.org/wiki/File:Microcephaly.png#/media/File:Microcephaly.png
  • 45. Job opening https://goo.gl/MlcnR5 Focusing on building ontologies and semantic web technologies to represent research, attribution, provenance, and scholarly communication @ontowonka haendel@ohsu.edu
  • 46. Funding: NIH Office of Director: 1R24OD011883; NIH-UDP: HHSN268201300036C, HHSN268201400093P; NCI/Leidos #15X143, BD2K U54HG007990-S2 (Haussler) & BD2K PA-15-144-U01 (Kesselman) PIs: Chris Mungall, Peter Robinson, Damian Smedley, Tudor Groza, Harry Hochheiser www.monarchinitiative.org/page/team
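
The prioritization flow on slide 30 (filter common variants, score each variant, score phenotypic similarity, combine) can be illustrated with a small Python sketch. This is not Exomiser's actual formula: the frequency cutoff, the rarity weighting, the plain average used to combine the two scores, and the gene names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    gene: str               # hypothetical gene label
    allele_freq: float      # population allele frequency, 0..1
    pathogenicity: float    # predicted pathogenicity, 0..1
    phenotype_score: float  # phenotypic similarity to the patient, 0..1


def variant_score(c: Candidate, max_freq: float = 0.01) -> float:
    """Toy variant score: rarer and more pathogenic variants score higher."""
    if c.allele_freq > max_freq:
        return 0.0  # treated as a common variant (assumed cutoff)
    rarity = 1.0 - c.allele_freq / max_freq
    return rarity * c.pathogenicity


def combined_score(c: Candidate) -> float:
    """Assumed combination: plain average of variant and phenotype scores."""
    return (variant_score(c) + c.phenotype_score) / 2.0


def rank(candidates):
    """Return candidates ordered from most to least promising."""
    return sorted(candidates, key=combined_score, reverse=True)


candidates = [
    Candidate("GENE_A", allele_freq=0.0, pathogenicity=1.0, phenotype_score=0.9),
    Candidate("GENE_B", allele_freq=0.0, pathogenicity=0.9, phenotype_score=0.2),
    Candidate("GENE_C", allele_freq=0.05, pathogenicity=1.0, phenotype_score=0.8),
]
ranked = rank(candidates)
print([c.gene for c in ranked])
```

In this toy data, a never-before-seen, maximally pathogenic variant with a strong phenotype match ranks first, which mirrors the shape of the STIM1 case on slide 31; a pathogenic but common variant drops to the bottom despite a good phenotype match.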

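Slides 34 and 35 distinguish asserted credit from credit that can be inferred. One way to picture the inference is to walk the derivation chain behind a publication and collect everyone who produced an upstream artifact. The sketch below is a minimal illustration, not the Monarch data model: the artifact names and the `derived_from` structure are hypothetical, and only the people and the PMID come from the slides.

```python
# Who produced each artifact (artifact names are hypothetical labels).
created_by = {
    "stim1_mouse": {"Johannes"},
    "UDP_2542_HPO_annotations": {"Melissa"},
    "prioritized_variants": {"Will"},
    "pmid:25577287": {"Tom"},
}

# Which artifacts each artifact was built from (assumed structure).
derived_from = {
    "prioritized_variants": {"UDP_2542_HPO_annotations", "stim1_mouse"},
    "pmid:25577287": {"prioritized_variants"},
}

# Credit stated explicitly on the publication (authorship).
asserted_credit = {"pmid:25577287": {"Tom", "Will"}}


def upstream(artifact, seen=None):
    """Return the artifact plus everything it was transitively derived from."""
    seen = set() if seen is None else seen
    if artifact in seen:
        return seen
    seen.add(artifact)
    for parent in derived_from.get(artifact, ()):
        upstream(parent, seen)
    return seen


def inferred_credit(artifact):
    """People who contributed upstream but are not explicitly credited."""
    contributors = set()
    for item in upstream(artifact):
        contributors |= created_by.get(item, set())
    return contributors - asserted_credit.get(artifact, set())


print(sorted(inferred_credit("pmid:25577287")))
```

Under these assumptions, the walk surfaces Melissa (the HPO annotation) and Johannes (the mouse model) as contributors whose credit is inferred rather than asserted.
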
Editor's notes

  1. Not sure about origin of this image
  2. We understand the central hypothesis: DNA, RNA, protein, the building blocks. We’ve found reliable methods to describe and move genetic information around with computers. We can see and assess phenotype, but how do you describe it computationally? Massive amounts of genetic data must also be alignable with a phenotype, in a way that lets a machine reason and infer. Example: an undiagnosed genetic patient with several phenotypes (asymmetry of face, temporal bulging, café au lait spots on neck, asymmetric smile/facial animation, uneven eyes).
  3. The standard genomic paradigm, based on statistical properties such as the distribution of variations in the human genome.
  4. There is a lot we don’t know about the genome
  5. Adding phenotype
  6. Sorry Star Trek, you had to go for posting.
  7. Our approach is to try and get the machine to understand the terms so that it can assist us intelligently.
  8. Can get rid of this slide?
  9. (1) Represent the organism as a biological subject. (2) Represent diseases/genotypes as collections of nodes in the graph. (3) Interoperable with other bioinformatics resources, leveraging modern semantic standards.
  10. Yellow = Orphanet; brown = OMIM; blue = Disease Ontology; pink = Monarch/MGI disease clusters; gray = MeSH.
  11. Data from mouse, rat, zebrafish, worm, fruit fly
  12. Highlighting how we get different phenotypic information from different sources and species. Data from MGI, ZFIN, & HPO, reasoned over with cross-species phenotype ontology https://code.google.com/p/phenotype-ontologies/ The distribution of phenotype information per model genotype is different compared to human disease annotations. For mouse, there’s a much higher representation of metabolic, cardiovascular, blood, and endocrine phenotypes available to compare. For fish, there’s increased representation of nervous, skeletal, head and neck, cardiovascular, and connective tissue phenotypes. (Note that these do not include “normal” phenotypes for either diseases or genotypes.) What does it mean to replicate a phenotypic profile in a model organism? For many patients or diseases, we may need different models to fully recapitulate the disease. Further, some phenotypes are common in a given species and, if present in the patient, would be a less significant result.
  13. 2 issues: database integration, vocabulary integration
  14. Multiple databases, each with their own vocabulary; these images are of questionable licensing and origin
  15. We make things digestible. Complex concepts into simpler parts. We use ontologies that are comparative by design.
  16. We can match in “fuzzy” ways by making semantic associations and leveraging underlying logic, such as anatomy. These images are not licensed and I don’t even know where they came from.
  17. This was the novel case we solved. The UDP patient had a number of signs and symptoms including various platelet abnormalities. The same heterozygous, missense mutation was seen in 2 patients and ranked top by Exomiser. It had never been seen in any of the SNP databases and was predicted maximally pathogenic. Finally a mouse curated by MGI involving a heterozygous, missense point mutation introduced by chemical mutagenesis exhibited strikingly similar platelet abnormalities.
  18. This image is public domain https://pixabay.com/en/detective-male-man-profile-156465/
  19. Not sure what the license is on this thing.
  20. Have requested the rights for originally presented picture. Here is a similar one with the following attribution: "Microcephaly" by Unknown - (2004) Evolutionary History of a Gene Controlling Brain Size. PLoS Biol 2(5): e134. doi:10.1371/journal.pbio.0020134. Licensed under CC BY 2.5 via Commons - https://commons.wikimedia.org/wiki/File:Microcephaly.png#/media/File:Microcephaly.png
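
Notes 15 and 16 describe breaking complex phenotype terms into simpler, species-neutral parts and then matching them in "fuzzy" ways. A minimal Python sketch of that idea follows. It is a simplified stand-in, not the method used by Monarch: real semantic similarity reasons over ontology hierarchies, whereas this example only compares flat sets of components with a Jaccard index. The first decomposition follows slide 23; the other two decompositions are invented for illustration.

```python
# Each phenotype is decomposed into species-neutral building blocks
# (quality, anatomy, process). Only the first entry follows slide 23;
# the entries marked "assumed" are made up for this example.
decomposition = {
    "Palmoplantar hyperkeratosis": {
        "increased",
        "stratum corneum layer of skin",
        "autopod",
        "keratinization",
    },
    "Thick hand skin": {"increased", "skin", "autopod", "thickness"},  # assumed
    "Deeply set eyes": {"mislocalised", "eyeball of camera-type eye"},  # assumed
}


def jaccard(a, b):
    """Shared components divided by all components of either phenotype."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(query, targets):
    """Return (score, label) for the target most similar to the query."""
    scored = [(jaccard(decomposition[query], decomposition[t]), t) for t in targets]
    return max(scored)


score, label = best_match(
    "Palmoplantar hyperkeratosis", ["Thick hand skin", "Deeply set eyes"]
)
print(label, round(score, 2))
```

Even though the two skin terms share no words, they overlap on the "increased" quality and the "autopod" anatomy, so they match each other better than either matches the eye phenotype.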