The development and application of high-throughput technologies in biology lead to a rapid increase in data and knowledge and open the possibility of a paradigm shift towards the personalized treatment of disease based on an individual patient’s genetic makeup. Major challenges that biology faces today are to integrate data across different databases, domains, levels of granularity and species, and to make the information resulting from high-throughput experiments amenable to scientific analysis and the discovery of mechanisms underlying disease. In my talk, I will demonstrate how formal ontologies, combined with recent progress in automated reasoning, can be used to represent, integrate and analyze data resulting from high-throughput phenotyping experiments. I will show how an expressive formal representation of phenotype ontologies can lead to interoperability with biomedical ontologies of other domains, illustrate an ontology modularization approach that enables automated reasoning over these ontologies, and show how to integrate phenotype data across multiple species. Finally, I will demonstrate how measures of semantic similarity can be applied to analyze high-throughput phenotype data and reveal novel gene-disease associations, and discuss how an ontology-based approach to the semantic integration of data in biomedicine can facilitate translational research and personalized medicine.
Ontologies for representing, integrating and analyzing phenotypes
1. Ontologies for representing, integrating and analyzing phenotypes
Robert Hoehndorf
Department of Genetics
University of Cambridge
21 June 2011
Robert Hoehndorf (University of Cambridge) Phenotype ontologies 21 June 2011 1 / 40
2. Introduction Motivation
Motivation
4. Introduction Ontology
Open Biomedical Ontologies (OBO)
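[Figure: overview of OBO ontologies. Top-level categories under Individual: Physical object, Quality, Function, Process. Levels of granularity: Molecule, Gene, Transcript, Organelle, Cell, Tissue, Organ, Body, Population. Ontologies shown: ChEBI Ontology, Sequence Ontology, GO-CC, Celltype, Gene Ontology, Phenotype Ontology, Anatomy Ontology.]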
5. Introduction Ontology
Ontology
Phenotype and anatomy ontologies
anatomy ontologies: > 100,000 classes
FMA, MA, WA, ZFA, FA, GO-CC, ...
phenotype ontologies: > 20,000 classes
HPO, MP, WBPhenotype, FBcv, APO, ...
quality ontology: > 2,000 classes
PATO
process and function ontologies: > 25,000 classes
Gene Ontology, ...
alignments between anatomy ontologies
UBERON, various mappings
6. Introduction Ontology
Ontology
Challenges for interoperability
“merely using ontologies [...] does not reduce heterogeneity: it just raises heterogeneity problems to a higher level” [Euzenat, 2007]
implicit knowledge
implicit semantics
weakly formalized
very large
7. Introduction Ontology
Ontology
Example query
Find all regions in the human and mouse genome sequences that are associated with Tetralogy of Fallot.
8. Phenotype ontology Tetralogy of Fallot
Tetralogy of Fallot
9. Phenotype ontology Tetralogy of Fallot
Tetralogy of Fallot
Human phenotypes
Overriding aorta (HP:0002623)
Ventricular septal defect (HP:0001629)
Pulmonic stenosis (HP:0001642)
Right ventricular hypertrophy (HP:0001667)
11. Phenotype ontology Tetralogy of Fallot
Tetralogy of Fallot
Phenotype description syntax
Overriding aorta (HP:0002623):
Q: overlap with (PATO:0001590)
E1: Aorta (FMA:3734)
E2: Membranous part of interventricular septum (FMA:7135)
HP:0002623 EquivalentTo:
phene-of some (has-part some (FMA:3734 and
has-quality some (PATO:0001590 and towards some
FMA:7135)))
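The step from the Q/E1/E2 description to the OWL class expression follows a fixed pattern. A minimal Python sketch of that pattern, assuming the template shown on this slide; the function name `eq_to_manchester` and its parameter names are illustrative and do not come from any ontology toolkit:

```python
from typing import Optional


def eq_to_manchester(quality: str, entity: str,
                     related_entity: Optional[str] = None) -> str:
    """Render an entity-quality (EQ) description as a Manchester-syntax string.

    Hypothetical helper: it only fills in the template from the slide and
    performs no validation of the identifiers.
    """
    quality_expr = quality
    if related_entity is not None:
        # Relational qualities such as 'overlap with' point at a second entity.
        quality_expr = f"({quality} and towards some {related_entity})"
    return (f"phene-of some (has-part some "
            f"({entity} and has-quality some {quality_expr}))")


# Overriding aorta: Q = PATO:0001590, E1 = FMA:3734, E2 = FMA:7135
overriding_aorta = eq_to_manchester("PATO:0001590", "FMA:3734", "FMA:7135")
print("HP:0002623 EquivalentTo:", overriding_aorta)
```

Generating the definitions from one template keeps them structurally identical across species-specific ontologies, which is what the later equivalence inference relies on.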
12. Phenotype ontology Tetralogy of Fallot
Tetralogy of Fallot
Phenotype description syntax
Overriding aorta (HP:0002623):
Q: overlap with (PATO:0001590)
E1: Aorta (FMA:3734)
E2: Membranous part of interventricular septum (FMA:7135)
HP:0002623 EquivalentTo:
phene-of some (has-part some (FMA:3734 and
overlaps-with some FMA:7135))
13. Phenotype ontology Tetralogy of Fallot
Tetralogy of Fallot
UBERON human-mouse anatomy equivalences
Overriding aorta (HP:0002623):
Q: overlap with (PATO:0001590)
E1: Aorta (FMA:3734)
FMA:3734 EquivalentTo: MA:0000062
E2: Membranous part of interventricular septum (FMA:7135)
FMA:7135 EquivalentTo: MA:0002939
15. Phenotype ontology Tetralogy of Fallot
Tetralogy of Fallot
Phenotype equivalence
Overriding aorta (MP:0000273):
Q: overlap with (PATO:0001590)
E1: Aorta (MA:0000062)
E2: Membranous interventricular septum (MA:0002939)
MP:0000273 EquivalentTo:
phene-of some (has-part some (MA:0000062 and
has-quality some (PATO:0001590 and towards some
MA:0002939)))
Consequence: MP:0000273 EquivalentTo: HP:0002623
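Why the equivalence follows can be illustrated without a reasoner: once the UBERON anatomy equivalences are substituted, the human and mouse definitions become the same expression. The Python sketch below is only a syntactic approximation of that argument (an OWL reasoner establishes the equivalence semantically); the function names are illustrative assumptions:

```python
import re
from collections import defaultdict

# Anatomy equivalences from the UBERON slide (mouse MA -> human FMA).
ANATOMY_EQUIV = {"MA:0000062": "FMA:3734", "MA:0002939": "FMA:7135"}

# Class definitions as given on the slides.
DEFINITIONS = {
    "HP:0002623": "phene-of some (has-part some (FMA:3734 and "
                  "has-quality some (PATO:0001590 and towards some FMA:7135)))",
    "MP:0000273": "phene-of some (has-part some (MA:0000062 and "
                  "has-quality some (PATO:0001590 and towards some MA:0002939)))",
}


def normalise(expr, equiv):
    """Replace every identifier that has a known equivalent by that equivalent."""
    return re.sub(r"[A-Z]+:\d+", lambda m: equiv.get(m.group(0), m.group(0)), expr)


def equivalent_groups(definitions, equiv):
    """Group classes whose definitions coincide after substitution."""
    groups = defaultdict(list)
    for cls, expr in definitions.items():
        groups[normalise(expr, equiv)].append(cls)
    return [sorted(members) for members in groups.values() if len(members) > 1]


groups = equivalent_groups(DEFINITIONS, ANATOMY_EQUIV)
print(groups)
```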
19. Phenotype ontology Absence
Absence
Absent appendix
Absent appendix:
Q: lacks all parts of type (PATO:0002000)
E1: Human body (FMA:20394)
E2: Appendix (FMA:14542)
AbsentAppendix ≡ LacksParts ⊓ ∃towards.Appendix ⊓ ∃inheresIn.HumanBody (Horrocks, 2007)
AbsentAppendix ≡ LacksParts ⊓ ∃towards.{Appendix} ⊓ ∃inheresIn.HumanBody (Mungall, 2007)
AbsentAppendix ≡ ∃pheneOf.(HumanBody ⊓ ¬∃hasPart.Appendix) (Hoehndorf et al., 2007, 2011)
21. Phenotype ontology Absence
Absence
Absent appendix
Removal of conflicting axioms (has-part/part-of in anatomy)
Contextualize anatomy:
Normal ⊓ HumanBody ⊑ ∃hasPart.(Normal ⊓ Appendix)
Use of non-monotonic reasoning:
Normally: HumanBody ⊑ ∃hasPart.Appendix
Circumscription of ¬Normal
Implementation in dlvhex
IC-has-part(X,Y) :- ind(X), class(Y), inst(X,Z),
    CC-normally-has-part(Z,Y), not IC-lacks-has-part(X,Y),
    class(Z).
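The default rule above reads: an individual has a part of type Y if it is an instance of a class that normally has such a part, unless the individual is known to lack it. A minimal Python sketch of that reading, using plain sets instead of dlvhex; the individuals `patient1` and `patient2` and the function name are made up for illustration:

```python
def infer_has_part(instance_of, normally_has_part, lacks_part):
    """Apply the default rule once and return the derived (individual, part) pairs.

    instance_of:       set of (individual, class) pairs
    normally_has_part: set of (class, part class) pairs
    lacks_part:        set of (individual, part class) exceptions
    """
    inferred = set()
    for individual, cls in instance_of:
        for owner, part in normally_has_part:
            # 'not IC-lacks-has-part(X,Y)': the default is blocked by an exception.
            if owner == cls and (individual, part) not in lacks_part:
                inferred.add((individual, part))
    return inferred


# Toy data: two hypothetical individuals, one with an absent appendix.
instance_of = {("patient1", "HumanBody"), ("patient2", "HumanBody")}
normally_has_part = {("HumanBody", "Appendix")}
lacks_part = {("patient2", "Appendix")}

derived = infer_has_part(instance_of, normally_has_part, lacks_part)
print(derived)
```

Adding an exception to `lacks_part` withdraws a conclusion that was previously derived, which is the non-monotonic behaviour that monotonic OWL reasoning cannot express.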
22. Phenotype ontology Absence
Ontology of phenotypes
Different formal expressions for phenotypes based on
qualities,
anatomical parts,
functions,
processes
enable cross-species integration of phenotypes.
23. Phenotype ontology Discovering mouse models
Tetralogy of Fallot
24. Phenotype ontology Discovering mouse models
Phenotype alignments
Mouse model: Phc1
25. Phenotype ontology Discovering mouse models
Phenotype alignments
Tetralogy of Fallot: Phc1
26. Knowledge representation Modularization
Complexity of automated reasoning
ontologies based on OWL
OWL 2 is based on description logic (SROIQ)
satisfiability in SROIQ is 2NEXPTIME-complete
29. Knowledge representation Modularization
Modularization
tractable subsets of OWL 2: EL, QL, RL
problem: identify a large (EL, QL, RL)-module of an OWL ontology
AbnormalityOfAppendix ≡ ∃pheneOf.(¬∃hasPart.(Normal ⊓ Appendix)) (∉ EL)
AbsentAppendix ≡ ∃pheneOf.(¬∃hasPart.Appendix) (∉ EL)
Inference: AbsentAppendix ⊑ AbnormalityOfAppendix (∈ EL)
30. Knowledge representation Modularization
Modularization
EL Vira
http://el-vira.googlecode.com
ontology modularization
retain signature of ontology
identify EL, QL, RL axioms in deductive closure
completeness is an open problem
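The filtering step can be sketched as a syntactic check over axioms that a reasoner has already inferred. The Python sketch below is a simplification under stated assumptions: class expressions are encoded as nested tuples, only conjunction and existential restriction are treated as EL constructors (the full OWL 2 EL profile permits more), and the function names are illustrative rather than EL Vira's actual interface:

```python
# Class expressions: a string is a named class; tuples are
# ("and", C, D), ("some", role, C) or ("not", C).
EL_CONSTRUCTORS = {"and", "some"}


def in_el(expr):
    """Return True if the expression uses only the EL constructors above."""
    if isinstance(expr, str):
        return True
    op, *args = expr
    if op not in EL_CONSTRUCTORS:
        return False
    if op == "some":
        return in_el(args[1])  # args[0] is the role name
    return all(in_el(arg) for arg in args)


def el_module(axioms):
    """Keep the axioms (kind, lhs, rhs) whose two sides are both in EL."""
    return [ax for ax in axioms if in_el(ax[1]) and in_el(ax[2])]


axioms = [
    ("equiv", "AbnormalityOfAppendix",
     ("some", "pheneOf", ("not", ("some", "hasPart", ("and", "Normal", "Appendix"))))),
    ("equiv", "AbsentAppendix",
     ("some", "pheneOf", ("not", ("some", "hasPart", "Appendix")))),
    # Assumed to have been inferred beforehand by a full OWL reasoner.
    ("sub", "AbsentAppendix", "AbnormalityOfAppendix"),
]

module = el_module(axioms)
print(module)
```

The two definitions are dropped because they use negation, while the inferred subclass axiom between named classes survives, so the signature is retained even though the defining axioms are not.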
31. Knowledge representation Modularization
Modularization
EL Module
AbnormalityOfAppendix ≡ ∃pheneOf.(¬∃hasPart.(Normal ⊓ Appendix))
AbsentAppendix ≡ ∃pheneOf.(¬∃hasPart.Appendix)
AbsentAppendix ⊑ AbnormalityOfAppendix
Hoehndorf et al., 2011. A common layer of interoperability for biomedical ontologies based on OWL EL. Bioinformatics, 27(7), 1001–1008.
32. Knowledge representation Applications and evaluation
Phenotype alignments
PhenomeBLAST
apply to yeast, fly, worm, fish, mouse and human phenotypes
phenotype alignment through OWL reasoning
more than 300,000 classes and 1,000,000 axioms
combination of HermiT (for modularization), CB and CEL reasoner
classification time: 7 minutes
http://phenomeblast.googlecode.org
33. Knowledge representation Applications and evaluation
Phenotype alignments
PhenomeBLAST
35. Knowledge representation Applications and evaluation
Application
Comparison of phenotypes
direct comparison of phenotypes:
disease phenotypes, e.g., tetralogy of Fallot
phenotypes associated with genetic mutations (genotypes in mouse, fish, etc.)
36. Knowledge representation Applications and evaluation
Application
Comparison of phenotypes
a mutation phenotype being a subclass of a disease phenotype allows inference of a gene-disease association if
disease phenotypes are sufficient for having the disease
mutation phenotypes are necessary for having a specific genotype
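The subclass test behind this inference can be sketched over a classified phenotype hierarchy. The Python example below is a toy illustration: the class, gene and disease names are placeholders rather than data from the talk, and the hierarchy is assumed to have been computed by a reasoner beforehand:

```python
def ancestors(cls, parents):
    """Return all strict superclasses of cls in the 'parents' mapping."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def candidate_associations(genotype_phenotype, disease_phenotype, parents):
    """Pair a gene with a disease when its phenotype is subsumed by the disease phenotype."""
    hits = []
    for gene, gene_pheno in genotype_phenotype.items():
        for disease, disease_pheno in disease_phenotype.items():
            if gene_pheno == disease_pheno or disease_pheno in ancestors(gene_pheno, parents):
                hits.append((gene, disease))
    return sorted(hits)


# Placeholder hierarchy: child class -> list of direct superclasses.
parents = {
    "GenotypePhenotypeA": ["DiseasePhenotypeX"],
    "GenotypePhenotypeB": ["OtherPhenotype"],
}
genotype_phenotype = {"geneA": "GenotypePhenotypeA", "geneB": "GenotypePhenotypeB"}
disease_phenotype = {"diseaseX": "DiseasePhenotypeX"}

candidates = candidate_associations(genotype_phenotype, disease_phenotype, parents)
print(candidates)
```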
37. Knowledge representation Applications and evaluation
Application
Similarity-based comparison
pairwise comparison of phenotypes
semantic similarity: weighted Jaccard index
result: similarity matrix between phenotypes
(quantitative) evaluation based on predicting orthology, pathway, disease
identify novel gene-disease associations
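A weighted Jaccard index divides the total weight of the phenotype classes two entities share by the total weight of all classes annotated to either. The slide does not state how the weights are chosen, so the Python sketch below takes them as an input; weights based on information content are one common option and are assumed here only for illustration:

```python
def weighted_jaccard(a, b, weight):
    """Weighted Jaccard index between two sets of phenotype classes.

    a, b:   sets of class identifiers (ideally closed under superclasses)
    weight: mapping from class identifier to a non-negative weight
    """
    union = a | b
    total = sum(weight[c] for c in union)
    if total == 0:
        return 0.0
    shared = sum(weight[c] for c in a & b)
    return shared / total


# Toy annotation sets with made-up identifiers and weights.
weights = {"P1": 1.0, "P2": 2.0, "P3": 1.0}
disease = {"P1", "P2"}
mouse_model = {"P2", "P3"}

similarity = weighted_jaccard(disease, mouse_model, weights)
print(similarity)  # 2.0 / 4.0 = 0.5
```

Computing this score for every pair of annotated entities yields the similarity matrix mentioned above.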
39. Knowledge representation Applications and evaluation
Application
Similarity-based comparison: gene-disease associations
Adam19 and Fgf15 genes in mice may be involved in Tetralogy of Fallot
Aberrant pathways
Cytokine-cytokine receptor interaction pathway (ko04060) is significantly correlated with Tetralogy of Fallot (p = 5 · 10⁻⁷, Wilcoxon signed-rank test)
Gene-disease associations for orphan diseases
Slc34a1 (MGI:1345284) and Fanconi renotubular syndrome 1 (OMIM:134600)
40. Knowledge representation Applications and evaluation
Application
PhenomeBrowser
41. Conclusions
Summary
Aspects of ontology-based information systems in biology
knowledge representation language
  expressiveness
  non-monotonicity
  complexity of inferences
ontological decisions
  anatomy (parthood, connectedness)
  physiology (function)
  pathology, disease (normality, abnormality)
statistical/similarity-based framework
  semantic similarity
  account for incomplete information
  account for noisy data
42. Conclusions
Challenges and future research
Knowledge representation
establish reasoning infrastructure (OWLlink, ...)
improve reasoning performance (OWL profiles, modularity, approximate reasoning)
OWL reasoning with prototypes, non-monotonic reasoning, abduction
explore alternatives to OWL
43. Conclusions
Challenges and future research
Ontology
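[Figure: overview of OBO ontologies. Top-level categories under Individual: Physical object, Quality, Function, Process. Levels of granularity: Molecule, Gene, Transcript, Organelle, Cell, Tissue, Organ, Body, Population. Ontologies shown: ChEBI Ontology, Sequence Ontology, GO-CC, Celltype, Gene Ontology, Phenotype Ontology, Anatomy Ontology.]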
44. Conclusions
Challenges and future research
Biology
add phenotype information
20,000 knockout mice
dog, rat, slime mold, ...
define disease phenotypes
extension to other domains
functional genomics
pharmacology, drug discovery
systems biology
clinical research, decision support
quantifiable evaluation
45. Conclusions
Acknowledgements
George Gkoutos
Heinrich Herre
Janet Kelso
Michel Dumontier
Dietrich Rebholz-Schuhmann
Nico Adams
Dan Cook
Bernard de Bono
John Gennari
Pierre Grenon
Pascal Hitzler
Frank Loebe
Anika Oellrich
Kay Pruefer
Paul Schofield
Stefan Schulz
Robert Stevens
Sarala Wimalaratne
...
46. Conclusions
Thank you!