Semantic tools for
aggregation of morphological
characters across studies
James Balhoff, Alex Dececchi, Paula Mabee,
Hilmar Lapp, & Phenoscape team
Rich body of morphological
observations – mostly locked up

Excerpts from Harris et al. 2008 (running head "Zebrafish Model of Human Ectodermal Dysplasia", doi:10.1371/journal.pgen.1000206), shown as examples of free-text phenotype descriptions:

Figure caption: "Figure 2. The dominant gene Nkt is phenotypically similar, however complements fls mutants. Nkt homozygotes show complete loss of scales, teeth and gill rakers resembling the fls phenotype (A–C). Heterozygous Nkt zebrafish show an intermediate phenotype of scale loss and patterning defect (arrows) while no effect on fin development is seen (D). Heterozygous Nkt also show a dominant effect on the number of teeth (arrows, E) and gill rakers (F), showing deficiencies along the posterior branchial arches and formation of rudimentary rakers along ceratobranchial 1 and 2 (arrows, F). Cb1-5, ceratobranchial bones." (doi:10.1371/journal.pgen.1000206.g002)

Table caption: "Table 1. Quantitative effect of fls on scale number and shape and the effect of background modifiers in Danio rerio strains on flsdt3Tpl."

Body text fragment: "... and a cytoplasmic terminal death domain essential for protein interactions with signaling adaptor complexes. The flste370f mutation is an A to T transversion at a splice acceptor site, ..."
Free text is a barrier to machine-based integration
Phylogenetic systematics: Lundberg & Akama 2005

Human genetics: OMIM queries (http://www.ncbi.nlm.nih.gov/omim)

OMIM query: # of records
“large bone”: 1083
“enlarged bone”: 224
“big bones”: 21
“huge bones”: 4
“massive bones”: 41
“hyperplastic bones”: 12
“hyperplastic bone”: 45
“bone hyperplasia”: 181
“increased bone growth”: 879
Integration is
key for
knowledge
synthesis

The Tree of Life and a New Classification of Bony Fishes
—Betancur-R. et al. 2013. PLoS Currents Tree of Life
Integration is key for discovery
Phenoscape: making evolutionary
morphology computable

+
Comparative studies

Model organism datasets

= Phenoscape Knowledgebase
How it works: shared ontologies,
rich semantics, OWL reasoning
Phenoscape KB content
16,000 character states from >120 comparative
morphological datasets, linked to 4,000 vertebrate
taxa.
Imported genetic phenotype and expression data
from ZFIN, Xenbase, MGI, and Human Phenotype
project.
Shared semantics: Uberon (anatomy), PATO
(phenotypic qualities), Entity–Quality (EQ) OWL
axioms (phenotype observations)
Plus a dozen other ontologies ...
Integrative querying with the
Phenoscape KB: scale, absent
Ictalurus punctatus

eda gene in Danio rerio

“body: naked”—Kailola, P. J. 2004. A
phylogenetic exploration of the catfish family
Ariidae (Otophysi; Siluriformes). The Beagle,
Records of the Museums and Art Galleries of the
Northern Territory 20:87-166

edadt3S243X/dt3S243X: Harris, M.P., Rohner, N.,
Schwarz, H., Perathoner, S., Konstantinidis, P.,
and Nüsslein-Volhard, C. 2008. Zebrafish eda
and edar mutants reveal conserved and
ancestral roles of ectodysplasin signaling in
vertebrates. PLoS Genetics 4(10):e1000206.
Integrating phylogenetic studies
Can we use reasoning to integrate character
matrices across studies?
Would make the wealth of single-study character
analysis methods applicable to any integrated matrix,
including tree-based comparative phylogenetic
methods.
Evolution of Sarcopterygian Limb/Fin
Combined matrix of any character states related to
presence/absence of limb/fin structures from
studies in Phenoscape KB

Clack, J. A. (2009). The Fin to Limb Transition: New Data, Interpretations, and Hypotheses from Paleontology and Developmental Biology. Annual
Review of Earth and Planetary Sciences, 37(1), 163-179
EQ supermatrix synthesis:
workflow
1. Use OWL reasoner to group character states by
anatomy and quality axes, based on EQ annotations.
2. Export groupings as character matrix, with taxon
assignments to states from original data.
3. Supplement presence/absence character state
assertions with reasoner-inferred information.
4. Use Phenex data editor to manually consolidate
character states where appropriate
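
The first two workflow steps can be sketched roughly as follows. This is a minimal illustration in Python, not the Phenoscape implementation (which the slides describe as built on the OWL API and ELK). The reasoner's grouping is stood in for by an (entity, attribute) key already attached to each state, and every name and data value below is hypothetical.

```python
from collections import defaultdict

# Hypothetical annotated character states. "entity" and "attribute" stand in
# for the anatomy and quality axes a reasoner would classify each state under;
# "taxa" are the taxon assignments from the original study.
states = [
    {"study": "StudyA", "entity": "humerus", "attribute": "presence",
     "value": "present", "taxa": ["Taxon1", "Taxon2"]},
    {"study": "StudyB", "entity": "humerus", "attribute": "presence",
     "value": "absent", "taxa": ["Taxon3"]},
    {"study": "StudyB", "entity": "humerus", "attribute": "shape",
     "value": "slender", "taxa": ["Taxon1"]},
]

def build_matrix(states):
    """Group states into characters keyed by (entity, attribute) and
    return {character: {taxon: set of state values}}."""
    matrix = defaultdict(lambda: defaultdict(set))
    for state in states:
        character = (state["entity"], state["attribute"])
        for taxon in state["taxa"]:
            matrix[character][taxon].add(state["value"])
    return matrix

matrix = build_matrix(states)
for character, cells in sorted(matrix.items()):
    print(character, {taxon: sorted(v) for taxon, v in sorted(cells.items())})
```

Storing a set per cell keeps conflicting assignments from different studies visible instead of silently overwriting one with the other.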
EQ supermatrix synthesis:
Results
Synthesized limb/fin character matrix
1055 Sarcopterygian taxa
494 characters
2-7 states per character
from 55 original studies
Developed several tools for automated character
matrix synthesis to make this happen.
Technology stack
Ontologies and phenotype observation data in
OWL
ELK, an OWL-EL reasoner
OWL-DL reasoners are too slow for this
OWL API (Java), programmed primarily using
Scala
Bigdata™ RDF triplestore (~ 25 million triples)
Using reasoning to group
character states
For every pair of anatomical term X and quality
attribute Y, generate a “character expression” OWL
class: (involves some X and involves some Y)
Done programmatically via property chain axioms
and OWL reasoning (ELK)
Classify character states to most relevant character
expression
Done by OWL reasoner (ELK)
Inferred relationships materialized to triple store
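
As a rough sketch of the first step, the snippet below enumerates one "character expression" per (anatomical term, quality attribute) pair and renders it as a Manchester-style string. The term labels are hypothetical placeholders (real terms would be Uberon and PATO identifiers), and the classification of character states under these expressions, which the slides attribute to ELK, is not shown.

```python
from itertools import product

def character_expression(entity, attribute):
    """Render '(involves some X) and (involves some Y)' for one pair."""
    return f"(involves some {entity}) and (involves some {attribute})"

# Hypothetical labels used for illustration only.
anatomy_terms = ["humerus", "pelvic_fin"]
quality_attributes = ["shape", "count"]

# One character expression per (anatomy term, quality attribute) pair.
expressions = {
    (x, y): character_expression(x, y)
    for x, y in product(anatomy_terms, quality_attributes)
}

for pair, expression in expressions.items():
    print(pair, "->", expression)
```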
Challenge: scalable reasoning
Anatomy ontologies and EQ annotation employ
rich OWL semantics → best used with a DL reasoner
Classifying and querying over large dataset (~25
million RDF triples) does not scale well
Presently, the only feasible OWL reasoner is ELK
constrained to OWL EL profile → limits kinds of
expressions we use
best performance over class axioms only →
data must be modeled so as to avoid need for
classifying instances
Challenge: Querying complex
expressions
Want to allow arbitrary selection of structures of
interest, using rich semantics:
(part_of some (limb/fin or girdle skeleton)) or
(connected_to some girdle skeleton)
RDF triplestores provide very limited reasoning
expressivity, and scale poorly with large ontologies.
However, ELK can answer class expression queries
within seconds.
Instead of something like this (*):
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ao: <http://purl.obolibrary.org/obo/my-anatomy-ontology/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT DISTINCT ?gene
WHERE {
  ?gene ao:expressed_in ?structure .
  ?structure rdf:type ?structure_class .
  # Triple patterns selecting the structure:
  ?structure_class rdfs:subClassOf ao:muscle .
  ?structure_class rdfs:subClassOf ?restriction .
  ?restriction owl:onProperty ao:part_of .
  ?restriction owl:someValuesFrom ao:head .
}

We would really like to do this:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ao: <http://purl.obolibrary.org/obo/my-anatomy-ontology/>
PREFIX ow: <http://purl.org/phenoscape/owlet/syntax#>
SELECT DISTINCT ?gene
WHERE {
  ?gene ao:expressed_in ?structure .
  ?structure rdf:type ?structure_class .
  # Triple pattern containing an OWL expression:
  ?structure_class rdfs:subClassOf "ao:muscle and (ao:part_of some ao:head)"^^ow:omn .
}
owlet: SPARQL query expansion
with in-memory OWL reasoner
owlet interprets OWL class expressions
embedded within SPARQL queries
Uses any OWL API-based reasoner to preprocess
the query.
We use ELK, which holds the terminology in memory.
Replaces each OWL expression with a FILTER statement
listing the matching terms
https://github.com/phenoscape/owlet
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ao: <http://purl.obolibrary.org/obo/my-anatomy-ontology/>
PREFIX ow: <http://purl.org/phenoscape/owlet/syntax#>
SELECT DISTINCT ?gene
WHERE {
  ?gene ao:expressed_in ?structure .
  ?structure rdf:type ?structure_class .
  # Triple pattern containing an OWL expression:
  ?structure_class rdfs:subClassOf "ao:muscle and (ao:part_of some ao:head)"^^ow:omn .
}

➡︎ owlet ➡︎

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ao: <http://purl.obolibrary.org/obo/my-anatomy-ontology/>
PREFIX ow: <http://purl.org/phenoscape/owlet/syntax#>
SELECT DISTINCT ?gene
WHERE {
  ?gene ao:expressed_in ?structure .
  ?structure rdf:type ?structure_class .
  # Filter constraining ?structure_class to the terms returned by the OWL query:
  FILTER(?structure_class IN (ao:adductor_mandibulae, ao:constrictor_dorsalis, ...))
}
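
The rewriting step can be illustrated with a small Python sketch. It is not owlet's code or API: the regular expression, the `expand` function, and the `ANSWERS` lookup table (which plays the role of the in-memory reasoner) are all assumptions made for illustration.

```python
import re

# Matches a triple pattern whose object is an owlet-style typed literal, e.g.
#   ?c rdfs:subClassOf "ao:muscle and (ao:part_of some ao:head)"^^ow:omn .
PATTERN = re.compile(
    r'(?P<var>\?\w+)\s+rdfs:subClassOf\s+"(?P<expr>[^"]+)"\^\^ow:omn\s*\.'
)

def expand(query, subclasses_of):
    """Replace each embedded class expression with a FILTER listing the
    terms returned by subclasses_of(expr), a stand-in for a reasoner query."""
    def to_filter(match):
        terms = ", ".join(subclasses_of(match.group("expr")))
        return f"FILTER({match.group('var')} IN ({terms}))"
    return PATTERN.sub(to_filter, query)

# Hypothetical lookup table standing in for the reasoner's answers.
ANSWERS = {
    "ao:muscle and (ao:part_of some ao:head)":
        ["ao:adductor_mandibulae", "ao:constrictor_dorsalis"],
}

query = '''SELECT DISTINCT ?gene WHERE {
  ?gene ao:expressed_in ?structure .
  ?structure rdf:type ?structure_class .
  ?structure_class rdfs:subClassOf "ao:muscle and (ao:part_of some ao:head)"^^ow:omn .
}'''

expanded = expand(query, lambda expr: ANSWERS[expr])
print(expanded)
```

Because the triplestore only ever sees a plain FILTER over named terms, it needs no OWL reasoning support of its own.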
Inferring presence/absence
Character states often do not directly assert
presence or absence, but imply it.
Most phenotypic descriptions of some feature of a
structure imply its presence or absence:
“Humerus slender and elongate: with length more than three
times the diameter of its distal end” → humerus must be
present

Partonomy axioms in the ontology allow inferring
presence or absence:
‘all humerus part_of some forelimb’ → forelimb must be
present if humerus is; humerus must be absent if forelimb is
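
A minimal sketch of this propagation, assuming a toy partonomy held in a dictionary rather than an ontology and a reasoner. The `PART_OF` table and function names are hypothetical. Presence is pushed up to wholes and absence down to parts.

```python
# Hypothetical partonomy: each structure maps to the structure it is part_of.
PART_OF = {"humerus": "forelimb", "radius": "forelimb"}

def wholes(structure):
    """Yield every structure the given structure is (transitively) part of."""
    while structure in PART_OF:
        structure = PART_OF[structure]
        yield structure

def parts(structure):
    """Yield every structure that is (transitively) part of the given one."""
    for part, whole in PART_OF.items():
        if whole == structure:
            yield part
            yield from parts(part)

def infer(asserted):
    """Extend {structure: 'present' | 'absent'} with implied values.
    Asserted values are never overwritten by inferred ones."""
    inferred = dict(asserted)
    for structure, value in asserted.items():
        implied = wholes(structure) if value == "present" else parts(structure)
        for other in implied:
            inferred.setdefault(other, value)
    return inferred

print(infer({"humerus": "present"}))
print(infer({"forelimb": "absent"}))
```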
Challenge: absence reasoning
with OWL EL
Absence is typically modeled using negation
→ not (has_part some forelimb)
Negation is not part of OWL EL (and thus not
supported by the ELK reasoner)
Solution: programmatic assertion of an “absence
hierarchy” via classification of negated expressions

Presence classes, from most specific to most general:
A = has_part some forelimb
B = has_part some limb
C = has_part some appendage

Absence classes (reverse order), from most specific to most general:
absentC = not C
absentB = not B
absentA = not A

Requires precomputation, constraints for on-the-fly use
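
The reversal relies on contraposition: if A is a subclass of B, then "not B" is a subclass of "not A". A minimal sketch, assuming the presence hierarchy has already been computed by a reasoner and is available as (child, parent) pairs; the edge list and function name are hypothetical.

```python
# Hypothetical inferred subclass edges among presence classes, as
# (child, parent) pairs such as an EL reasoner could return.
presence_edges = [
    ("has_part some forelimb", "has_part some limb"),
    ("has_part some limb", "has_part some appendage"),
]

def absence_edges(edges):
    """For each 'A SubClassOf B', emit 'not (B) SubClassOf not (A)'."""
    return [(f"not ({parent})", f"not ({child})") for child, parent in edges]

for child, parent in absence_edges(presence_edges):
    print(f"{child} SubClassOf {parent}")
```

The resulting edges would then be asserted back into the knowledgebase as ordinary named-class axioms, which stay within what an EL reasoner can handle.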
Challenge: Character state consolidation
Reduced 1-297 states per
character to 2-7.
Result: Reasoning fills in many
missing character states
asserted presence/absence

with inference

Mesquite “birds-eye view”
Unified matrix enables candidate gene view
Linking evolutionary phenotypes to genes through
ontologies, via Phenoscape KB or similarity
Integrated data highlight
conflict and gaps
Conflicting interpretations in studies
supinator process of humerus: both absent &
present in Strepsodus (Zhu et al. 1999 vs.
Ruta 2011)

figure from Parker et al., 2005

Gaps in knowledge
acetabulum present or absent?

Acetabulum of pelvic
girdle: present/absent

Same term, different meaning?
Acanthostega— “radials, jointed” (Swartz
2012)
but doesn’t have radials...
Uneven taxon sampling
http://characterdesignnotes.blogspot.com/2011/04/proper-use-of-reference-and-anatomy-in.html
Phenoscape software
https://github.com/phenoscape
owlet (SPARQL processor), Phenex (semantic
data editor), phenoscape-owl-tools (KB build),
others
http://phenoscape.org/wiki/Software
Phenoscape project team
National Evolutionary Synthesis Center (NESCent): Todd Vision (also University of North Carolina at Chapel Hill), Hilmar Lapp, Jim Balhoff, Prashanti Manda
University of South Dakota: Paula Mabee, Alex Dececchi, Wasila Dahdul
University of Oregon (Zebrafish Information Network): Monte Westerfield, Ceri Van Slyke, Yvonne Bradford
Cincinnati Children's Hospital (Xenbase): Christina James-Zorn, Aaron Zorn, Virgilio Ponferrada
California Academy of Sciences: David Blackburn
University of Chicago: Paul Sereno, Nizar Ibrahim
Mouse Genome Informatics: Terry Hayamizu, Judith Blake
University of Arizona: Hong Cui
Oregon Health & Science University: Melissa Haendel
Lawrence Berkeley National Labs: Chris Mungall

Más contenido relacionado

Similar a Semantic tools for aggregation of morphological characters across studies

Towards ubiquitous OWL computing: Simplifying programmatic authoring of and q...
Towards ubiquitous OWL computing: Simplifying programmatic authoring of and q...Towards ubiquitous OWL computing: Simplifying programmatic authoring of and q...
Towards ubiquitous OWL computing: Simplifying programmatic authoring of and q...Hilmar Lapp
 
GraphConnect Europe 2016 - Building a Repository of Biomedical Ontologies wit...
GraphConnect Europe 2016 - Building a Repository of Biomedical Ontologies wit...GraphConnect Europe 2016 - Building a Repository of Biomedical Ontologies wit...
GraphConnect Europe 2016 - Building a Repository of Biomedical Ontologies wit...Neo4j
 
Getting Started with the Hymenoptera Anatomical Ontology
Getting Started with the Hymenoptera Anatomical OntologyGetting Started with the Hymenoptera Anatomical Ontology
Getting Started with the Hymenoptera Anatomical OntologyKatja C. Seltmann
 
Ontology-based data access and semantic mining with Aber-OWL
Ontology-based data access and semantic mining with Aber-OWLOntology-based data access and semantic mining with Aber-OWL
Ontology-based data access and semantic mining with Aber-OWLRobert Hoehndorf
 
Tutorial OWL and drug discovery ICBO 2013
Tutorial OWL and drug discovery ICBO 2013Tutorial OWL and drug discovery ICBO 2013
Tutorial OWL and drug discovery ICBO 2013Samuel Croset
 
Franz 2014 ESA Aligning Insect Phylogenies Perelleschus and Other Cases
Franz 2014 ESA Aligning Insect Phylogenies Perelleschus and Other CasesFranz 2014 ESA Aligning Insect Phylogenies Perelleschus and Other Cases
Franz 2014 ESA Aligning Insect Phylogenies Perelleschus and Other Casestaxonbytes
 
Semantic Modelling using Semantic Web Technology
Semantic Modelling using Semantic Web TechnologySemantic Modelling using Semantic Web Technology
Semantic Modelling using Semantic Web TechnologyRinke Hoekstra
 
Semantic Web: From Representations to Applications
Semantic Web: From Representations to ApplicationsSemantic Web: From Representations to Applications
Semantic Web: From Representations to ApplicationsGuus Schreiber
 
Adapt OWL as a Modular Ontology Language
Adapt OWL as a Modular Ontology LanguageAdapt OWL as a Modular Ontology Language
Adapt OWL as a Modular Ontology LanguageJie Bao
 
Functional and Structural Models of Commonsense Reasoning in Cognitive Archit...
Functional and Structural Models of Commonsense Reasoning in Cognitive Archit...Functional and Structural Models of Commonsense Reasoning in Cognitive Archit...
Functional and Structural Models of Commonsense Reasoning in Cognitive Archit...Antonio Lieto
 
Drug-discovery knowledge integration and analysis using OWL and reasoners
Drug-discovery knowledge integration and analysis using OWL and reasonersDrug-discovery knowledge integration and analysis using OWL and reasoners
Drug-discovery knowledge integration and analysis using OWL and reasonersSamuel Croset
 
Essential Biology 4.3 Theoretical Genetics
Essential Biology 4.3 Theoretical GeneticsEssential Biology 4.3 Theoretical Genetics
Essential Biology 4.3 Theoretical GeneticsStephen Taylor
 
Investigating Term Reuse and Overlap in Biomedical Ontologies
Investigating Term Reuse and Overlap in Biomedical OntologiesInvestigating Term Reuse and Overlap in Biomedical Ontologies
Investigating Term Reuse and Overlap in Biomedical OntologiesMaulik Kamdar
 
247th ACS Meeting: Experiment Markup Language (ExptML)
247th ACS Meeting: Experiment Markup Language (ExptML)247th ACS Meeting: Experiment Markup Language (ExptML)
247th ACS Meeting: Experiment Markup Language (ExptML)Stuart Chalk
 
US2TS: Reasoning over multiple open bio-ontologies to make machines and human...
US2TS: Reasoning over multiple open bio-ontologies to make machines and human...US2TS: Reasoning over multiple open bio-ontologies to make machines and human...
US2TS: Reasoning over multiple open bio-ontologies to make machines and human...Chris Mungall
 
Computing with phenotypic diversity using semantic descriptions
Computing with phenotypic diversity using semantic descriptionsComputing with phenotypic diversity using semantic descriptions
Computing with phenotypic diversity using semantic descriptionsbalhoff
 
A Semantic Importing Approach to Knowledge Reuse from Multiple Ontologies
A Semantic Importing Approach to Knowledge Reuse from Multiple OntologiesA Semantic Importing Approach to Knowledge Reuse from Multiple Ontologies
A Semantic Importing Approach to Knowledge Reuse from Multiple OntologiesJie Bao
 
Package-based Description Logics – Preliminary Results
Package-based Description Logics – Preliminary ResultsPackage-based Description Logics – Preliminary Results
Package-based Description Logics – Preliminary ResultsJie Bao
 

Similar a Semantic tools for aggregation of morphological characters across studies (20)

Towards ubiquitous OWL computing: Simplifying programmatic authoring of and q...
Towards ubiquitous OWL computing: Simplifying programmatic authoring of and q...Towards ubiquitous OWL computing: Simplifying programmatic authoring of and q...
Towards ubiquitous OWL computing: Simplifying programmatic authoring of and q...
 
GraphConnect Europe 2016 - Building a Repository of Biomedical Ontologies wit...
GraphConnect Europe 2016 - Building a Repository of Biomedical Ontologies wit...GraphConnect Europe 2016 - Building a Repository of Biomedical Ontologies wit...
GraphConnect Europe 2016 - Building a Repository of Biomedical Ontologies wit...
 
Getting Started with the Hymenoptera Anatomical Ontology
Getting Started with the Hymenoptera Anatomical OntologyGetting Started with the Hymenoptera Anatomical Ontology
Getting Started with the Hymenoptera Anatomical Ontology
 
Ontology-based data access and semantic mining with Aber-OWL
Ontology-based data access and semantic mining with Aber-OWLOntology-based data access and semantic mining with Aber-OWL
Ontology-based data access and semantic mining with Aber-OWL
 
Tutorial OWL and drug discovery ICBO 2013
Tutorial OWL and drug discovery ICBO 2013Tutorial OWL and drug discovery ICBO 2013
Tutorial OWL and drug discovery ICBO 2013
 
Franz 2014 ESA Aligning Insect Phylogenies Perelleschus and Other Cases
Franz 2014 ESA Aligning Insect Phylogenies Perelleschus and Other CasesFranz 2014 ESA Aligning Insect Phylogenies Perelleschus and Other Cases
Franz 2014 ESA Aligning Insect Phylogenies Perelleschus and Other Cases
 
Semantic Modelling using Semantic Web Technology
Semantic Modelling using Semantic Web TechnologySemantic Modelling using Semantic Web Technology
Semantic Modelling using Semantic Web Technology
 
Semantic Web: From Representations to Applications
Semantic Web: From Representations to ApplicationsSemantic Web: From Representations to Applications
Semantic Web: From Representations to Applications
 
Adapt OWL as a Modular Ontology Language
Adapt OWL as a Modular Ontology LanguageAdapt OWL as a Modular Ontology Language
Adapt OWL as a Modular Ontology Language
 
Functional and Structural Models of Commonsense Reasoning in Cognitive Archit...
Functional and Structural Models of Commonsense Reasoning in Cognitive Archit...Functional and Structural Models of Commonsense Reasoning in Cognitive Archit...
Functional and Structural Models of Commonsense Reasoning in Cognitive Archit...
 
Drug-discovery knowledge integration and analysis using OWL and reasoners
Drug-discovery knowledge integration and analysis using OWL and reasonersDrug-discovery knowledge integration and analysis using OWL and reasoners
Drug-discovery knowledge integration and analysis using OWL and reasoners
 
Essential Biology 4.3 Theoretical Genetics
Essential Biology 4.3 Theoretical GeneticsEssential Biology 4.3 Theoretical Genetics
Essential Biology 4.3 Theoretical Genetics
 
Knowledge Extraction
Knowledge ExtractionKnowledge Extraction
Knowledge Extraction
 
Project proposal for a fishery ontology service
Project proposal for a fishery ontology serviceProject proposal for a fishery ontology service
Project proposal for a fishery ontology service
 
Investigating Term Reuse and Overlap in Biomedical Ontologies
Investigating Term Reuse and Overlap in Biomedical OntologiesInvestigating Term Reuse and Overlap in Biomedical Ontologies
Investigating Term Reuse and Overlap in Biomedical Ontologies
 
247th ACS Meeting: Experiment Markup Language (ExptML)
247th ACS Meeting: Experiment Markup Language (ExptML)247th ACS Meeting: Experiment Markup Language (ExptML)
247th ACS Meeting: Experiment Markup Language (ExptML)
 
US2TS: Reasoning over multiple open bio-ontologies to make machines and human...
US2TS: Reasoning over multiple open bio-ontologies to make machines and human...US2TS: Reasoning over multiple open bio-ontologies to make machines and human...
US2TS: Reasoning over multiple open bio-ontologies to make machines and human...
 
Computing with phenotypic diversity using semantic descriptions
Computing with phenotypic diversity using semantic descriptionsComputing with phenotypic diversity using semantic descriptions
Computing with phenotypic diversity using semantic descriptions
 
A Semantic Importing Approach to Knowledge Reuse from Multiple Ontologies
A Semantic Importing Approach to Knowledge Reuse from Multiple OntologiesA Semantic Importing Approach to Knowledge Reuse from Multiple Ontologies
A Semantic Importing Approach to Knowledge Reuse from Multiple Ontologies
 
Package-based Description Logics – Preliminary Results
Package-based Description Logics – Preliminary ResultsPackage-based Description Logics – Preliminary Results
Package-based Description Logics – Preliminary Results
 

Último

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessPixlogix Infotech
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsMaria Levchenko
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 

Último (20)

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Advantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your BusinessAdvantages of Hiring UIUX Design Service Providers for Your Business
Advantages of Hiring UIUX Design Service Providers for Your Business
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 

Semantic tools for aggregation of morphological characters across studies

  • 1. Semantic tools for aggregation of morphological characters across studies James Balhoff, Alex Dececchi, Paula Mabee, Hilmar Lapp, & Phenoscape team
  • 2. Rich body of morphological observations – mostly locked up Zebrafish Model of Human Ectodermal Dysplasia Figure 2. The dominant gene Nkt is phenotypically similar, however complements fls mutants. Nkt homozygotes show complete loss of scales, teeth and gill rakers resembling the fls phenotype (A–C). Heterozygous Nkt zebrafish show an intermediate phenotype of scale loss and patterning defect (arrows) while no effect on fin development is seen (D). Heterozygous Nkt also show a dominant effect on the number of teeth (arrows, E) and gill rakers (F), showing deficiencies along the posterior branchial arches and formation of rudimentary rakers along ceratobranchial 1 and 2 (arrows, F). Cb1-5, ceratobranchial bones. doi:10.1371/journal.pgen.1000206.g002 Table 1. Quantitative effect of fls on scale number and shape and the effect of background modifiers in Danio rerio strains on flsdt3Tpl. and a cytoplasmic terminal death domain essential for protein interactions with signaling adaptor complexes. The flste370f mutation is an A to T transversion at a splice acceptor site,
  • 3. Free text is a barrier to machine-based integration. Phylogenetic systematics: Lundberg & Akama 2005. Human genetics: OMIM query (# of records): “large bone” (1083), “enlarged bone” (224), “big bones” (21), “huge bones” (4), “massive bones” (41), “hyperplastic bones” (12), “hyperplastic bone” (45), “bone hyperplasia” (181), “increased bone growth” (879). http://www.ncbi.nlm.nih.gov/omim
  • 4. Integration is key for knowledge synthesis. Figure: The Tree of Life and a New Classification of Bony Fishes (Betancur-R. et al. 2013. PLoS Currents Tree of Life)
  • 5. Integration is key for discovery
  • 6. Phenoscape: making evolutionary morphology computable. Comparative studies + model organism datasets = Phenoscape Knowledgebase
  • 7. How it works: shared ontologies, rich semantics, OWL reasoning
  • 8. Phenoscape KB content: 16,000 character states from >120 comparative morphological datasets, linked to 4,000 vertebrate taxa. Imported genetic phenotype and expression data from ZFIN, Xenbase, MGI, and Human Phenotype project. Shared semantics: Uberon (anatomy), PATO (phenotypic qualities), Entity–Quality (EQ) OWL axioms (phenotype observations), plus a dozen other ontologies ...
  • 9. Integrative querying with the Phenoscape KB. Query: scale, absent. Matches: Ictalurus punctatus, “body: naked” (Kailola, P. J. 2004. A phylogenetic exploration of the catfish family Ariidae (Otophysi; Siluriformes). The Beagle, Records of the Museums and Art Galleries of the Northern Territory 20:87-166); eda gene in Danio rerio, edadt3S243X/dt3S243X (Harris, M.P., Rohner, N., Schwarz, H., Perathoner, S., Konstantinidis, P., and Nüsslein-Volhard, C. 2008. Zebrafish eda and edar mutants reveal conserved and ancestral roles of ectodysplasin signaling in vertebrates. PLoS Genetics 4(10):e1000206).
  • 10. Integrating phylogenetic studies. Can we use reasoning to integrate character matrices across studies? Doing so would make the wealth of single-study character analysis methods, including tree-based comparative phylogenetic methods, applicable to any integrated matrix.
  • 11. Evolution of Sarcopterygian Limb/Fin. Combined matrix of any character states related to presence/absence of limb/fin structures from studies in Phenoscape KB. Figure source: Clack, J. A. (2009). The Fin to Limb Transition: New Data, Interpretations, and Hypotheses from Paleontology and Developmental Biology. Annual Review of Earth and Planetary Sciences, 37(1), 163-179
  • 12. EQ supermatrix synthesis: workflow. 1. Use OWL reasoner to group character states by anatomy and quality axes, based on EQ annotations. 2. Export groupings as character matrix, with taxon assignments to states from original data. 3. Supplement presence/absence character state assertions with reasoner-inferred information. 4. Use Phenex data editor to manually consolidate character states where appropriate.
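Steps 1 and 2 of the workflow on slide 12 can be illustrated with a small Python sketch. It is not the Phenoscape implementation: plain (entity, quality attribute) keys stand in for the groupings an OWL reasoner would compute, and the `ANNOTATIONS` records and the `synthesize_matrix` name are hypothetical.

```python
from collections import defaultdict

# Hypothetical EQ-annotated character states:
# (study, taxon, anatomical entity, quality attribute, state)
ANNOTATIONS = [
    ("study_A", "Acanthostega", "humerus", "shape", "slender"),
    ("study_B", "Eusthenopteron", "humerus", "shape", "robust"),
    ("study_B", "Acanthostega", "digit", "presence", "present"),
    ("study_C", "Eusthenopteron", "digit", "presence", "absent"),
]

def synthesize_matrix(annotations):
    """Group states into characters keyed by (entity, attribute), then
    export a taxon-by-character matrix."""
    characters = defaultdict(dict)  # character -> {taxon: set of states}
    for _study, taxon, entity, attribute, state in annotations:
        characters[(entity, attribute)].setdefault(taxon, set()).add(state)
    taxa = sorted({record[1] for record in annotations})
    columns = sorted(characters)
    # Rows are taxa; an unobserved cell is "?", multiple states are joined by "/".
    matrix = {
        taxon: [
            "/".join(sorted(characters[c][taxon])) if taxon in characters[c] else "?"
            for c in columns
        ]
        for taxon in taxa
    }
    return columns, matrix

columns, matrix = synthesize_matrix(ANNOTATIONS)
for taxon, row in matrix.items():
    print(taxon, row)
```

In the real workflow the grouping key would come from reasoner-inferred subsumption rather than exact string equality, so states annotated with different but related terms could still land in one character.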
  • 13. EQ supermatrix synthesis: Results. Synthesized limb/fin character matrix: 1055 Sarcopterygian taxa; 494 characters; 2-7 states per character; from 55 original studies. Developed several tools for automated character matrix synthesis to make this happen.
  • 14. Technology stack. Ontologies and phenotype observation data in OWL. ELK, an OWL-EL reasoner (OWL-DL reasoners are too slow for this). OWL API (Java), programmed primarily using Scala. Bigdata™ RDF triplestore (~25 million triples).
  • 15. Using reasoning to group character states. For every pair of anatomical term X and quality attribute Y, generate a “character expression” OWL class: (involves some X and involves some Y). Done programmatically via property chain axioms and OWL reasoning (ELK). Classify character states to most relevant character expression. Done by OWL reasoner (ELK). Inferred relationships materialized to triple store.
  • 16. Challenge: scalable reasoning. Anatomy ontologies and EQ annotation employ rich OWL semantics → best used with a DL reasoner. Classifying and querying over a large dataset (~25 million RDF triples) does not scale well. Presently, the only feasible OWL reasoner is ELK: constrained to the OWL EL profile → limits the kinds of expressions we use; best performance over class axioms only → data must be modeled so as to avoid the need for classifying instances.
  • 17. Challenge: Querying complex expressions. Want to allow arbitrary selection of structures of interest, using rich semantics: (part_of some (limb/fin or girdle skeleton)) or (connected_to some girdle skeleton). RDF triplestores provide very limited reasoning expressivity, and scale poorly with large ontologies. However, ELK can answer class expression queries within seconds.
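The grouping described on slide 15 can be sketched in Python. This is a toy illustration under stated assumptions, not the ELK-based implementation: the two parent maps are hypothetical stand-ins for Uberon and PATO, subsumption is reduced to walking is_a parents, and "most relevant" is taken to mean the nearest matching ancestor pair.

```python
from itertools import product

# Hypothetical is_a parents (child -> parent); not real Uberon/PATO content.
ANATOMY_PARENT = {"humerus": "limb bone", "femur": "limb bone", "limb bone": "bone"}
QUALITY_PARENT = {"length": "size", "width": "size"}

def ancestors(term, parent):
    """Return the term followed by its ancestors, nearest first."""
    chain = [term]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain

def character_expressions(anatomy_terms, quality_terms):
    """One expression per (X, Y) pair, read as
    'involves some X and involves some Y'."""
    return set(product(anatomy_terms, quality_terms))

def classify(state, expressions):
    """Return the most specific expression that subsumes the state's
    (entity, quality) annotation, or None if nothing matches."""
    entity, quality = state
    for x in ancestors(entity, ANATOMY_PARENT):
        for y in ancestors(quality, QUALITY_PARENT):
            if (x, y) in expressions:
                return (x, y)
    return None

expressions = character_expressions(["limb bone", "bone"], ["size"])
print(classify(("humerus", "length"), expressions))
```

A reasoner handles far more than this chain walk (part_of relations, property chains, multiple inheritance); the sketch only shows the shape of the pairing and classification steps.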
  • 18. Instead of something like this (*):
        PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        PREFIX ao: <http://purl.obolibrary.org/obo/my-anatomy-ontology/>
        PREFIX owl: <http://www.w3.org/2002/07/owl#>
        SELECT DISTINCT ?gene
        WHERE {
          ?gene ao:expressed_in ?structure .
          ?structure rdf:type ?structure_class .
          # Triple pattern selecting structure:
          ?structure_class rdfs:subClassOf ao:muscle .
          ?structure_class rdfs:subClassOf ?restriction .
          ?restriction owl:onProperty ao:part_of .
          ?restriction owl:someValuesFrom ao:head .
        }
    We would really like to do this:
        PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        PREFIX ao: <http://purl.obolibrary.org/obo/my-anatomy-ontology/>
        PREFIX ow: <http://purl.org/phenoscape/owlet/syntax#>
        SELECT DISTINCT ?gene
        WHERE {
          ?gene ao:expressed_in ?structure .
          ?structure rdf:type ?structure_class .
          # Triple pattern containing an OWL expression:
          ?structure_class rdfs:subClassOf "ao:muscle and (ao:part_of some ao:head)"^^ow:omn .
        }
  • 19. owlet: SPARQL query expansion with in-memory OWL reasoner. owlet interprets OWL class expressions embedded within SPARQL queries. Uses any OWL API-based reasoner to preprocess the query; we use ELK, which holds the terminology in memory. Replaces the OWL expression with a FILTER statement listing matching terms. https://github.com/phenoscape/owlet
  • 20. Query before expansion:
        PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        PREFIX ao: <http://purl.obolibrary.org/obo/my-anatomy-ontology/>
        PREFIX ow: <http://purl.org/phenoscape/owlet/syntax#>
        SELECT DISTINCT ?gene
        WHERE {
          ?gene ao:expressed_in ?structure .
          ?structure rdf:type ?structure_class .
          # Triple pattern containing an OWL expression:
          ?structure_class rdfs:subClassOf "ao:muscle and (ao:part_of some ao:head)"^^ow:omn .
        }
    ➡︎ owlet ➡︎
        PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        PREFIX ao: <http://purl.obolibrary.org/obo/my-anatomy-ontology/>
        PREFIX ow: <http://purl.org/phenoscape/owlet/syntax#>
        SELECT DISTINCT ?gene
        WHERE {
          ?gene ao:expressed_in ?structure .
          ?structure rdf:type ?structure_class .
          # Filter constraining ?structure_class to the terms returned by the OWL query:
          FILTER(?structure_class IN (ao:adductor_mandibulae, ao:constrictor_dorsalis, ...))
        }
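The rewriting step shown on slide 20 can be mimicked in a few lines of Python. This sketch is not owlet itself and makes no claim about how owlet is implemented: the `TOY_REASONER` dictionary is a hypothetical stand-in for an OWL reasoner such as ELK, and a regular expression replaces proper SPARQL parsing.

```python
import re

# Hypothetical stand-in for the reasoner: maps a class expression to the
# named subclasses that an OWL reasoner would return for it.
TOY_REASONER = {
    "ao:muscle and (ao:part_of some ao:head)": [
        "ao:adductor_mandibulae",
        "ao:constrictor_dorsalis",
    ],
}

# Matches: ?var rdfs:subClassOf "<expression>"^^ow:omn .
PATTERN = re.compile(
    r'(?P<var>\?\w+)\s+rdfs:subClassOf\s+"(?P<expr>[^"]+)"\^\^ow:omn\s*\.'
)

def expand(query, reasoner=TOY_REASONER):
    """Replace each embedded OWL expression with a FILTER listing its subclasses."""
    def to_filter(match):
        terms = reasoner[match.group("expr")]  # KeyError if the expression is unknown
        return "FILTER(%s IN (%s))" % (match.group("var"), ", ".join(terms))
    return PATTERN.sub(to_filter, query)

query = '?structure_class rdfs:subClassOf "ao:muscle and (ao:part_of some ao:head)"^^ow:omn .'
print(expand(query))
```

The design point the slides make is that the triplestore never sees the OWL expression: the reasoner answers it once, in memory, and the store only evaluates a plain list of terms.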
  • 21. Inferring presence/absence. Character states often do not directly assert, but imply, presence or absence. Most phenotypic descriptions of some feature of a structure imply its presence or absence: “Humerus slender and elongate: with length more than three times the diameter of its distal end” → humerus must be present. Partonomy axioms in the ontology allow inferring presence or absence: ‘all humerus part_of some forelimb’ → forelimb must be present if humerus is; humerus must be absent if forelimb is.
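The two inference rules on slide 21 (presence propagates from part to whole, absence from whole to part) can be sketched as follows. The `PART_OF` map is a hypothetical toy partonomy, and the function names are illustrative; the project itself derives these inferences with an OWL reasoner.

```python
# Hypothetical partonomy (part -> whole); not real Uberon content.
PART_OF = {"humerus": "forelimb", "radius": "forelimb", "forelimb": "appendage"}

def wholes(term):
    """All structures that the term is transitively part of."""
    found = []
    while term in PART_OF:
        term = PART_OF[term]
        found.append(term)
    return found

def parts(term):
    """All structures that are transitively part of the term."""
    found = []
    for part, whole in PART_OF.items():
        if whole == term:
            found.append(part)
            found.extend(parts(part))
    return found

def infer(asserted):
    """Extend asserted states ('present' or 'absent') with inferred ones.
    Asserted values are never overwritten by inferred ones."""
    inferred = dict(asserted)
    for structure, state in asserted.items():
        related = wholes(structure) if state == "present" else parts(structure)
        for other in related:
            inferred.setdefault(other, state)
    return inferred

print(infer({"humerus": "present"}))
print(infer({"forelimb": "absent"}))
```

Because `setdefault` keeps asserted values, a contradiction between an assertion and an inference is silently resolved in favor of the assertion here; a fuller version would report it as a conflict.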
  • 22. Challenge: absence reasoning with OWL EL. Absence is typically modeled using negation → not (has_part some forelimb). Negation is not part of OWL EL (and thus not of the ELK reasoner). Solution: programmatic assertion of an “absence hierarchy” via classification of negated expressions. Diagram: presence classes A = has_part some forelimb, B = has_part some limb, C = has_part some appendage; absence classes absentA = not A, absentB = not B, absentC = not C, arranged in the reverse hierarchy. Requires precomputation; constraints for on-the-fly use.
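The reversal on slide 22 follows from contraposition: if A is a subclass of B, then not B is a subclass of not A. A minimal Python sketch of asserting such an absence hierarchy programmatically is given below; the input dictionary is a hypothetical stand-in for the subsumptions an EL reasoner would infer among the presence classes, and the output is plain tuples rather than OWL axioms.

```python
# Subsumptions among presence classes as a reasoner might infer them
# (subclass -> superclass). Hypothetical toy input.
PRESENCE_SUBCLASS_OF = {
    "has_part some forelimb": "has_part some limb",
    "has_part some limb": "has_part some appendage",
}

def absence_hierarchy(presence_subclass_of):
    """For every 'A subClassOf B' among presence classes, emit
    '(not B) subClassOf (not A)' as a (subclass, superclass) pair."""
    return [
        ("not (%s)" % superclass, "not (%s)" % subclass)
        for subclass, superclass in presence_subclass_of.items()
    ]

for subclass, superclass in absence_hierarchy(PRESENCE_SUBCLASS_OF):
    print(subclass, "subClassOf", superclass)
```

The negated classes are treated as opaque names here, which mirrors the idea of handing the EL reasoner precomputed subclass assertions instead of asking it to interpret negation.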
  • 23. Challenge: Character state consolidation
  • 24. Challenge: Character state consolidation Reduced 1-297 states per character to 2-7.
  • 25. Result: Reasoning fills in many missing character states. Figure panels (Mesquite “bird's-eye view”): asserted presence/absence; with inference.
  • 26. Unified matrix enables candidate gene view. Linking evolutionary phenotypes to genes through ontologies, via Phenoscape KB or similarity.
  • 27. Integrated data highlight conflict and gaps. Conflicting interpretations in studies: supinator process of humerus both absent & present in Strepsodus (Zhu et al. 1999 vs. Ruta 2011); figure from Parker et al., 2005. Gaps in knowledge: acetabulum of pelvic girdle, present or absent? Same term, different meaning? Acanthostega: “radials, jointed” (Swartz 2012) but doesn’t have radials... Uneven taxon sampling. http://characterdesignnotes.blogspot.com/2011/04/proper-use-of-reference-and-anatomy-in.html
  • 28. Phenoscape software: https://github.com/phenoscape. owlet (SPARQL processor), Phenex (semantic data editor), phenoscape-owl-tools (KB build), others. http://phenoscape.org/wiki/Software
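Once states from different studies share one matrix, conflicts like the Strepsodus case on slide 27 can be flagged mechanically. The Python sketch below assumes a simple list of observation records; the slide does not say which study reports which state, so the source labels are hypothetical placeholders, as is the `find_conflicts` name.

```python
from collections import defaultdict

# (taxon, structure, state, source). Source labels are placeholders: the
# slide does not state which study asserts presence and which absence.
OBSERVATIONS = [
    ("Strepsodus", "supinator process of humerus", "absent", "study_1"),
    ("Strepsodus", "supinator process of humerus", "present", "study_2"),
    ("Acanthostega", "humerus", "present", "study_1"),
    ("Acanthostega", "humerus", "present", "study_2"),
]

def find_conflicts(observations):
    """Return {(taxon, structure): {state: [sources]}} for every cell
    where studies disagree on the state."""
    cells = defaultdict(lambda: defaultdict(set))
    for taxon, structure, state, source in observations:
        cells[(taxon, structure)][state].add(source)
    return {
        cell: {state: sorted(sources) for state, sources in by_state.items()}
        for cell, by_state in cells.items()
        if len(by_state) > 1
    }

for cell, by_state in find_conflicts(OBSERVATIONS).items():
    print(cell, by_state)
```

Cells with no record at all are the complementary case: they correspond to the knowledge gaps and uneven taxon sampling the slide mentions.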
  • 29. Phenoscape project team National Evolutionary Synthesis Center (NESCent) University of Oregon (Zebrafish Information Network) Todd Vision (also University of North Carolina at Chapel Hill) Monte Westerfield Hilmar Lapp Ceri Van Slyke Jim Balhoff Cincinnati Children's Hospital (Xenbase) Prashanti Manda University of South Dakota Paula Mabee David Blackburn Paul Sereno Nizar Ibrahim Mouse Genome Informatics Terry Hayamizu Christina James-Zorn California Academy of Sciences Alex Dececchi Judith Blake Aaron Zorn Virgilio Ponferrada Wasila Dahdul University of Chicago Yvonne Bradford University of Arizona Hong Cui Oregon Health & Science University Melissa Haendel Lawrence Berkeley National Labs Chris Mungall