Developing an application ontology
   for biomedical resource annotation
             and retrieval:
     challenges and lessons learned

     C. Torniai, M. Brush, N. Vasilevsky, E. Segerdell,
M. Wilson, T. Johnson, K. Corday, C. Shaffer and M. Haendel

                         ICBO 2011
Outline
• eagle-i project
   – Aims
   – Ontology role
• eagle-i ontology
   – Requirements
   – Implementation
• Implementation choices
• Challenges


 consortium
eagle-i
NIH-funded pilot project working to make scientific resources more
visible via a federated network of nine institutional repositories
Index invisible resources
reagents, protocols, techniques, instruments,
 expertise, organisms, software, training, human
 studies, biological specimens, etc.

Enable discovery by implementing
 semantic relationships between resources
Make data available using an ontology-driven
 approach to research resource annotation
 and discovery
 Facilitate development of shared
 semantic entities that can be referenced in
 publications, databases, experiments, etc.


eagle-i ontology: development drivers
 1) Represent collected resource information
 2) Use the ontology to control the user interface (UI) and
    logic of the data collection and search applications
 3) Build a set of ontologies that are reusable and
    interoperable with other ontologies and existing
    efforts for representing biomedical entities.
    - Follow the OBO Foundry orthogonality principle
    - Follow best practices for biomedical ontology development
    - Engage in discussions within the bio-ontology and
      resource discovery community (alignment with similar
      efforts: NIF, BRO, VIVO)


Ontology role in eagle-i architecture
[Architecture diagram. Components shown: eagle-i ontologies; Search Application;
Data Collection Application (resource information collection); Glossary Application;
Repositories (RDF) in the Federated Network; external sources NIF, PubMed, Entrez Gene.]

Modeling approach
• Preliminary data collection to identify key properties for each
  resource type
• Set of 300 queries used to identify relationships between
  resources
   – “Which laboratories in the United States are equipped with
     high-resolution ultrasound machines for brachial artery reactivity
     testing (BART)?”
   – “Find in situ hybridization protocols for whole-mount preparations of
     Aplysia.”
• Develop initial model with a set of resource classes and
  properties created de novo and from existing ontologies


Implementation

Ontology/Method                              Scope/Purpose
Basic Formal Ontology (BFO)                  Upper ontology
Information Artifact Ontology (IAO)          Ontology metadata
Relation Ontology (RO)                       Common properties
Minimum Information to Reference an          Reuse classes and properties
External Ontology Term (MIREOT)              from external ontologies
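
To make the MIREOT row more concrete: the guideline keeps only a few facts about each borrowed term, namely its URI, the URI of the ontology it comes from, and where it is placed in the importing ontology. The sketch below is a hypothetical Python rendering, not eagle-i's tooling. The names `MireotTerm` and `to_triples`, the predicate `ex:importedFrom`, and the placeholder parent class are all assumptions made for illustration.

```python
from dataclasses import dataclass

OBO = "http://purl.obolibrary.org/obo/"


@dataclass(frozen=True)
class MireotTerm:
    """Minimum information kept for one externally sourced term (illustrative)."""
    term_uri: str          # URI of the term in its source ontology
    source_ontology: str   # URI of the ontology the term is taken from
    target_parent: str     # superclass under which the term is placed locally
    label: str


def to_triples(term):
    """Return (subject, predicate, object) tuples a MIREOTed file could hold.

    'ex:importedFrom' is a made-up predicate name used only in this sketch.
    """
    return [
        (term.term_uri, "rdfs:subClassOf", term.target_parent),
        (term.term_uri, "rdfs:label", term.label),
        (term.term_uri, "ex:importedFrom", term.source_ontology),
    ]


# OBI_0000245 ('organization') is an example ID used elsewhere in these slides;
# the parent class is a placeholder, not the actual eagle-i placement.
organization = MireotTerm(
    term_uri=OBO + "OBI_0000245",
    source_ontology=OBO + "obi.owl",
    target_parent="ex:PlaceholderParent",
    label="organization",
)
triples = to_triples(organization)
```

Keeping the record this small is what allows a single class to be reused without importing the whole source ontology.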
Ontology layers
Goal: to decouple research resource representation from
information used for application appearance and behavior
  Application-specific module
    Classes, annotation properties, and individuals required to
    drive the UIs
  eagle-i core ontology
    Classes and properties used to represent information about
    biomedical research resources
  MIREOTed ontology files
    Externally sourced classes and properties
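
One way to picture the decoupling is to keep each layer as a separate mapping and combine them only when the application needs them. This is a minimal sketch under that assumption. The dictionary layout and the function names `shared_model` and `application_view` are hypothetical; the IDs and labels are examples that appear elsewhere in these slides.

```python
# Each layer is kept as its own mapping, mimicking separate ontology files.
mireoted = {"OBI_0000245": {"label": "organization"}}   # externally sourced
core = {"ERO_0000004": {"label": "instrument"}}         # eagle-i core
ui_annotations = {                                       # application-specific
    "OBI_0000245": {"preferred_label": "Organization"},
}


def shared_model(core, mireoted):
    """What is published for reuse: core plus MIREOTed terms, no UI data."""
    return {**mireoted, **core}


def application_view(core, mireoted, ui_annotations):
    """What the UI consumes: the shared model overlaid with UI annotations."""
    view = {}
    for term_id, info in shared_model(core, mireoted).items():
        view[term_id] = {**info, **ui_annotations.get(term_id, {})}
    return view


view = application_view(core, mireoted, ui_annotations)
```

Because the UI annotations are applied as an overlay, the shared model can be published or reused without any application-specific data in it.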
eagle-i core and MIREOTed sources
eagle-i core ontology: 1283 classes, 56 object properties, and 61 data properties.

External ontology                               Purpose/subsets                                   Classes
Ontology for Biomedical Investigations (OBI)    research material entities, processes, devices    509
NCBITaxonomy                                    organism taxa                                     192
VIVO ontology                                   people, organizations, publications               20
Ontology of Clinical Research (OCRe)            human study designs and facets                    19
Biomedical Resource Ontology (BRO)              instruments                                       13
Application-specific module
Contains properties and classes required to drive the UIs of
the data collection and search applications

UI Annotations file
  Holds annotations made on core eagle-i and MIREOTed classes
  and properties using these annotation values and additional
  properties

UI Annotation Definition file
  Defines UI annotation properties and possible instance values
  for these properties




Application-specific module design pattern
– Values for the annotation properties ‘inClassGroup’ and ‘inPropertyGroup’
  [figure not preserved]
– An example of a resource annotated as a “Primary Resource Type”
  [figure not preserved]

Examples of ‘inClassGroup’ and ‘inPropertyGroup’ values

Label                   Description                                     Example
Primary Resource Type   Denotes classes for which instances are         ‘instrument’, ‘biospecimen’, ‘protocol’
                        collected
Data Model Exclude      Denotes classes or properties that are not      BFO classes such as ‘continuant’ or
                        included in the model used for the data tool    ‘occurrent’, or RO relations such as
                        or the search tool UIs                          ‘precedes’
Embedded Class          Denotes a class for which instances can only    ‘antibody immunogen’ created within
                        be created in the context of an embedding       ‘antibody’; ‘construct insert’ created
                        class                                           within ‘plasmid’
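
The three group values in the table can be read as simple rules for an application. The following is a minimal sketch of that reading, not the eagle-i implementation. The data layout and the function names `top_level_choices`, `visible_classes`, and `can_create` are hypothetical; the class labels are the table's own examples.

```python
PRIMARY = "Primary Resource Type"
EXCLUDE = "Data Model Exclude"
EMBEDDED = "Embedded Class"

# Class label -> set of 'inClassGroup' values (examples from the table).
in_class_group = {
    "instrument": {PRIMARY},
    "biospecimen": {PRIMARY},
    "protocol": {PRIMARY},
    "continuant": {EXCLUDE},
    "occurrent": {EXCLUDE},
    "antibody immunogen": {EMBEDDED},
    "construct insert": {EMBEDDED},
    "antibody": set(),
    "plasmid": set(),
}
# Embedded class -> the class it may be created within.
embedded_in = {"antibody immunogen": "antibody", "construct insert": "plasmid"}


def top_level_choices(groups):
    """Classes offered as resource types for which instances are collected."""
    return sorted(c for c, g in groups.items() if PRIMARY in g)


def visible_classes(groups):
    """Classes left in the UI model after dropping excluded ones."""
    return sorted(c for c, g in groups.items() if EXCLUDE not in g)


def can_create(cls, context=None):
    """Embedded classes need their embedding class as context."""
    groups = in_class_group.get(cls, set())
    if EXCLUDE in groups:
        return False
    if EMBEDDED in groups:
        return context == embedded_in.get(cls)
    return True
```

For example, `can_create("construct insert")` is false on its own but true with `context="plasmid"`, which matches the embedded-class rule described in the table.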

Additional application-specific properties

Label                       Description                                   Example                               Property Type
eagle-i domain constraint   Used to specify the domain of an imported     Value set to “OBI_0000245”            Data Property
                            property. Each annotation will contain the    (‘organization’) for RO property
                            URI of one class                              ‘location_of’
eagle-i range constraint    Used to specify the range of an imported      Value set to “ERO_0000004”            Data Property
                            property. Each annotation will contain the    (‘instrument’) for RO property
                            URI of one class                              ‘located_in’
eagle-i preferred label     Defines the value of the preferred label to   Capitalized ‘Organization’ for        Annotation Property
                            display in the data collection tool and       OBI_0000245 (‘organization’)
                            search UIs
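
As an illustration of how an application might consume these three properties, the sketch below stores them in plain dictionaries and checks them at lookup time. It is an assumption-laden example rather than the eagle-i code. The dictionary layout and the function names are hypothetical, subclass reasoning is deliberately left out, and only the IDs, labels, and property names come from the table.

```python
# Annotation values keyed by property or class identifier.
domain_constraint = {"location_of": "OBI_0000245"}   # 'organization'
range_constraint = {"located_in": "ERO_0000004"}     # 'instrument'
labels = {"OBI_0000245": "organization", "ERO_0000004": "instrument"}
preferred_label = {"OBI_0000245": "Organization"}


def display_name(class_id):
    """Prefer the application label, then the ontology label, then the ID."""
    return preferred_label.get(class_id, labels.get(class_id, class_id))


def allowed_as_subject(prop, class_id):
    """True if no domain constraint is set, or the class matches it exactly."""
    constraint = domain_constraint.get(prop)
    return constraint is None or constraint == class_id


def allowed_as_object(prop, class_id):
    """True if no range constraint is set, or the class matches it exactly."""
    constraint = range_constraint.get(prop)
    return constraint is None or constraint == class_id
```

Properties without a constraint stay unrestricted, so the application narrows only the imported properties it annotates.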
Data Collection Application
[Annotated screenshot. Callouts:]
– ‘eagle-i preferred definition’ is used for tooltips
– Classes annotated with ‘primary resource type’
– ‘eagle-i preferred label’ is used for the display name
– Property annotated as ‘primary property’
– Construct insert is an example of a resource annotated as an ‘embedded class’
– Technique is annotated as ‘referenced taxonomy’

Challenges and benefits

• Reuse of existing ontologies
• Ontology layers
     – Application-specific module
• Community coordination and alignment
• Best practices and tools




Reuse of existing ontologies
Using BFO as the upper-level ontology and the Relation Ontology (RO)

Advantages
– Integration with other ontologies
– Eases the design process
– Data integration and publication (Linked Open Data)


Challenges
– Need to exclude some classes (continuant, occurrent) from UI
  visualization after the inferred module has been computed
– Not all relevant ontologies are built using BFO
– Domain and range in RO are not specified, or not specific enough for an
  application


   In the future we plan to use OWL 2 axiom annotations


Ontology layers
Advantages
– Effective means to drive an application UI while
  maintaining interoperability with external ontologies and
  data sources
– Separates what is relevant to share with the community
  from what is specific to an application
– Facilitates parallel development

Challenges
– Significant effort to keep the annotations current with the
  core module
– Risk of excessive proliferation of annotation properties as a
  quick way to reduce application development complexity

Application-specific module
• We identified a common set of requirements
  for bridging the gap between an application
  and domain-specific ontologies
  – application-specific labels and definitions
  – exclusion of sets of classes and properties from the model
    used by the application
  – restriction of domain and range for some imported
    properties
  – definition of display order of object and data properties at
    class level
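
The last requirement, display order at class level, could be sketched as follows. This is only one possible reading. The `display_order` mapping, the property names, and the position numbers are invented for illustration, and the fallback of listing unordered properties alphabetically at the end is an assumption.

```python
# (class, property) -> position in the form; all entries here are invented.
display_order = {
    ("instrument", "manufacturer"): 2,
    ("instrument", "located_in"): 1,
}


def ordered_properties(cls, properties):
    """Sort properties by their per-class position; unordered ones go last."""
    def key(prop):
        pos = display_order.get((cls, prop))
        # Tuple sorts annotated properties first, then by position, then by name.
        return (pos is None, pos if pos is not None else 0, prop)
    return sorted(properties, key=key)


form_fields = ordered_properties(
    "instrument", ["model number", "manufacturer", "located_in"]
)
```

Keying the order on the (class, property) pair lets the same property appear in a different position on another class's form.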



Community coordination
• Commitment to collaboration with similar efforts aimed at
  resource modeling
   – Aligned high-level models with NIF, BRO, VIVO
   – Service and instrument (device) implemented in OBI and reused by
     NIF and eagle-i
   – Currently working on coordinated representation of reagents,
     biospecimens, and genotype information

   Challenges
   – The process is time-consuming and requires extra implementation
     effort
        • Implement and import back from reference ontologies

   – Application ontologies have peculiar requirements
        • Example: Service hierarchy in eagle-i based on type of process rather
          than input and output of the process (OBI)


Best practices and tools
• Need for best practices and tools for
  – Reusing/referencing existing ontologies
       • OntoFox, OWL module extractor, NCBO extractor service

  – Integrating tools into ontology editors (Protégé)
       • Plugins for managing and syncing imports and MIREOTed
         terms


  – Providing several “community views” or ‘slims’ that could
    be directly imported with different levels of complexity




Conclusion
• Developing an ontology-driven application has been an
  important benchmark for the usage of biomedical ontologies.

• We have designed a layered set of ontologies, consisting of
  a broadly applicable core ontology and an application-specific
  module
    – Requirements and principles to inform a general design pattern
      for building applications that rely on ontologies for their logic
      and user interface

• Future steps
    – refining and documenting these requirements
    – sharing our lessons learned
    – engaging in efforts addressing the issues we have experienced


eagle-i core module:
 http://code.google.com/p/eagle-i/

  Carlo Torniai
torniai@ohsu.edu




Torniai icbo

  • 1. Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky, E. Segerdell, M. Wilson, T. Johnson, K. Corday, C. Shaffer and M. Haendel ICBO 2011
  • 2. Outline  eagle-i project  Aims  Ontology role  eagle-i ontology  Requirements  Implementation  Implementation choices  Challenges consortium
  • 3. eagle-i NIH funded pilot project working to make scientific resources more visible via a federated network of nine institutional repositories Index invisible resources reagents, protocols, techniques, instruments, expertise, organisms, software, training, human studies, biological specimens, etc. Enable discovery by implementing semantic relationships between resources Make data available using ontology-driven approach to research resource annotation and discovery  Facilitate development of shared semantic entities that can be referenced in publications, databases, experiments, etc. consortium
  • 4. eagle-i ontology: development drivers 1) Represent collected resource information 2) Use the ontology to control the data collection and search applications user-interface (UI) and logic 3) Build a set of ontologies that are reusable and interoperable with other ontologies and existing efforts for representing biomedical entities. - Follow OBO Foundry orthogonality principle - Best practices for biomedical ontology development - Engage in discussions within the bio-ontology and resource discovery community (alignment with similar efforts NIF, BRO, VIVO) consortium
  • 5. Ontology role in eagle-i architecture NIF, PubMedEnt Search Application rezGene Federated Network eagle- iontologies Repositories (RDF) Data Collection Application Glossary Resource Application information collection consortium
  • 6. Modeling approach • Preliminary data collection to identify key properties for each resource type • Set of 300 queries used identify relationships between resources – “Which laboratories in the United States are equipped with high- resolution ultrasound machines for brachial artery reactivity testing (BART)?” – “Find in situ hybridization protocols for whole-mount preparations of Aplysia.” • Develop initial model with a set resource classes and properties created de novo and from existent ontologies consortium
  • 7. Implementation Ontology/Method Scope/Purpose Basic Formal Ontology (BFO) Upper ontology Information Artifact Ontology Ontology metadata (IAO) Relation Ontology (RO) Common properties Minimum Information to Reuse classes and properties Represent External Ontology from external ontologies Terms (MIREOT)
  • 8. Ontology layers. Goal: to decouple the representation of research resources from information used for application appearance and behavior. Application-specific module: classes, annotation properties and individuals required to drive the UIs. eagle-i core ontology: classes and properties used to represent information about biomedical research resources. MIREOTed ontology files: externally sourced classes and properties.
  • 9. eagle-i core and MIREOTed sources. eagle-i core ontology: 1283 classes, 56 object properties, and 61 data properties. External ontologies (Purpose/subsets; Classes). Ontology for Biomedical Investigations (OBI): research material entities, processes, devices; 509. NCBITaxonomy: organism taxa; 192. VIVO ontology: people, organization, publications; 20. Ontology of Clinical Research (OCRe): human study designs and facets; 19. Biomedical Resource Ontology (BRO): instruments; 13.
  • 10. Application-specific module. Contains properties and classes required to drive the UIs of the data collection and search applications. UI Annotations file: holds annotations made on core eagle-i and MIREOTed classes and properties using these annotation values and additional properties. UI Annotation Definition file: defines UI annotation properties and possible instance values for these properties.
  • 11. Application-specific module design pattern – Values for the annotation properties ‘inClassGroup’ and ‘inPropertyGroup’ – An example of a resource annotated as a “Primary Resource Type”
  • 12. Examples of ‘inClassGroup’ and ‘inPropertyGroup’ values (Label: Description; Example). Primary Resource Type: denotes classes for which instances are collected; ‘instrument’, ‘biospecimen’, ‘protocol’. Data Model Exclude: denotes classes or properties that are not included in the model used for the data tool or the search tool UIs; BFO classes such as ‘continuant’ or ‘occurrent’, or RO relations such as ‘precedes’. Embedded Class: denotes a class for which instances can only be created in the context of an embedding class; ‘antibody immunogen’ created within ‘antibody’, ‘construct insert’ created within ‘plasmid’.
  • 13. Additional application-specific properties (Label: Description; Example; Property Type). eagle-i domain constraint: used to specify the domain of an imported property, where each annotation will contain the URI of one class; value set to “OBI_0000245” (‘organization’) for RO property ‘location_of’; Data Property. eagle-i range constraint: used to specify the range of an imported property, where each annotation will contain the URI of one class; value set to “ERO_0000004” (‘instrument’) for RO property ‘located_in’; Data Property. eagle-i preferred label: defines the value of the preferred label to display in the data collection tool and search UIs; capitalized ‘Organization’ for OBI_0000245 (‘organization’); Annotation Property.
  • 14. Data Collection Application. [Screenshot callouts: classes annotated with ‘primary resource type’; ‘eagle-i preferred definition’ is used for tooltips; ‘eagle-i preferred label’ is used for the display name; property annotated as ‘primary property’; Technique is annotated as ‘referenced taxonomy’; Construct insert is an example of a resource annotated as an ‘embedded class’]
  • 15. Challenges and benefits: reuse of existing ontologies; ontology layers; application-specific module; community coordination and alignment; best practices and tools
  • 16. Reuse of existing ontologies. Using BFO as the upper-level ontology and the Relation Ontology (RO). Advantages – Integration with other ontologies – Eases the design process – Data integration and publication (Linked Open Data). Challenges – Need to exclude some classes (continuant, occurrent) from UI visualization after the inferred module has been computed – Not all relevant ontologies are built using BFO – Domain and range in RO are not specified or not specific enough for an application. For the future we plan to use OWL 2 axiom annotations.
  • 17. Ontology layers. Advantages – An effective means to drive an application UI while maintaining interoperability with external ontologies and data sources – Separates what is relevant to share with the community from what is specific to an application – Facilitates parallel, concurrent development. Challenges – Significant effort to keep the annotations current with the core module – Risk of excessive proliferation of annotation properties as a quick way to simplify application development complexity.
  • 18. Application-specific module. We identified a common set of requirements for bridging the gap between an application and domain-specific ontologies – application-specific labels and definitions – exclusion of sets of classes and properties from the model used by the application – restriction of domain and range for some imported properties – definition of display order of object and data properties at class level
  • 19. Community coordination. Commitment to collaboration with similar efforts aimed at resource modeling – Aligned high-level models with NIF, BRO, VIVO – Service, instrument (device) implemented in OBI and reused by NIF and eagle-i – Currently working on coordinated representation of reagents, biospecimens, and genotype information. Challenges – The process is time-consuming and requires extra implementation effort • Implement in and import back from reference ontologies – Application ontologies have peculiar requirements • Example: the service hierarchy in eagle-i is based on the type of process rather than the input and output of the process (OBI)
  • 20. Best practices and tools. Need for best practices and tools for – Reusing/referencing existing ontologies • OntoFox, OWL module extractor, NCBO extractor service – Having tools integrated in ontology editors (Protégé) • Plugins for managing and syncing imports and MIREOTed terms – Having several “community views” or ‘slims’ that could be directly imported with different levels of complexity
  • 21. Conclusion. Developing an ontology-driven application has been an important benchmark for the usage of biomedical ontologies. We have designed a layered set of ontologies, consisting of a broadly applicable core ontology and an application-specific module. Requirements and principles to inform a general design pattern for building applications that rely on ontologies for their logic and user interface. Future steps: refining and documenting these requirements; sharing our lessons learned; engaging in efforts addressing the issues we have experienced.
  • 22. eagle-i core module: http://code.google.com/p/eagle-i/ Carlo Torniai torniai@ohsu.edu
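
The annotation pattern described on slides 11 to 13 can be illustrated with a small sketch. It is not the eagle-i implementation: it assumes a hypothetical in-memory `Term` record in place of the OWL files, and the names `Term`, `visible_terms`, `primary_resource_types`, `property_applies`, the placeholder identifier for ‘continuant’ and the property `has_name` are invented for illustration. Only the group values, the two term identifiers and the ‘location_of’ constraint come from the slides.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Term:
    """Hypothetical stand-in for an ontology class plus its UI annotations."""
    uri: str
    label: str
    groups: Set[str] = field(default_factory=set)  # values of 'inClassGroup'
    preferred_label: Optional[str] = None          # 'eagle-i preferred label'


def display_name(term: Term) -> str:
    # The preferred label, when present, overrides the ontology label in the UI.
    return term.preferred_label or term.label


def visible_terms(terms: List[Term]) -> List[Term]:
    # Terms flagged 'Data Model Exclude' never reach the application model.
    return [t for t in terms if "Data Model Exclude" not in t.groups]


def primary_resource_types(terms: List[Term]) -> List[str]:
    # Classes for which instances are collected.
    return [display_name(t) for t in visible_terms(terms)
            if "Primary Resource Type" in t.groups]


# 'eagle-i domain constraint': property name -> class URIs allowed as domain.
# Subclass reasoning is deliberately left out of this sketch.
DOMAIN_CONSTRAINTS: Dict[str, Set[str]] = {"location_of": {"OBI_0000245"}}


def property_applies(prop: str, class_uri: str) -> bool:
    # A property without a constraint is assumed to apply to every class.
    allowed = DOMAIN_CONSTRAINTS.get(prop)
    return allowed is None or class_uri in allowed


TERMS = [
    Term("continuant", "continuant", {"Data Model Exclude"}),  # placeholder id
    Term("OBI_0000245", "organization", preferred_label="Organization"),
    Term("ERO_0000004", "instrument", {"Primary Resource Type"}),
]

if __name__ == "__main__":
    print(primary_resource_types(TERMS))
    print(property_applies("location_of", "OBI_0000245"))
```

Keeping the group values and constraints in data rather than in application code mirrors the layering goal of the slides: the core terms stay untouched while the UI behavior is changed by editing annotations.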

Editor's notes

  1. In general: worried about the usage of modules. Come back to this after a more final draft is ready. Need to reconcile bullets to all look similar. OK to have different ones for different levels, so long as they all look the same. Maybe a pain, but could make the eagle-i a little smaller on each slide.
  2. Took out 2-yr. research. Mention genes, genotypes, sites of action etc. on point 2.
  3. Here probably emphasize the degree of experiment that the thing has
  4. The architecture of the eagle-i system includes four main components: institutional triple-store repositories; a federated network; a data collection tool; and a central search application. In order to support semantic retrieval of resource data, the underlying data model is based on a modular set of ontologies. A unique feature of the project is that the user interface and logic of both the data collection and search tools are driven by ontologies, allowing these applications to seamlessly change in response to data-driven ontology enhancements. NIF: Neuroscience Information Framework
  5. We collected a preliminary set of data. PI- and PhD-level biologists at each site provided the
  6. Here find a way to mention MIREOT and the most used ontologies. Maybe another and OBI and stuff like we did before.
  7. Representation of research resource data is decoupled from the representation of application-specific data used to control application appearance and behavior.
  8. Here redo the picture: maybe combining the table we had (at the bottom) with this. Latest numbers as of July 19: ERO class count = 1283; ERO datatype property count = 56; ERO object property count = 61; OBI class count = 509; OBI properties count = 12; SWO class count = 53; NCBI class count = 192; OCRE class count = 19; GO class count = 1; UBERON class count = 2 (not includin; VIVO class count = 20; BRO class count = 13; RO ObjectProperty count = 18. I think it's hard to see how this figure fits in. Potentially need some way to link these boxes together, and a bigger box around the two modules that it represents.
  9. Shorten this, but this is the content: The eagle-i application-specific modules contain all the properties and classes required to drive the UIs of the data collection tool and the search application. These are primarily annotation properties that tell the data and search tools how to display and interact with the ontology classes and properties to which they are attached. Summarize this: The basic design principle is to define a set of annotation properties and possible instance values for these properties in a ‘UI Annotation Definition file’ (eagle-i-app-def.owl). For example, the ‘inClassGroup’ and ‘inPropertyGroup’ annotation properties are used to tag specific classes and properties, respectively, as exhibiting certain application-related features or behavior. Table 1 shows some of the possible instance values for the ‘inClassGroup’ and ‘inPropertyGroup’ properties, Table 2 describes additional properties defined in the UI Annotation Definition file, and Fig. 4 illustrates how they control various aspects of the data collection tool UI. A second module, the ‘UI Annotations file’ (eagle-i-app.owl), holds the actual annotations made on core eagle-i classes and properties using these annotation values. These two application-specific modules have a different namespace than the core ontology, and class and property URIs Throughout this text, ‘italics’ are used to indicate a term denoting an ontology class, instance or property. Mention the shortcut relations (if we will ever implement that). The UI Annotations file has also been used to import external referenced classes that are used to populate drop-down menus in the data collection tool, such as MeSH terms for diseases, or UBERON. This file also contains shortcut relations between classes that in the core ontology are expressed using a more complex concatenation of properties to maintain full logical computability.
For example, from an application standpoint we need to have a single property that relates a service to a core laboratory providing that service. OBI uses a composed relation built from two properties to make this association between an organization and a service it provides (‘organization’ ‘bearer_of’ some ‘service provider role’ and ‘realized_by’ some ‘service’). The UI Annotations file replaces this complex statement with a single property linking a service to its provider (‘service provider’ ‘provides_service’ some ‘service’), where ‘service provider’ is defined as follows: [(‘organization’ or ‘Homo sapiens’) and (‘bearer_of’ some ‘service provider role’)]. This need to simplify complex relation chains will be a common issue in using ontologies for data collection applications, and approaches like the ones suggested in [12] should be exploited. Maybe not modules here… just two OWL files. Application-specific module? Also, do you think we need to list the file names in the text if they are on the figure?
  10. Add a new property group list and fix the highlight. Make bigger. Why am I highlighted and not the inClassGroup?
  11. Here redo the table and have just two or three
  12. Here need to recall why we didn’t impose a set of union_of domain and range in the actual app file? Don’t know what you are referring to. Union of which domain and range? Why would a domain and range have a union?
  13. Application-specific annotations in action: Shown is an example of a ‘plasmid’ record annotated using the eagle-i ontology. (1) eagle-i classes annotated with the “resource root” value are displayed in the left bar menu. (2) The value of ‘eagle-i preferred definition’ is used for tooltips that appear while hovering over the property labels. (3) The ‘eagle-i preferred label’ is used for the display name of the property. Here, the imported RO ‘location_of’ has been renamed "Location". This property is also flagged as a primary property using the ‘inPropertyGroup’ annotation property, as are the ‘Additional Name’, ‘Description’ and ‘Contact Person’ properties. This flag results in presentation at the top of the property list for a record. (4) Users can select a technique associated with the reagent. In the ontology, the ‘technique’ class is annotated as a ‘referenced class’, which tells the UI to allow reference to an ontology term but create no instances. (5) Construct insert is an example of a resource annotated as an ‘embedded class’, which has to be created in the context of the construct or plasmid of which it is a part.
  14. Here thinking if we can use a format like pros and cons. I would say advantages and challenges. Last item: not sure if it needs to be reworded to be a “challenges and benefits”.
  15. Following slides can just have the original bullet points for titles. Last point: give an example.
  16. Effective means to drive an application UI while maintaining interoperability with external ontologies and data sources. Logically we wanted to separate our core ontology from the application-specific ontologies and therefore identify what was relevant to share with the community from what was specific to the needs of eagle-i. These layered modules also facilitated parallel development in a shared repository, as ontologists familiar with OWL constructs and functionality could manage eagle-i core development, while curators were able to concurrently add proper annotation values in the UI annotations file. Challenges: Despite the effectiveness of this approach, it requires significant effort to keep the annotations current when the core module changes, and presents the risk of excessive proliferation of annotation properties and their instance values in attempts to simplify application coding complexity. Title: Layered modules considerations or musings or pros and cons, challenges, ups and downs, cost benefits, the red ink and black ink, advantages and disadvantages.
  17. We have identified a set of requirements for designing modular ontologies that can bridge the gap between an application and domain-specific ontologies. These include: (a) application-specific labels and definitions; (b) exclusion of sets of classes and properties from the model used by the application; (c) restriction of domain and range for some imported properties; (d) definition of display order of object and data properties at class level.
  18. Here have some of the general things about the coordination: alignment with OBI, NIF, RDS; service, instruments, genotyping information (in progress); orthogonality and contribution back to other ontologies; community views (see what to say). Probably this will be one slide each.
  19. http://rest.bioontology.org/bioportal/viewextractor/
  20. Shorten this
  21. Here thinking if we can use a format like pro and con
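
The shortcut relation described in note 9 can be sketched as a simple rewrite over instance-level triples. This is an illustration under stated assumptions, not the project's code: the notes describe the pattern at the class level in OWL, whereas the sketch works on plain (subject, predicate, object) tuples, uses a hypothetical "type" predicate to mark roles, and the function name `derive_provides_service` and all individual names are invented.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]


def derive_provides_service(triples: Set[Triple]) -> Set[Triple]:
    """Collapse 'bearer_of' a service provider role that is 'realized_by'
    a service into a single 'provides_service' statement."""
    # Roles explicitly typed as service provider roles (assumed "type" predicate).
    provider_roles = {s for s, p, o in triples
                      if p == "type" and o == "service provider role"}

    # Map each role to the services that realize it.
    realized: Dict[str, Set[str]] = {}
    for s, p, o in triples:
        if p == "realized_by":
            realized.setdefault(s, set()).add(o)

    shortcuts: Set[Triple] = set()
    for bearer, p, role in triples:
        if p == "bearer_of" and role in provider_roles:
            for service in realized.get(role, set()):
                shortcuts.add((bearer, "provides_service", service))
    return shortcuts


# Hypothetical example data: one core laboratory offering one service.
EXAMPLE: Set[Triple] = {
    ("core_lab_1", "bearer_of", "role_1"),
    ("role_1", "type", "service provider role"),
    ("role_1", "realized_by", "sequencing_service"),
    ("core_lab_1", "bearer_of", "role_2"),  # not a service provider role
    ("role_2", "realized_by", "other_process"),
}

if __name__ == "__main__":
    for triple in sorted(derive_provides_service(EXAMPLE)):
        print(triple)
```

The derived statements are what a data collection form would show or store, while the two-step chain remains available in the core ontology for full logical computability.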