Increased Expressivity of Gene Ontology Annotations
Huntley RP, Harris MA, Alam-Faruque Y, Carbon SJ, Dietze H, Dimmer E, Foulger R, Hill DP, Khodiyar V, Lock A, Lomax J, Lovering RC, Mungall CJ, Mutowo-Muellenet P, Sawford T, Van Auken K, Wood V
The Gene Ontology
• A vocabulary of 37,500* distinct, connected descriptions that can be applied to gene products
[Figure: gene 1, gene 2]
• That’s a lot…
   – How big is the space of possible descriptions?

*April 2013
Current descriptions miss details
• Author:
   – LMTK1 (Aatk) can negatively control axonal outgrowth in
     cortical neurons by regulating Rab11A activity in a Cdk5-
     dependent manner
          – http://www.ncbi.nlm.nih.gov/pubmed/22573681
• GO:
   – Aatk: GO:0030517 negative regulation of axon extension

• GO terms will always be a subset of the total set of possible
  descriptions
   – We shouldn’t attempt to make a term for everything
• T63 Toxic effect of contact with venomous animals and plants
  (Term from ICD-10, a hierarchical medical billing code system used to ‘annotate’ patient records)
  – T63.611 Toxic effect of contact with Portuguese Man-o-war, accidental (unintentional)
  – T63.612 Toxic effect of contact with Portuguese Man-o-war, intentional self-harm
  – T63.613 Toxic effect of contact with Portuguese Man-o-war, assault
     • T63.613A Toxic effect of contact with Portuguese Man-o-war, assault, initial encounter
     • T63.613D Toxic effect of contact with Portuguese Man-o-war, assault, subsequent encounter
     • T63.613S Toxic effect of contact with Portuguese Man-o-war, assault, sequela
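The growth shown by the T63 codes can be sketched as a product of independent axes. The axis values below are copied from the slide for a single agent. Applying every encounter suffix to every intent is an assumption made only for illustration, and the names `precomposed_labels`, `AGENTS`, `INTENTS` and `ENCOUNTERS` are hypothetical.

```python
from itertools import product

# Axis values copied from the T63.61x example; applying every encounter
# suffix to every intent is an assumption made for illustration.
AGENTS = ["Portuguese Man-o-war"]
INTENTS = ["accidental (unintentional)", "intentional self-harm", "assault"]
ENCOUNTERS = ["initial encounter", "subsequent encounter", "sequela"]


def precomposed_labels(agents, intents, encounters):
    """Enumerate one pre-composed label per combination of axis values."""
    return [
        f"Toxic effect of contact with {agent}, {intent}, {encounter}"
        for agent, intent, encounter in product(agents, intents, encounters)
    ]


labels = precomposed_labels(AGENTS, INTENTS, ENCOUNTERS)
# Pre-composition needs the product of the axis sizes as labels, while
# post-composition only needs the sum of the axis sizes as building blocks.
precomposed_count = len(labels)
building_block_count = len(AGENTS) + len(INTENTS) + len(ENCOUNTERS)
```

Each additional agent adds one building block but, under the same assumption, nine more pre-composed labels.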
Post-composition
    • Curators need to be able to compose their
      complex descriptions from simpler
      descriptions (terms) at the time of annotation

    • GO annotation extensions
         • Introduced with Gene Association Format (GAF) v2
             – Also supported in GPAD
         • Have an underlying OWL description-logic model


http://www.geneontology.org/GO.format.gaf-2_0.shtml
“Classic” annotation model
    • Gene Association Format (GAF) v1
        – Simple pairwise model
        – Each gene product is associated with an (ordered) set
          of descriptions
             • Where each description == a GO term




http://www.geneontology.org/GO.format.gaf-1_0.shtml
GO annotation extensions
    • Gene Association Format (GAF) v1
        – Simple pairwise model
        – Each gene product is associated with an (ordered) set of
          descriptions
             • Where each description == a GO term
    • Gene Association Format (GAF) v2 (and GPAD)
        – Each gene product is (still) associated with an (ordered) set of
          descriptions
        – Each description is a GO term plus zero or more relationships
          to other entities
             • Entities from GO, other ontologies, databases
             • Description is an OWL anonymous class expression (aka description)
http://www.geneontology.org/GO.format.gaf-2_0.shtml
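The difference between the two models can be sketched as a small data structure. This is a minimal illustration, not the GO Consortium's implementation: the class names `Extension` and `Annotation` and the method `is_classic` are hypothetical, and only the identifiers (sty1, GO:0034504, GO:0034599, SPAC1783.07c) come from the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Extension:
    """One relationship from the GO term to another entity (hypothetical class)."""
    relation: str  # e.g. "happens_during"
    filler: str    # an ID from GO, another ontology, or a database


@dataclass
class Annotation:
    """A gene product paired with a description (hypothetical class)."""
    gene_product: str
    term: str
    extensions: List[Extension] = field(default_factory=list)

    def is_classic(self) -> bool:
        # With zero relationships, the description is just a GO term,
        # which is all the GAF v1 model can express.
        return not self.extensions


classic = Annotation("sty1", "GO:0034504")
extended = Annotation(
    "sty1",
    "GO:0034504",
    [Extension("happens_during", "GO:0034599"),
     Extension("has_input", "SPAC1783.07c")],
)
```

Because the extension list may be empty, every classic annotation is also a valid annotation in the extended model.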
“Classic” GO annotations are unconnected
[Diagram: three unconnected annotations]
  sty1 → protein localization to nucleus [GO:0034504]
  sty1 → cellular response to oxidative stress [GO:0034599]
  pap1 → positive regulation of transcription from pol II promoter in response to oxidative stress [GO:0036091]

DB        Object                 Term         Ev    Ref            ..
PomBase   sty1 (SPAC24B11.06c)   GO:0034504   IMP   PMID:9585505   ..
PomBase   sty1 (SPAC24B11.06c)   GO:0034599   IMP   PMID:9585505   ..
PomBase   pap1 (SPAC1783.07c)    GO:0036091   IMP   PMID:9585505   ..
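A row like those above can be read from a tab-separated GAF line roughly as follows. This is a sketch that assumes the column order of the GAF specifications (15 columns in v1, 17 in v2, with the annotation extension in column 16). The function name `read_gaf_line` and the sample line are hypothetical, and fields the slide does not show are left empty rather than guessed.

```python
from typing import Dict

# Zero-based positions of the columns shown on the slide, assuming the
# column order of the GAF specifications.
GAF_COLUMNS = {
    "db": 0,
    "object_id": 1,
    "symbol": 2,
    "go_id": 4,
    "reference": 5,
    "evidence": 6,
    "extension": 15,  # only present in GAF v2
}


def read_gaf_line(line: str) -> Dict[str, str]:
    """Split one GAF line and pick out the columns shown on the slide."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 15:
        raise ValueError(f"expected at least 15 columns, got {len(cols)}")
    # Pad v1 lines to 17 columns so the extension column reads as empty.
    cols += [""] * (17 - len(cols))
    return {name: cols[index] for name, index in GAF_COLUMNS.items()}


# Sample v1-style line built only from values on the slide; the eight
# trailing fields are left empty on purpose.
fields = ["PomBase", "SPAC24B11.06c", "sty1", "", "GO:0034504",
          "PMID:9585505", "IMP"] + [""] * 8
record = read_gaf_line("\t".join(fields))
```

Padding v1 lines to 17 columns lets the same reader handle both versions, so a classic annotation simply has an empty `extension` value.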
Now with annotation extensions
[Diagram: annotations connected through anonymous descriptions]
  sty1 → <anonymous description>: protein localization to nucleus [GO:0034504]
            happens during: cellular response to oxidative stress [GO:0034599]
            has input: pap1
  pap1 → <anonymous description>: positive regulation of transcription from pol II promoter in response to oxidative stress [GO:0036091]
            has regulation target: …

DB        Object                 Term                                           Ev    Ref            Extension
PomBase   sty1 (SPAC24B11.06c)   GO:0034504 protein localization to nucleus     IMP   PMID:9585505   happens_during(GO:0034599), has_input(SPAC1783.07c)
PomBase   pap1 (SPAC1783.07c)    GO:0036091                                     IMP   PMID:9585505   has_regulation_target(…)
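The Extension column of the sty1 row can be parsed with a few lines of code. This is a minimal sketch, assuming the GAF 2.0 convention that commas join relationships belonging to the same description and pipes separate independent descriptions. The function name `parse_extensions` is hypothetical, and nested or quoted fillers are not handled.

```python
import re
from typing import List, Tuple

# One extension looks like relation(ID), for example has_input(SPAC1783.07c).
_EXTENSION = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)\(([^()]+)\)\s*$")


def parse_extensions(column: str) -> List[List[Tuple[str, str]]]:
    """Return one list of (relation, filler) pairs per pipe-separated group."""
    groups = []
    for group in column.split("|"):
        if not group.strip():
            continue  # an empty column means a classic annotation
        pairs = []
        for item in group.split(","):
            match = _EXTENSION.match(item)
            if match is None:
                raise ValueError(f"malformed extension: {item!r}")
            pairs.append((match.group(1), match.group(2).strip()))
        groups.append(pairs)
    return groups


sty1_extensions = parse_extensions(
    "happens_during(GO:0034599), has_input(SPAC1783.07c)"
)
```

Each inner list corresponds to one anonymous description attached to the GO term in the Term column.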
PomBase web interface – sty1




http://www.pombase.org/spombe/result/SPAC24B11.06c
PomBase web interface – pap1




http://www.pombase.org/spombe/result/SPAC1783.07c
Where do I get them?
• Download
  – http://geneontology.org/GO.downloads.annotations.shtml
      • MGI (22,000)
      • GOA Human (4,200)
      • PomBase (1,588)
• Search and Browsing
  – Cross-species
      • AmiGO 2 – http://amigo2.berkeleybop.org - poster#57
      • QuickGO (later this year) - http://www.ebi.ac.uk/QuickGO/
  – MOD interfaces
      • PomBase – http://pombase.org
Query tool support: AmiGO 2
• Annotation extensions make use of other ontologies
   – CHEBI
   – CL – cell types
   – Uberon – metazoan anatomy
   – MA – mouse anatomy
   – EMAP – mouse anatomy
   – ….
• [Screenshots: AmiGO 2 queries using CL and Uberon]
   – http://amigo2.berkeleybop.org
Curation tool support
• Supported in
  – Protein2GO (GOA, WormBase) [poster#97]
  – CANTO (PomBase) [poster#110]
  – MGI curation tool
Analysis tool support
• Currently: Enrichment tools do not yet support
  annotation extensions
  – Annotation extensions can be folded into an
    analysis ontology - http://galaxy.berkeleybop.org
• Future: Analysis tools can use extended
  annotations to their benefit
  – E.g. account for other modes of regulation in their
    model
  – Tool developers: contact us!
Challenge: pre vs post composition
  • Curator question: do I…
       – Request a pre-composed term via TermGenie[*]?
       – Post-compose using annotation extensions?
  • From a computational perspective:
       – It doesn’t matter, we’re using OWL
       – 40% of GO terms have OWL equivalence axioms
       – Example: protein localization to nucleus [GO:0034504] ≡ protein localization [GO:0008104] ⊓ (end_location some nucleus [GO:0005634])

[*] See Heiko’s TermGenie talk tomorrow & poster #33

http://code.google.com/p/owltools/wiki/AnnotationExtensionFolding
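Why the choice does not matter computationally can be illustrated with a toy lookup. This is not how OWLTools or an OWL reasoner works: it only matches a post-composed expression exactly against a table of equivalence axioms, whereas a reasoner also infers subsumption. The names `EQUIVALENCE_AXIOMS` and `fold` are hypothetical, and the single axiom is the one shown on the slide.

```python
from typing import Dict, FrozenSet, Iterable, Optional, Tuple

Restriction = Tuple[str, str]                    # (relation, filler)
Expression = Tuple[str, FrozenSet[Restriction]]  # (GO term, restrictions)

# Toy axiom table holding only the equivalence shown on the slide:
# GO:0034504 == GO:0008104 and (end_location some GO:0005634)
EQUIVALENCE_AXIOMS: Dict[Expression, str] = {
    ("GO:0008104", frozenset({("end_location", "GO:0005634")})): "GO:0034504",
}


def fold(term: str, extensions: Iterable[Restriction]) -> Optional[str]:
    """Return the pre-composed term equivalent to term plus extensions, if any."""
    return EQUIVALENCE_AXIOMS.get((term, frozenset(extensions)))


folded = fold("GO:0008104", [("end_location", "GO:0005634")])
unfolded = fold("GO:0008104", [])
```

A curator who post-composes protein localization with an end location of nucleus therefore arrives at the same class as one who requests the pre-composed term.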
Curation Challenges
• Manual Curation
  – Fewer terms, but more degrees of freedom
  – Curator consistency
     • OWL constraints can help
• Automated annotation
  – Phylogenetic propagation
  – Text processing and NLP
Similar approaches and future directions
• Post-composition has been used extensively
  for phenotype annotation
  – ZFIN [poster#95]
  – Phenoscape [next talk]
• Future:
  – A more expressive model that bridges GO with
    pathway representations
Conclusions
• Description space is huge
  – Context is important
  – Not appropriate to make a term for everything
  – OWL allows us to mix and match pre and post
    composition
• Number of extension annotations is growing
• Annotation extensions represent untapped
  opportunity for tool developers
Acknowledgments
• GO Consortium, model organism and UniProtKB curators
• GO Directors
• PomBase developers:
   – Mark McDowell, Kim Rutherford

• Funding
   –   GO Consortium NIH 5P41HG002273-09
   –   UniProtKB GOA NHGRI U41HG006104-03
   –   British Heart Foundation grant SP/07/007/23671
   –   Kidney Research UK RP26/2008
   –   PomBase - Wellcome Trust WT090548MA
   –   MGD NHGRI HG000330

How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Manual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance AuditManual 508 Accessibility Compliance Audit
Manual 508 Accessibility Compliance Audit
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 

Increased Expressivity of Gene Ontology Annotations - Biocuration 2013

  • 1. Increased Expressivity of Gene Ontology Annotations Huntley RP, Harris MA, Alam-Faruque Y, Carbon SJ, Dietze H, Dimmer E, Foulger R, Hill DP, Khodiyar V, Lock A, Lomax J, Lovering RC, Mungall CJ, Mutowo-Muellenet P, Sawford T, Van Auken K, Wood V
  • 2. The Gene Ontology • A vocabulary of 37,500* distinct, connected descriptions that can be applied to gene products gene 1 gene 2 • That’s a lot… – How big is the space of possible descriptions? *April 2013
  • 3.
  • 4. Current descriptions miss details • Author: – LMTK1 (Aatk) can negatively control axonal outgrowth in cortical neurons by regulating Rab11A activity in a Cdk5-dependent manner – http://www.ncbi.nlm.nih.gov/pubmed/22573681 • GO: – Aatk: GO:0030517 negative regulation of axon extension • GO terms will always be a subset of the total set of possible descriptions – We shouldn’t attempt to make a term for everything
  • 5. • T63 Toxic effect of contact with venomous animals and plants Term from ICD-10, a hierarchical medical billing code system used to ‘annotate’ patient records
  • 6. • T63 Toxic effect of contact with venomous animals and plants – T63.611 Toxic effect of contact with Portugese Man-o-war, accidental (unintentional)
  • 7. • T63 Toxic effect of contact with venomous animals and plants – T63.611 Toxic effect of contact with Portugese Man-o-war, accidental (unintentional) – T63.612 Toxic effect of contact with Portugese Man-o-war, intentional self-harm
  • 8. • T63 Toxic effect of contact with venomous animals and plants – T63.611 Toxic effect of contact with Portugese Man-o-war, accidental (unintentional) – T63.612 Toxic effect of contact with Portugese Man-o-war, intentional self-harm – T63.613 Toxic effect of contact with Portugese Man-o-war, assault
  • 9. • T63 Toxic effect of contact with venomous animals and plants – T63.611 Toxic effect of contact with Portugese Man-o-war, accidental (unintentional) – T63.612 Toxic effect of contact with Portugese Man-o-war, intentional self-harm – T63.613 Toxic effect of contact with Portugese Man-o-war, assault • T63.613A Toxic effect of contact with Portugese Man-o-war, assault, initial encounter • T63.613D Toxic effect of contact with Portugese Man-o-war, assault, subsequent encounter • T63.613S Toxic effect of contact with Portugese Man-o-war, assault, sequela
  • 10. Post-composition • Curators need to be able to compose their complex descriptions from simpler descriptions (terms) at the time of annotation • GO annotation extensions • Introduced with Gene Association Format (GAF) v2 – Also supported in GPAD • Has underlying OWL description-logic model http://www.geneontology.org/GO.format.gaf-2_0.shtml
  • 11. “Classic” annotation model • Gene Association Format (GAF) v1 – Simple pairwise model – Each gene product is associated with an (ordered) set of descriptions • Where each description == a GO term http://www.geneontology.org/GO.format.gaf-1_0.shtml
  • 12. GO annotation extensions • Gene Association Format (GAF) v1 – Simple pairwise model – Each gene product is associated with an (ordered) set of descriptions • Where each description == a GO term • Gene Association Format (GAF) v2 (and GPAD) – Each gene product is (still) associated with an (ordered) set of descriptions – Each description is a GO term plus zero or more relationships to other entities • Entities from GO, other ontologies, databases • Description is an OWL anonymous class expression (aka description) http://www.geneontology.org/GO.format.gaf-2_0.shtml
  • 13. “Classic” GO annotations are unconnected
      Terms shown: protein localization to nucleus [GO:0034504]; cellular response to oxidative stress [GO:0034599]; positive regulation of transcription from pol II promoter in response to oxidative stress [GO:0036091]
      DB | Object | Term | Ev | Ref
      PomBase | sty1 (SPAC24B11.06c) | GO:0034504 | IMP | PMID:9585505
      PomBase | sty1 (SPAC24B11.06c) | GO:0034599 | IMP | PMID:9585505
      PomBase | pap1 (SPAC1783.07c) | GO:0036091 | IMP | PMID:9585505
  • 14. Now with annotation extensions
      Diagram: sty1 is annotated to an <anonymous description>: protein localization to nucleus [GO:0034504] that happens during cellular response to oxidative stress [GO:0034599] and has input pap1. pap1 is annotated to an <anonymous description>: positive regulation of transcription from pol II promoter in response to oxidative stress [GO:0036091] that has a regulation target.
      DB | Object | Term | Ev | Ref | Extension
      PomBase | sty1 (SPAC24B11.06c) | GO:0034504 protein localization to nucleus | IMP | PMID:9585505 | happens_during(GO:0034599), has_input(SPAC1783.07c)
      PomBase | pap1 (SPAC1783.07c) | GO:0036091 | IMP | PMID:9585505 | has_regulation_target(…)
  • 15. PomBase web interface – sty1 http://www.pombase.org/spombe/result/SPAC24B11.06c
  • 17. Where do I get them? • Download – http://geneontology.org/GO.downloads.annotations.shtml • MGI (22,000) • GOA Human (4,200) • PomBase (1,588) • Search and Browsing – Cross-species • AmiGO 2 – http://amigo2.berkeleybop.org - poster#57 • QuickGO (later this year) - http://www.ebi.ac.uk/QuickGO/ – MOD interfaces • PomBase – http://pombase.org
  • 18. Query tool support: AmiGO 2 Annotation extensions make use of other ontologies • CHEBI • CL – cell types • Uberon – metazoan anatomy • MA – mouse anatomy • EMAP – mouse anatomy • …. CL – http://amigo2.berkeleybop.org
  • 21. Curation tool support • Supported in – Protein2GO (GOA, WormBase) [poster#97] – CANTO (PomBase) [poster#110] – MGI curation tool
  • 22. Analysis tool support • Currently: Enrichment tools do not yet support annotation extensions – Annotation extensions can be folded into an analysis ontology - http://galaxy.berkeleybop.org • Future: Analysis tools can use extended annotations to their benefit – E.g. account for other modes of regulation in their model – Tool developers: contact us!
  • 23. Challenge: pre vs post composition • Curator question: do I… – Request a pre-composed term via TermGenie[*]? – Post-compose using annotation extensions? See Heiko’s TermGenie talk tomorrow & poster #33
  • 24. Challenge: pre vs post composition • Curator question: do I… – Request a pre-composed term via TermGenie? – Post-compose using annotation extensions? • From a computational perspective: – It doesn’t matter, we’re using OWL – 40% of GO terms have OWL equivalence axioms • Example: protein localization to nucleus [GO:0034504] ≡ protein localization [GO:0008104] ⊓ end_location Nucleus [GO:0005634] http://code.google.com/p/owltools/wiki/AnnotationExtensionFolding
  • 25. Curation Challenges • Manual Curation – Fewer terms, but more degrees of freedom – Curator consistency • OWL constraints can help • Automated annotation – Phylogenetic propagation – Text processing and NLP
  • 26. Similar approaches and future directions • Post-composition has been used extensively for phenotype annotation – ZFIN [poster#95] – Phenoscape [next talk] • Future: – A more expressive model that bridges GO with pathway representations
  • 27. Conclusions • Description space is huge – Context is important – Not appropriate to make a term for everything – OWL allows us to mix and match pre and post composition • Number of extension annotations is growing • Annotation extensions represent untapped opportunity for tool developers
  • 28. Acknowledgments • GO Consortium, model organism and UniProtKB curators • GO Directors • PomBase developers: – Mark McDowell, Kim Rutherford • Funding – GO Consortium NIH 5P41HG002273-09 – UniProtKB GOA NHGRI U41HG006104-03 – British Heart Foundation grant SP/07/007/23671 – Kidney Research UK RP26/2008 – PomBase - Wellcome Trust WT090548MA – MGD NHGRI HG000330
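
The annotation model on slides 12 and 14 (a GO term plus zero or more relationships to other entities) can be sketched in a few lines of Python. This is only an illustration, not part of the talk: the class names `Extension` and `Annotation` and the function `parse_extensions` are hypothetical, and the parser assumes the extension column is a comma-separated list of `relation(ID)` expressions, as in the sty1 row on slide 14.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Extension:
    """One relationship to another entity, e.g. has_input(SPAC1783.07c)."""
    relation: str
    target: str


@dataclass(frozen=True)
class Annotation:
    """A gene product, a GO term, and zero or more extensions (hypothetical model)."""
    db: str
    gene_product: str
    term: str
    evidence: str
    reference: str
    extensions: tuple = ()


# Matches a single relation(ID) expression, allowing surrounding whitespace.
_EXTENSION = re.compile(r"\s*([A-Za-z_]+)\(([^()]+)\)\s*")


def parse_extensions(text):
    """Parse a comma-separated extension string into a tuple of Extension objects."""
    if not text.strip():
        return ()  # a "classic" annotation has no extensions
    parsed = []
    for part in text.split(","):
        match = _EXTENSION.fullmatch(part)
        if match is None:
            raise ValueError(f"not a relation(ID) expression: {part!r}")
        parsed.append(Extension(match.group(1), match.group(2).strip()))
    return tuple(parsed)


# The sty1 row from slide 14.
sty1 = Annotation(
    db="PomBase",
    gene_product="SPAC24B11.06c",
    term="GO:0034504",
    evidence="IMP",
    reference="PMID:9585505",
    extensions=parse_extensions("happens_during(GO:0034599), has_input(SPAC1783.07c)"),
)
```

An annotation with an empty `extensions` tuple corresponds to the simple pairwise model of slide 11, so both styles fit the same record type.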

Editor's notes

  1. 10 mins. GAF2.0
  2. 1
  3. Sweet spot in a large galaxy
  4. Not ad-hoc – OWL description
  5. Key point: logically equivalent to an annotation to a term in the <anon desc> box, with the same links out.
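
The equivalence described in note 5 and on slide 24 can be illustrated with a toy lookup. This is a minimal sketch under stated assumptions, not how OWLTools performs annotation extension folding (that relies on OWL reasoning): the `EQUIVALENCES` table and the `fold` function are hypothetical, and the table holds only the one axiom shown on slide 24.

```python
# Hypothetical table of equivalence axioms:
# (general term, set of (relation, target) pairs) -> pre-composed term.
# The single entry encodes the slide 24 example:
# GO:0034504 is equivalent to GO:0008104 with end_location GO:0005634.
EQUIVALENCES = {
    ("GO:0008104", frozenset({("end_location", "GO:0005634")})): "GO:0034504",
}


def fold(term, extensions):
    """Return the pre-composed term equivalent to term + extensions, or None.

    Only exact matches against the table are found; a reasoner would also
    handle subsumption, which this sketch does not attempt.
    """
    return EQUIVALENCES.get((term, frozenset(extensions)))


# Post-composed "protein localization" with an end location of "nucleus"
# folds to the pre-composed "protein localization to nucleus".
folded = fold("GO:0008104", [("end_location", "GO:0005634")])
```

When no axiom matches, `fold` returns `None` and the annotation stays post-composed.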