Functional Classification of
Environmental Reads using Gene Ontology




                       Daniel C. Richter
                       Daniel H. Huson

               Dept. Algorithms in Bioinformatics
                ZBIT Center for Bioinformatics
               University of Tuebingen, Germany

              www-ab.informatik.uni-tuebingen.de
Metagenomics - Workflow


                                           Environmental Sample


                                                                         Sequencing (Sanger/NGS)




               Who is out there?            How many are there?                         What are they doing?




       Taxonomical Analysis                 Quantitative Analysis                       Functional Analysis


                                                     MEGAN
                                                     Software
Daniel Richter – University of Tuebingen    Functional Metagenome Analysis                         Stockholm, 09/06/27 [01]
MEGAN – Taxonomical Analysis

     Precomputation:      Reads → BLAST (against nr, nt, ...)
     "Laptop analysis":   MEGAN

     NCBI Taxonomy
  • >460,000 taxa
  • Taxonomical ranks:
  Kingdom, Phylum, Class,
  Order, ..., Species

                                                      Huson et al., 2007, Genome Research
Functional Metagenome Analysis

     Extension of MEGAN to classify reads according to their function
 • Input: BLASTX result file → homology-based approach
 • Structured and interactive overview of gene products

     Gene Ontology (http://www.geneontology.org)
 • widely used in biological databases, gene expression
          and annotation studies
 • >27,000 GO terms (cross-species)
 • three structured vocabularies (ontologies), each organized as a DAG:
       molecular function
       biological process
       cellular component
Mapping BLAST Matches to GO Terms

    >gb|EAU86868.1| predicted protein [Coprinopsis cinerea okayama7#130]
    >emb|CAC86119.1| putative hexose-6-phosphate transporter [Listeria monocytogenes]
    >ref|ZP_00390013.1| Arabinose efflux permease [Bacillus anthracis str. A2012]




    ref2go map (>3.5 million entries)
    Source: UniProt mapping, http://pir.georgetown.edu/

        RefSeqID → GO Terms
        RefSeqID → GO Terms
        RefSeqID → GO Terms
        RefSeqID → GO Terms
        ...

    GO terms:
        GO:0044408
        GO:0043581
        GO:0032502
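
The lookup on this slide can be sketched roughly as follows. This is a minimal illustration and not MEGAN's implementation: the function names, the regular expression and the tiny `REF2GO` dictionary are assumptions made for this example, and the pairing of the accession with the three GO IDs is invented purely to show the data flow.

```python
import re
from typing import Dict, List, Optional, Tuple

# Matches BLAST headers of the form ">db|accession| description".
HEADER_RE = re.compile(r"^>(?P<db>[a-z]+)\|(?P<acc>[^|]+)\|\s*(?P<desc>.*)$")

# Hypothetical stand-in for the ref2go map (RefSeqID -> GO terms).
# The pairing below is made up for illustration only.
REF2GO: Dict[str, List[str]] = {
    "ZP_00390013.1": ["GO:0044408", "GO:0043581", "GO:0032502"],
}


def parse_header(line: str) -> Optional[Tuple[str, str, str]]:
    """Split a BLAST match header into (database tag, accession, description)."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        return None
    return match.group("db"), match.group("acc"), match.group("desc")


def go_terms_for_match(line: str, ref2go: Dict[str, List[str]]) -> List[str]:
    """Return the GO terms of a match, or [] if it is not a mapped RefSeq entry."""
    parsed = parse_header(line)
    if parsed is None:
        return []
    db, accession, _ = parsed
    if db != "ref":  # the map is keyed by RefSeq IDs only
        return []
    return ref2go.get(accession, [])


if __name__ == "__main__":
    headers = [
        ">gb|EAU86868.1| predicted protein [Coprinopsis cinerea okayama7#130]",
        ">ref|ZP_00390013.1| Arabinose efflux permease [Bacillus anthracis str. A2012]",
    ]
    for header in headers:
        print(header.split("|")[1], go_terms_for_match(header, REF2GO))
```

Matches without a RefSeq accession simply contribute no GO terms in this sketch; how the real tool treats them is not stated on the slide.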
Placing Reads onto GO Terms – LCA Approach

   Read → BLAST matches M0, M1, M2, M3, M4 → ref2go map → GO Terms, grouped by ontology:

       Molecular Function:    protein binding
       Biological Process:    response to stress, signal transduction, cell communication
       Cellular Component:    nucleus, cell part, cytosol

   Placement:    Molecular Function: ?    Biological Process: ?    Cellular Component: ?

Placing Reads onto GO Terms – LCA Approach

   [Figure: the same read, matches M0 to M4 and GO terms as on the previous slide; placement is still open ("?") for Biological Process and Cellular Component. Below the matches, three tree panels are drawn, with nodes listed here from top to bottom:
       Panel 1: root; cell communication; response to stress, signal transduction
       Panel 2: root; response to stress, signal transduction
       Panel 3: root; cellular process; response to stress, signal transduction]

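
To make the placement step concrete, here is a minimal sketch of a lowest-common-ancestor computation. It assumes a simplified hierarchy in which every term has exactly one parent, whereas the real Gene Ontology is a DAG; the `PARENT` table is a hypothetical toy hierarchy, not the actual GO structure, and the function names are chosen for this example only.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical toy hierarchy (child -> parent); NOT the real GO structure.
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "cellular process": "root",
    "cell communication": "cellular process",
    "signal transduction": "cell communication",
    "response to stimulus": "root",
    "response to stress": "response to stimulus",
}


def path_from_root(term: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the list of terms from the root down to `term`."""
    path: List[str] = []
    node: Optional[str] = term
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def lowest_common_ancestor(terms: Sequence[str],
                           parent: Dict[str, Optional[str]]) -> str:
    """Return the deepest term that lies on the root path of every input term."""
    if not terms:
        raise ValueError("at least one term is required")
    paths = [path_from_root(term, parent) for term in terms]
    lca = paths[0][0]  # all paths start at the root
    for level in zip(*paths):
        if all(node == level[0] for node in level):
            lca = level[0]
        else:
            break
    return lca


if __name__ == "__main__":
    read_terms = ["response to stress", "signal transduction", "cell communication"]
    print(lowest_common_ancestor(read_terms, PARENT))
```

In this toy hierarchy the three biological process terms of the read only share the root, which illustrates why reads with many different matches tend to end up on high-level terms.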
Benefits and Drawbacks of the LCA Algorithm
              Drawbacks:
                   • loss of accuracy: the LCA is never more specific than the matched GO terms
                   • might miss gene products of interest (losing the "big picture")
                   • reads with many different BLAST matches (= many GO terms)
                   are likely to be assigned to high-level GO terms

              Benefits:
              • complexity reduction facilitates analysis and visual inspection
              • memory efficient:
                       • need to store only three integers (GO IDs) per read
                       • applicable to large data sets: 5 million reads, 760 GB BLAST output
              • loss of accuracy ≠ loss of correctness (avoids false positives)
                                → balance between usability and accuracy

                                           Calculation example "Full Approach":

                                           1,000,000 reads
                                                      each read: 50 BLAST matches
                                                      each match: 10 GO terms
                                                      → 500,000,000 GO IDs
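
The arithmetic behind this comparison can be checked with a few lines. The read, match and term counts are the ones given on the slide; the 4 bytes per GO ID is an assumption made here (a 32-bit integer) and is not stated in the talk.

```python
BYTES_PER_ID = 4  # assumption: each GO ID is stored as a 32-bit integer


def full_approach_ids(reads: int, matches_per_read: int, terms_per_match: int) -> int:
    """Number of GO IDs stored if every term of every match is kept."""
    return reads * matches_per_read * terms_per_match


def lca_approach_ids(reads: int, ontologies: int = 3) -> int:
    """Number of GO IDs stored if each read keeps one LCA per ontology."""
    return reads * ontologies


if __name__ == "__main__":
    reads = 1_000_000
    full = full_approach_ids(reads, matches_per_read=50, terms_per_match=10)
    lca = lca_approach_ids(reads)
    print(f"full approach: {full:,} IDs, about {full * BYTES_PER_ID / 1e9:.1f} GB")
    print(f"LCA approach:  {lca:,} IDs, about {lca * BYTES_PER_ID / 1e6:.0f} MB")
```

Under these assumptions the full approach stores roughly 167 times as many IDs as the LCA approach.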
GO Analyzer – Main Window

   [Figure: sequence of main window views; the last one is labeled "Extract reads"]

GO Analyzer – Path Highlighting




GO Analyzer – GO Slims

 Gene Ontology provides subsets of GO terms
          → useful for high level view of the three ontologies




 http://www.geneontology.org/GO.slims.shtml




                                     Design your own metagenomic GO slim...
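
One way to picture how a custom slim could be applied: each GO term is replaced by the nearest slim terms found when walking up the ontology. The sketch below is only an illustration of that idea; the `PARENTS` graph and the chosen slim set are hypothetical toy data, and the function name is not taken from MEGAN or from the Gene Ontology tools.

```python
from typing import Dict, List, Set

# Hypothetical toy DAG (child -> list of parents); NOT the real GO structure.
PARENTS: Dict[str, List[str]] = {
    "signal transduction": ["cell communication"],
    "cell communication": ["cellular process"],
    "cellular process": ["biological process"],
    "response to stress": ["response to stimulus"],
    "response to stimulus": ["biological process"],
    "biological process": [],
}


def map_to_slim(term: str, slim: Set[str], parents: Dict[str, List[str]]) -> Set[str]:
    """Return the nearest slim terms reachable from `term` along each upward path."""
    found: Set[str] = set()
    seen: Set[str] = set()
    stack = [term]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        if node in slim:
            found.add(node)  # stop climbing on this path
            continue
        stack.extend(parents.get(node, []))
    return found


if __name__ == "__main__":
    my_slim = {"cellular process", "response to stimulus"}
    for go_term in ["signal transduction", "response to stress"]:
        print(go_term, "->", sorted(map_to_slim(go_term, my_slim, PARENTS)))
```

A term with no slim ancestor maps to the empty set here; whether such reads should be dropped or collected under the ontology root is a design choice left open in this sketch.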
GO Analyzer – Comparison View

   [Figures: GO Analyzer comparison view]

GO Analyzer – Summary


   • New module of MEGAN 4 to conduct functional analyses on environmental reads
        "BLAST only once, perform taxonomical and functional analysis in one step"
   • Homology-based approach
   • Overview tool: visual and interactive exploration of gene products
   • Inspection, extraction and chart features
   • Comparative mode




         Installers for all operating systems will be available from:

             http://www-ab.informatik.uni-tuebingen.de/software/megan

Functional Metagenome Analysis using Gene Ontology (MEGAN 4)

  • 1. Functional Classification of Environmental Reads using Gene Ontology Daniel C. Richter Daniel H. Huson Dept. Algorithms in Bioinformatics ZBIT Center for Bioinformatics University of Tuebingen, Germany www-ab.informatik.uni-tuebingen.de
  • 2. Metagenomics - Workflow: Environmental Sample → Sequencing (Sanger/NGS) → three questions, each with its analysis: Who is out there? (Taxonomical Analysis) How many are there? (Quantitative Analysis) What are they doing? (Functional Analysis). MEGAN Software. Daniel Richter – University of Tuebingen Functional Metagenome Analysis Stockholm, 09/06/27 [01]
  • 4. MEGAN – Taxonomical Analysis. Precomputation: Reads → BLAST against nr, nt, ... Then "Laptop Analysis" with MEGAN. NCBI Taxonomy: >460,000 taxa; taxonomical ranks: Kingdom, Phylum, Class, Order, ..., Species. Huson et al., 2007, Genome Research
  • 5. Functional Metagenome Analysis. Extension of MEGAN to classify reads according to their function. Input: BLASTX result file → homology-based approach. Structured and interactive overview of gene products. Gene Ontology (http://www.geneontology.org): widely used in biological databases, gene expression and annotation studies; >27,000 GO terms (cross-specific), organized as a DAG; three structured vocabularies (ontologies): molecular function, biological process, cellular component
  • 6. Mapping BLAST Matches to GO Terms. Example BLAST match headers: >gb|EAU86868.1| predicted protein [Coprinopsis cinerea okayama7#130]; >emb|CAC86119.1| putative hexose-6-phosphate transporter [Listeria monocytogenes]; >ref|ZP_00390013.1| Arabinose efflux permease [Bacillus anthracis str. A2012]. ref2go map: RefSeqID → GO Terms, >3.5 million entries, built using the RefSeqID → UniProt mapping (http://pir.georgetown.edu/). Example GO terms: GO:0044408, GO:0043581, GO:0032502
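The lookup on slide 6 can be sketched roughly as follows. This is a minimal illustration, not MEGAN's actual code: the function name `go_terms_for_match`, the regular expression, and the dictionary layout are assumptions, and the association between the accession and the GO IDs is a placeholder rather than a real annotation.

```python
import re

# Hypothetical one-entry excerpt of a ref2go map (RefSeqID -> GO terms).
# The GO IDs are the ones shown on the slide, but pairing them with this
# accession is a placeholder for illustration, not a real annotation.
REF2GO = {
    "ZP_00390013.1": ["GO:0044408", "GO:0043581", "GO:0032502"],
}

# Matches headers of the form ">db|accession| description".
HEADER = re.compile(r"^>(?P<db>\w+)\|(?P<acc>[^|]+)\|\s*(?P<desc>.*)$")


def go_terms_for_match(header, ref2go=REF2GO):
    """Return the GO terms recorded for the accession in a BLAST match header.

    Headers that cannot be parsed, or accessions missing from the map,
    yield an empty list.
    """
    m = HEADER.match(header)
    if m is None:
        return []
    return ref2go.get(m.group("acc"), [])


if __name__ == "__main__":
    hit = ">ref|ZP_00390013.1| Arabinose efflux permease [Bacillus anthracis str. A2012]"
    print(go_terms_for_match(hit))
```

Matches whose accession is absent from the map simply contribute no GO terms in this sketch; how the real tool treats such matches is not stated on the slides.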
  • 7. Placing Reads onto GO Terms – LCA Approach. Pipeline: Read → BLAST → ref2go map → GO Terms, with one column per ontology (Molecular Function, Biological Process, Cellular Component). The BLAST matches M0–M4 yield the GO terms protein binding (molecular function); response to stress, signal transduction, cell communication (biological process); nucleus, cell part, cytosol (cellular component). Placement: ? ? ?
  • 8. Placing Reads onto GO Terms – LCA Approach. Same read, matches M0–M4 and GO terms as on slide 7. Placement: ? ?
  • 9. Placing Reads onto GO Terms – LCA Approach. Same read, matches M0–M4 and GO terms as on slide 7. Placement: ? ? Tree fragments shown for the biological process terms: root, cellular process, cell communication, response to stress, signal transduction
  • 10. Placing Reads onto GO Terms – LCA Approach. Same read, matches M0–M4 and GO terms as on slide 7. Placement: ? Tree fragments shown for the biological process terms: root, cellular process, cell communication, response to stress, signal transduction
  • 11. Placing Reads onto GO Terms – LCA Approach. Same read, matches M0–M4 and GO terms as on slide 7. Placement: [shown graphically on the slide]
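The placement idea on slides 7 to 11 can be sketched as a lowest-common-ancestor computation within one ontology. The hierarchy below is a toy built from the term names on slide 9, and its parent links are an assumption. Because GO is a DAG, a term can have several parents, so this sketch takes the deepest term shared by all ancestor sets; the slides do not say how MEGAN resolves such cases.

```python
# Toy child -> parents map using the term names from slide 9.
# The parent links are an assumption made for this example.
PARENTS = {
    "root": [],
    "cellular process": ["root"],
    "cell communication": ["cellular process"],
    "signal transduction": ["cell communication"],
    "response to stress": ["root"],
}


def ancestors(term, parents=PARENTS):
    """Return the term itself plus every term reachable via parent links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def depth(term, parents=PARENTS):
    """Length of the longest path from the root down to the term."""
    ps = parents[term]
    return 0 if not ps else 1 + max(depth(p, parents) for p in ps)


def lca(terms, parents=PARENTS):
    """Deepest term that is an ancestor of (or equal to) every given term."""
    common = set.intersection(*(ancestors(t, parents) for t in terms))
    # Ties on depth are broken by name so the result is deterministic.
    return max(common, key=lambda t: (depth(t, parents), t))


if __name__ == "__main__":
    print(lca(["signal transduction", "cell communication"]))
    print(lca(["response to stress", "signal transduction", "cell communication"]))
```

In this toy hierarchy the three biological process terms from the slide share only the root, which illustrates why reads with many different matches tend to end up on high-level terms.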
  • 12. Benefits and Drawbacks of the LCA Algorithm. Drawbacks: loss of accuracy (the LCA is always less specific); might miss gene products of interest (losing the "big picture"); reads with many different BLAST matches (= many GO terms) are likely to be assigned to high-level GO terms. Benefits: complexity reduction facilitates analysis and visual inspection; memory efficient (only three integers, i.e. GO IDs, need to be stored per read); applicable to large data sets (5 million reads, 760 GB BLAST output); loss of accuracy ≠ loss of correctness (avoids false positives). → balance between usability and accuracy. Calculation example for the "Full Approach": 1,000,000 reads × 50 BLAST matches per read × 10 GO terms per match → 500,000,000 GO IDs
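The calculation example on slide 12 can be checked with a few lines of arithmetic. The figures for the full approach are the ones from the slide; the comparison against three stored GO IDs per read follows from the "three integers per read" bullet. Variable names are illustrative.

```python
# Figures taken from the slide's calculation example.
reads = 1_000_000
matches_per_read = 50
go_terms_per_match = 10

# "Full Approach": keep every GO ID of every match of every read.
full_ids = reads * matches_per_read * go_terms_per_match

# LCA approach: three integers (GO IDs) per read, as stated on the slide.
lca_ids = reads * 3

print(f"full approach: {full_ids:,} GO IDs")
print(f"LCA approach:  {lca_ids:,} GO IDs")
print(f"reduction:     about {full_ids / lca_ids:.0f}x fewer IDs to store")
```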
  • 13. GO Analyzer – Main Window
  • 14. GO Analyzer – Main Window
  • 15. GO Analyzer – Main Window
  • 16. GO Analyzer – Main Window
  • 17. GO Analyzer – Main Window
  • 18. GO Analyzer – Main Window
  • 19. GO Analyzer – Main Window
  • 20. GO Analyzer – Main Window: Extract reads
  • 21. GO Analyzer – Path Highlighting
  • 22. GO Analyzer – GO Slims. Gene Ontology provides subsets of GO terms (GO slims) → useful for a high-level view of the three ontologies. http://www.geneontology.org/GO.slims.shtml Design your own metagenomic GO slim...
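One way to picture how a custom GO slim is used: each term is replaced by the nearest slim term(s) at or above it in the hierarchy. The sketch below is an assumption-laden illustration only. The toy hierarchy, the slim set and the function `map_to_slim` are invented for this example, and the slides do not describe how the GO Analyzer performs this step.

```python
# Toy child -> parents map (hypothetical, reusing term names from the slides).
PARENTS = {
    "root": [],
    "cellular process": ["root"],
    "cell communication": ["cellular process"],
    "signal transduction": ["cell communication"],
    "response to stress": ["root"],
}

# Hypothetical user-defined slim: the subset of terms to keep in the overview.
SLIM = {"root", "cellular process", "response to stress"}


def map_to_slim(term, parents=PARENTS, slim=SLIM):
    """Return the nearest slim terms at or above `term`.

    Climbs the parent links and stops on each path as soon as a slim
    term is reached.
    """
    found = set()
    seen = set()
    frontier = [term]
    while frontier:
        current = frontier.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in slim:
            found.add(current)  # stop climbing along this path
        else:
            frontier.extend(parents[current])
    return found


if __name__ == "__main__":
    print(map_to_slim("signal transduction"))
    print(map_to_slim("response to stress"))
```

Because a term in a DAG can reach slim terms along several paths, the function returns a set rather than a single term.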
  • 23. GO Analyzer – Comparison View
  • 24. GO Analyzer – Comparison View
  • 25. GO Analyzer – Comparison View
  • 26. GO Analyzer – Comparison View
  • 27. GO Analyzer – Summary. New module of MEGAN 4 to conduct functional analyses on environmental reads: "BLAST only once, perform taxonomical and functional analysis in one step". Homology-based approach. Overview tool: visual and interactive exploration of gene products. Inspection, extraction and chart features. Comparative mode. Installers for all operating systems will be available from: http://www-ab.informatik.uni-tuebingen.de/software/megan