Genomics data integration, analysis, and curation using MycoCosm
Robert Riley, Igor Shabalov, Igor Grigoriev
3 April 2012
US DOE Joint Genome Institute
Mission: Genomics user facility in support of DOE missions in bioenergy, carbon cycling, and biogeochemistry
Programs: Fungal, Plant, Microbial, & Metagenomics
Genome sequencing using latest technology: Illumina HiSeq2000, PacBio
Example: JGI fungal sequencing
Fungi in bioenergy
Lignocellulose -> biofuel precursors
White rot (Phanerochaete chrysosporium): degrade lignin & cellulose
Brown rot (Postia placenta): degrade cellulose
Data integration: MycoCosm
MycoCosm (jgi.doe.gov/fungi): 120+ genomes; 5K visitors/month
Views: Comparative View, Genome-Centric View
Data Integration: External Links
From GenBank to MycoCosm
Curation: user annotation
Users may create improved gene models
Your name here!
Curation: user annotation
Figure: Statistics on manual curation of gene models, by increasing number of curations. Axes: Organism; Curated genes; Curators.
Curation: user annotation
Example: user finds a more sensible gene model
Figure panels: Cluster viewer; Transcript page; Protein page; Compare to ESTs; Promote to gene catalog
Analysis: genome-centric view
Tracks: GC content; Sequence conservation; Gene catalog; ESTs; PFAM domains; BLAST hits; Alternate gene models
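As an illustration of what a GC content track summarizes, here is a minimal sketch that computes the GC fraction over fixed-size windows of a DNA sequence. The function name, window size, and step are assumptions chosen for this example; the slides do not describe how MycoCosm computes its track.

```python
def gc_content_windows(sequence, window=100, step=100):
    """Return (start, gc_fraction) for each full window along the sequence.

    Hypothetical helper for illustration only; window and step are
    arbitrary defaults, not values taken from MycoCosm.
    """
    seq = sequence.upper()
    results = []
    # Only full windows are reported; a trailing partial window is skipped.
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = sum(1 for base in chunk if base in "GC")
        results.append((start, gc / window))
    return results


# Toy input: the first window is all G/C, the second all A/T.
print(gc_content_windows("GGCCAATT", window=4, step=4))
# [(0, 1.0), (4, 0.0)]
```

Plotting the returned fractions against the window start positions gives a profile comparable in spirit to a genome browser GC track.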
Analysis: comparative view
Analysis: evolution of lignocellulose degradation
Figure panels: CAZy genes; Oxidoreductase genes; CAZy and lignin-degrading genes
Sources: Eastwood et al. Science 2011; Riley et al. in prep
Summary
MycoCosm
• Integrates functional and comparative
  genomic data and analytical tools for
  energy and environment fungi
• Offers tools for community annotation,
  data repository, and manual curation
• Facilitates comparative genome
  analysis
Acknowledgments

Igor V. Grigoriev   Robert Otillar
Henrik Nordberg     Alex Poliakov
Igor Shabalov       Igor Ratnere
Andrea Aerts        Frank Korzeniewski
Mike Cantor         Xueling Zhao
David Goodstein     Tatyana Smirnova
Alan Kuo            Daniel Rokhsar
Simon Minovitsky    Inna Dubchak
Roman Nikitin
Robin A. Ohm
