The Importance of History
      (and other obsessions)

        Jonathan A. Eisen
           UC Davis

Talk for Lake Arrowhead Microbial
   Genomes 2010 (#LAMG10)
Social Networking in Science
Bacteria evolve
Evolution of Lake Arrowhead
Blast Peptide
LAKEARROWHEAD
Homework

• Do blastp search with other famous people
  associated with Lake Arrowhead Meeting
• JEFFREYHMILLER
• SARAHPALIN and her relationship to fungi
  B. fuckeliana
• see http://phylogenomics.blogspot.com/2008/09/tracing-evolutionary-history-of-sarah.html
2010
2008
2006
2004
No
2002
Wayback Machine
2002
Quotes 2004
• Space-time continuum of genes and genomes
• Gene sequences are the wormhole that allows
  one to tunnel into the past
• The human mind can conceive of things with no
  basis in physical reality
• Thoughts can go faster than the speed of light
Quotes 2006

• The human guts are a real milieu of stuff
• You better kiss everybody
• Microbes not only have a lot of sex, they have a
  lot of weird sex
• This is how you do metagenomics on 50
  dollars, and that’s Canadian dollars
Quotes 2008
• Antibiotics do not kill things, they corrupt them
• There comes a point in life when you have to bring
  chemists into the picture
• The rectal swabs are here in tan color
• And there's Jeffrey Dahmer
• We are the environment. We live the phenotype.
• If I have time I will tell you about a dream
• A paper came out next year
Quotes 2010
•   We have been using this word for many years without actually realizing it
    was correct
•   "Another thing you need to know" (pause) "Actually you don't NEED to
    know any of this"
•   I have been influenced by Fisher Price throughout my life
•   Don't take that away from us
•   It takes 1000 nanobiologists to make one microbiologist
•   I am going to wrap up as I hear the crickets chirping
•   And we will bring out the unused cheese from yesterday
•   In an engineering sense, the vagina is a simple plug flow reactor
•   This is going to be ironic coming from someone who studies circumcision
•   A little bit about time, but I am going to spend a lot less time on time than
    on space
Keywords I remember from 2010
• Penis
• Vagina
• Anthrax
• Acne
• Ulcer (multiple kinds)
• Global warming
• Antibiotic resistance
• Virulence

rRNA Tree of Life
[Figure: tree with three domains labeled Bacteria, Archaea, Eukaryotes]
Figure from Barton, Eisen et al. “Evolution”, CSHL Press.
Based on tree from Pace NR, 2003.
2002
• At least 40 phyla of bacteria
[Figure: tree of bacterial phyla, based on Hugenholtz, 2002. Phyla shown:
 Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira,
 Bacteroides, Chlorobi, Fibrobacteres, Marine Group A, WS3, Gemmimonas,
 Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes,
 Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3,
 Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia,
 Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae,
 Thermodesulfobacteria, Thermotogae, OP1, OP11]
2002
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
[Figure: tree of bacterial phyla, Proteobacteria through OP11. Based on Hugenholtz, 2002]
2002
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
[Figure: tree of bacterial phyla, Proteobacteria through OP11. Based on Hugenholtz, 2002]
Why Increase Phylogenetic Coverage?
  • Common approach within some eukaryotic
    groups (FGP, NHGRI, etc)
  • Many successful small projects to fill in
    bacterial or archaeal gaps
  • Phylogenetic gaps in bacterial and archaeal
    projects commonly lamented in literature
  • Many potential benefits
• NSF-funded Tree of Life Project
• A genome from each of eight phyla
Eisen & Ward, PIs
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Solution I: sequence more phyla
[Figure: tree of bacterial phyla, Proteobacteria through OP11]
• NSF-funded Tree of Life Project
• A genome from each of eight phyla
Eisen & Ward, PIs
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Still highly biased in terms of the tree
[Figure: tree of bacterial phyla, Proteobacteria through OP11]
Major Lineages of Actinobacteria
2.5          Actinobacteria
2.5.1        Acidimicrobidae
2.5.1.1      Unclassified
2.5.1.2      "Microthrixineae"
2.5.1.3      Acidimicrobineae
2.5.1.3.1    Unclassified
2.5.1.3.2    Acidimicrobiaceae
2.5.1.4      BD2-10
2.5.1.5      EB1017
2.5.2        Actinobacteridae
2.5.2.1      Unclassified
2.5.2.10     Ellin306/WR160
2.5.2.11     Ellin5012
2.5.2.12     Ellin5034
2.5.2.13     Frankineae
2.5.2.13.1   Unclassified
2.5.2.13.2   Acidothermaceae
2.5.2.13.3   Ellin6090
2.5.2.13.4   Frankiaceae
2.5.2.13.5   Geodermatophilaceae
2.5.2.13.6   Microsphaeraceae
2.5.2.13.7   Sporichthyaceae
2.5.2.14     Glycomyces
2.5.2.15     Intrasporangiaceae
2.5.2.15.1   Unclassified
2.5.2.15.2   Dermacoccus
2.5.2.15.3   Intrasporangiaceae
2.5.2.16     Kineosporiaceae
2.5.2.17     Microbacteriaceae
2.5.2.17.1   Unclassified
2.5.2.17.2   Agrococcus
2.5.2.17.3   Agromyces
2.5.2.18     Micrococcaceae
2.5.2.19     Micromonosporaceae
2.5.2.2      Actinomyces
2.5.2.20     Propionibacterineae
2.5.2.20.1   Unclassified
2.5.2.20.2   Kribbella
2.5.2.20.3   Nocardioidaceae
2.5.2.20.4   Propionibacteriaceae
2.5.2.21     Pseudonocardiaceae
2.5.2.22     Streptomycineae
2.5.2.22.1   Unclassified
2.5.2.22.2   Kitasatospora
2.5.2.22.3   Streptacidiphilus
2.5.2.23     Streptosporangineae
2.5.2.23.1   Unclassified
2.5.2.23.2   Ellin5129
2.5.2.23.3   Nocardiopsaceae
2.5.2.23.4   Streptosporangiaceae
2.5.2.23.5   Thermomonosporaceae
2.5.2.3      Actinomycineae
2.5.2.4      Actinosynnemataceae
2.5.2.5      Bifidobacteriaceae
2.5.2.6      Brevibacteriaceae
2.5.2.7      Cellulomonadaceae
2.5.2.8      Corynebacterineae
2.5.2.8.1    Unclassified
2.5.2.8.2    Corynebacteriaceae
2.5.2.8.3    Dietziaceae
2.5.2.8.4    Gordoniaceae
2.5.2.8.5    Mycobacteriaceae
2.5.2.8.6    Rhodococcus
2.5.2.8.7    Rhodococcus
2.5.2.8.8    Rhodococcus
2.5.2.9      Dermabacteraceae
2.5.2.9.1    Unclassified
2.5.2.9.2    Brachybacterium
2.5.2.9.3    Dermabacter
2.5.3        Coriobacteridae
2.5.3.1      Unclassified
2.5.3.2      Atopobiales
2.5.3.3      Coriobacteriales
2.5.3.4      Eggerthellales
2.5.4        OPB41
2.5.5        PK1
2.5.6        Rubrobacteridae
2.5.6.1      Unclassified
2.5.6.2      "Thermoleiphilaceae"
2.5.6.2.1    Unclassified
2.5.6.2.2    Conexibacter
2.5.6.2.3    XGE514
2.5.6.3      MC47
2.5.6.4      Rubrobacteraceae
• NSF-funded Tree of Life Project
• A genome from each of eight phyla
Eisen & Ward, PIs
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Same trend in Archaea
[Figure: tree of bacterial phyla, Proteobacteria through OP11]
• NSF-funded Tree of Life Project
• A genome from each of eight phyla
Eisen & Ward, PIs
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Same trend in Eukaryotes
[Figure: tree of bacterial phyla, Proteobacteria through OP11]
• NSF-funded Tree of Life Project
• A genome from each of eight phyla
Eisen & Ward, PIs
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Same trend in Viruses
[Figure: tree of bacterial phyla, Proteobacteria through OP11]
• GEBA
• A genomic encyclopedia of bacteria and archaea
Eisen & Ward, PIs
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Solution: Really Fill in the Tree
[Figure: tree of bacterial phyla, Proteobacteria through OP11]
GEBA Pilot Project Overview
• Identify major branches in rRNA tree for
  which no genomes are available
• Identify those with a cultured representative in
  DSMZ
• DSMZ grew > 200 of these and prepped DNA
• Sequence and finish 100+ (covering breadth of
  bacterial/archaeal diversity)
• Annotate, analyze, release data
• Assess benefits of tree guided sequencing
• 1st paper Wu et al in Nature Dec 2009
GEBA Pilot Project: Components
• Project overview (Phil Hugenholtz, Nikos Kyrpides, Jonathan Eisen,
  Eddy Rubin, Jim Bristow, Tanya Woyke)
• Project management (David Bruce, Eileen Dalin, Lynne Goodwin)
• Culture collection and DNA prep (DSMZ, Hans-Peter Klenk)
• Sequencing and closure (Eileen Dalin, Susan Lucas, Alla Lapidus, Mat
  Nolan, Alex Copeland, Cliff Han, Feng Chen, Jan-Fang Cheng)
• Annotation and data release (Nikos Kyrpides, Victor Markowitz, et al)
• Analysis (Dongying Wu, Kostas Mavrommatis, Martin Wu, Victor
  Kunin, Neil Rawlings, Ian Paulsen, Patrick Chain, Patrik D’Haeseleer,
  Sean Hooper, Iain Anderson, Amrita Pati, Natalia N. Ivanova,
  Athanasios Lykidis, Adam Zemla)
• Adopt a microbe education project (Cheryl Kerfeld)
• Outreach (David Gilbert)
• $$$ (DOE, DSMZ, GBMF)
GEBA and Openness
• All data released as quickly as possible w/ no restrictions to
  IMG-GEBA, GenBank, etc.
• Data also available in Biotorrents (http://biotorrents.net)
• Individual genome reports published in OA “Standards in
  Genomic Sciences (SIGS)”
• 1st GEBA paper in Nature freely available and published using
  Creative Commons License
GEBA Lesson 1

rRNA Tree is Useful for Identifying
Phylogenetically Novel Organisms



rRNA Tree of Life
[Figure: tree with three domains labeled Bacteria, Archaea, Eukaryotes]
Figure from Barton, Eisen et al. “Evolution”, CSHL Press.
Based on tree from Pace NR, 2003.
Network of Life?
[Figure: tree with three domains labeled Bacteria, Archaea, Eukaryotes]
Figure from Barton, Eisen et al. “Evolution”, CSHL Press.
Based on tree from Pace NR, 2003.
Compare PD in rRNA and WGT
PD of rRNA, Genome Trees Similar




From Wu et al. 2009 Nature 462, 1056-1060
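A minimal sketch of what "PD" (phylogenetic diversity) measures, assuming the common definition: the summed length of all branches connecting a set of taxa back to the root. The slides do not give the exact procedure used in the paper, so the tree encoding, the function name, and the toy branch lengths below are all hypothetical.

```python
def phylogenetic_diversity(parent, tips):
    """Sum the lengths of all distinct branches on the paths from `tips` to the root.

    `parent` maps each non-root node to a (parent_node, branch_length) pair.
    """
    counted = set()   # nodes whose branch to the parent was already added
    total = 0.0
    for tip in tips:
        node = tip
        # Walk toward the root, stopping at the root or at a branch already counted.
        while node in parent and node not in counted:
            counted.add(node)
            up, length = parent[node]
            total += length
            node = up
    return total


# Hypothetical toy tree: ((A:1.0, B:1.0):0.5, C:3.0)
toy_tree = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 0.5),
    "C": ("root", 3.0),
}

pd_close = phylogenetic_diversity(toy_tree, ["A", "B"])    # two close relatives
pd_distant = phylogenetic_diversity(toy_tree, ["A", "C"])  # two distant relatives
print(pd_close, pd_distant)
```

In this toy tree, picking a distant relative (C) adds more branch length than picking a close one (B), which is the intuition behind choosing genomes by their position in the tree.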
GEBA Lesson 2

Phylogeny-driven genome selection
helps discover new genetic diversity
Network of Life?
[Figure: tree with three domains labeled Bacteria, Archaea, Eukaryotes]
Figure from Barton, Eisen et al. “Evolution”, CSHL Press.
Based on tree from Pace NR, 2003.
Protein Family Rarefaction
               Curves
• Take data set of multiple complete genomes
• Identify all protein families using MCL
• Plot # of genomes vs. # of protein families
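The three steps above can be sketched roughly as follows. This assumes the protein families have already been assigned (for example by a clustering step such as MCL) and are available as one set of family IDs per genome. The function name, the permutation averaging, and the toy data are illustrative assumptions, not the analysis actually used for the talk.

```python
import random


def rarefaction_curve(genome_families, n_permutations=100, seed=0):
    """Mean number of distinct protein families seen after adding k genomes.

    `genome_families` maps a genome name to the set of family IDs it contains.
    Genome order is shuffled `n_permutations` times and the cumulative family
    counts are averaged, so the curve does not depend on one arbitrary ordering.
    """
    rng = random.Random(seed)
    genomes = list(genome_families)
    totals = [0] * len(genomes)
    for _ in range(n_permutations):
        rng.shuffle(genomes)
        seen = set()
        for k, genome in enumerate(genomes):
            seen |= genome_families[genome]   # add this genome's families
            totals[k] += len(seen)
    return [t / n_permutations for t in totals]


# Hypothetical toy input: three genomes with partly overlapping families.
toy = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam2", "fam3", "fam4"},
    "genome_C": {"fam5", "fam6"},
}

curve = rarefaction_curve(toy)
for n_genomes, n_families in enumerate(curve, start=1):
    print(n_genomes, round(n_families, 2))
```

A curve that keeps climbing as genomes are added suggests that new families are still being discovered. Plotting `curve` against the number of genomes gives the figure described in the last bullet.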
Synapomorphies exist
Phylogenetic Distribution Novelty:
               Bacterial Actin Related Protein
                                                                           2"#3)&4&*&& !"#*)$*),+%
                                                                           5"#$-.-6&0&1- !"#$%,$-%)(
                                                                          7"#0(1.8-9& !"#$''+-+,',!
                                                                          5"#:1,)*&$/0 !"#&$,%+)+-+                                   ! " #$%
                                                                            !"#$%&'()*&& !"#$%&'(%()
                                                                    ((      +"#,-.(/01 !"#*+,**'+(
                                                                         ;"#01,&-*0 !"#%*+$--(
                                                                        <"#$-.-3.1%&0 !"#%',&'-+)
                                                                        ')     2"#$&*-.-1 !"#$'(-%%+&$
                                                                                  ="#$.1001 !"#-*$+$(&(                                 ! &’ (
                                                                      $++          >"#0$1,/%1.&0 !"#&$**+),)-!
                                                               *$          $++ ;"#01,&-*0 !"#*+,$*'(
                                                                                '*        5"#:1,)*&$/0 !"#&$,%+%-%%
                                                                             $++         5"#$-.-6&0&1- !"#',&+$)*
                                                                                                                                       ! &’ )
                                                                                         ?"#@-%1*)A10(-. !"#&%'%&*%*
                                                                                $++ B"#A1%%/0# "#%*,-&*'(
                                                                                    )*     2"#*-)').@1*0 !"#*-&'''(+
                                                                                            5"#$-.-6&0&1- !"#',&&*&*                   ! &’ *

Haliangium ochraceum DSM 14365
Patrik D’haeseleer, Adam Zemla, Victor Kunin

See also Guljamow et al. 2007 Current Biology.
GEBA Lesson 3

Phylogeny-driven genome selection improves genome annotation
Most/All Functional Prediction Improves w/ Better Phylogenetic Sampling
  • Took 56 GEBA genomes and compared results vs. 56 randomly sampled new genomes
  • Better definition of protein family sequence “patterns”
  • Greatly improves “comparative” and “evolutionary” based predictions
  • Conversion of hypotheticals into conserved hypotheticals
  • Linking distantly related members of protein families
  • Improved non-homology prediction
Kostas Mavrommatis, Natalia Ivanova, Thanos Lykidis, Nikos Kyrpides, Iain Anderson
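
The "hypotheticals into conserved hypotheticals" bullet can be illustrated with a toy sketch. It assumes the common annotation convention that an uncharacterized protein is called "hypothetical" when no homolog is found in other genomes and "conserved hypothetical" once at least one is found. The function name, data layout, and `min_genomes` threshold are hypothetical and are not taken from the GEBA analysis itself.

```python
def label_uncharacterized(protein_hits, reference_genomes, min_genomes=1):
    """Label proteins of unknown function from their homolog hits.

    protein_hits: dict mapping protein ID -> set of genome names with a homolog.
    reference_genomes: set of genome names available for comparison.
    min_genomes: assumed threshold for calling a protein "conserved".
    """
    labels = {}
    for protein, genomes_with_hit in protein_hits.items():
        # Only hits in genomes we actually have sequenced can be observed.
        seen = genomes_with_hit & reference_genomes
        if len(seen) >= min_genomes:
            labels[protein] = "conserved hypothetical"
        else:
            labels[protein] = "hypothetical"
    return labels


# Toy data (invented for illustration): p1 has a homolog only in genome "C".
hits = {"p1": {"C"}, "p2": set(), "p3": {"A", "C"}}

# Sparse sampling misses genome "C"; broader sampling includes it.
sparse = label_uncharacterized(hits, reference_genomes={"A", "B"})
broad = label_uncharacterized(hits, reference_genomes={"A", "B", "C"})
```

In this toy case `p1` changes label only because genome "C" was added to the reference set, which is the effect the bullet describes.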
GEBA Lesson 4

Metadata and individual genome papers important
SIGS
http://standardsingenomics.org/
GEBA Lesson 5

Phylogeny-driven genome selection improves analysis of metagenome data
Uses of phylogenetic classification in metagenomics
  • Assigning reads to phylogenetic groups using multiple genes
  • Phylogenetic binning
  • Phylogenetic ecology - especially important if no reference genomes
[Figure: "Sargasso Phylotypes" bar charts. Axes: Major Phylogenetic Group vs. Weighted % of Clones. Legend lists marker genes: frr, tsf, pgk, rplL, rplF, rplP, rplT, rplE, infC, rpsI, rplS, rplA, rplB, rplK, rplC, rpsJ, rplN, rplD, rplM, rpsE, rpsS, rpsB, rpsK, rpsC, rpoB, rpsM, pyrG, nusA, dnaG, rpmA, smpB.]
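
Assigning reads to phylogenetic groups with several marker genes gives, for each gene, a distribution of reads over groups, which is the kind of quantity a chart like this plots. Below is a minimal sketch of that tally. It assumes each read has already been given a (marker gene, group) pair by some upstream step. The function name and the toy assignments are hypothetical, and the sketch computes plain read fractions, not whatever weighting underlies the figure's "Weighted % of Clones" axis.

```python
from collections import Counter, defaultdict


def group_fractions(assignments):
    """Return {gene: {group: fraction of that gene's reads in the group}}."""
    counts = defaultdict(Counter)
    for gene, group in assignments:
        counts[gene][group] += 1

    fractions = {}
    for gene, group_counts in counts.items():
        total = sum(group_counts.values())
        fractions[gene] = {g: n / total for g, n in group_counts.items()}
    return fractions


# Toy assignments (invented): one (marker gene, phylogenetic group) per read.
reads = [
    ("rpoB", "Alphaproteobacteria"),
    ("rpoB", "Alphaproteobacteria"),
    ("rpoB", "Cyanobacteria"),
    ("rpoB", "Gammaproteobacteria"),
    ("rplB", "Alphaproteobacteria"),
    ("rplB", "Cyanobacteria"),
]
fractions = group_fractions(reads)
```

Keeping the fractions separate per gene makes it possible to compare how consistently different markers place the same community.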
Uses of phylogenetic classification in metagenomics
  • Assigning reads to phylogenetic groups using multiple genes
  • Phylogenetic binning
  • Phylogenetic ecology - especially important if no reference genomes
Limited by poor genomic sampling in past
[Figure: "Sargasso Phylotypes" bar charts, Major Phylogenetic Group vs. Weighted % of Clones.]
Metagenomic Analysis Improves
  w/ Phylogenetic Sampling
• Small but real improvements in
 –Gene identification / confirmation
 –Functional prediction
 –Binning
 –Phylogenetic classification
• But not a lot ...
GEBA Future 1

Need to adapt genomic and metagenomic methods to make use of GEBA data
Phylogenetic Binning Using AMPHORA
AMPHORA - each read on its own tree
Improves with better phylogenetic methods
[Figure: bar chart by major phylogenetic group (Alphaproteobacteria through Unclassified Bacteria), value axis 0 to 0.7. The legend lists marker genes:]
frr
tsf
pgk
rplL
                                      rplF




                                      rplP

                                      rplT
                                      rplE
                                      infC




                                      rpsI
                                      rplS
                                      rplA
                                      rplB




                                      rplK
                                      rplC




                                      rpsJ
                                      rplN
                                      rplD




                                      rplM




                                      rpsE




                                      rpsS
                                      rpsB




                                      rpsK
                                      rpsC
                                      rpoB




                                      rpsM
                                      pyrG
                                      nusA
                                      dnaG




                                      rpmA




                                      smpB
Improving Phylogeny for Metagenomic Reads
• Examples using reference trees
  – AMPHORA (Wu and Eisen)
  – pplacer (Erick Matsen)
  – FastTree (Morgan Price)
• Variants
  – Use a concatenated alignment of markers, not just
    individual genes (Steven Kembel)
  – Apply to OTU identification, not just classification
    (Thomas Sharpton)
  – CoBinning: look for linkage among fragments/genes
    (Aaron Darling)
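The concatenated-alignment variant can be sketched in a few lines. This is a minimal illustration, not the AMPHORA code: it assumes each marker alignment is a dict mapping a taxon name to an aligned sequence, and the function name `concatenate_alignments` and the toy sequences are hypothetical.

```python
def concatenate_alignments(alignments, gap="-"):
    """Join per-marker alignments column-wise into one supermatrix.

    alignments: list of dicts {taxon: aligned_sequence}; within one dict
    every sequence must have the same length. A taxon missing from a
    marker is padded with gap characters so all rows end up equally long.
    """
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    rows = {taxon: [] for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a marker alignment differ in length")
        width = lengths.pop()
        for taxon in taxa:
            rows[taxon].append(aln.get(taxon, gap * width))
    return {taxon: "".join(parts) for taxon, parts in rows.items()}


# Toy data (made up): taxon B lacks the second marker.
rpoB = {"A": "MKT", "B": "MRT", "C": "MKS"}
rplB = {"A": "GG-A", "C": "GGTA"}
supermatrix = concatenate_alignments([rpoB, rplB])
```

Padding missing markers with gaps keeps every taxon in the matrix, which matters for metagenomic fragments that rarely cover all markers.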
AMPHORA - each read on its own tree
Improves with more gene families
[Figure: Phylogenetic Binning Using AMPHORA. Bar chart (axis ticks 0 to 0.7) with one series per marker gene (31 genes, frr through smpB), grouped by taxon (Alphaproteobacteria through Unclassified Bacteria).]
Identifying new markers
• Take all genomes
• All vs. all search
• Identify protein families
• For each family measure
  – Evenness in copy number
  – Universality
  – Phylogenetic congruence with WGT
  – Monophyly for superfamilies
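Two of these per-family measures can be sketched as follows. The slides do not define them, so the definitions here are assumptions: universality is taken as the percentage of genomes with at least one family member, and evenness as the percentage of those genomes that carry exactly one copy. The function name `family_stats` and the genome IDs are hypothetical.

```python
from collections import Counter


def family_stats(family_members, all_genomes):
    """Return (universality, evenness) as percentages for one protein family.

    family_members: list of genome IDs, one entry per protein in the family.
    all_genomes: every genome ID included in the all-vs-all comparison.
    Both definitions are assumptions made for this sketch.
    """
    copies = Counter(family_members)
    present = [g for g in all_genomes if copies[g] > 0]
    universality = 100.0 * len(present) / len(all_genomes)
    single_copy = sum(1 for g in present if copies[g] == 1)
    evenness = 100.0 * single_copy / len(present) if present else 0.0
    return universality, evenness


genomes = ["g1", "g2", "g3", "g4"]
# The family has one copy in g1 and g3, two copies in g2, none in g4.
stats = family_stats(["g1", "g2", "g2", "g3"], genomes)
```

Congruence and monophyly need trees rather than counts, so they are left out of this sketch.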
Distances between gene trees and the AMPHORA concatenated genome tree
[Figure: two bar-chart panels, one bar per gene family (ribosomal proteins such as rplA and rpsB, other proteins such as coaE, trmD, radA, and secY, plus 16S rRNA), listed in a different order in each panel. Left panel: NODAL distance (axis 0 to 6). Right panel: SPLIT distance (axis 0 to 0.9).]
Legend: AMPHORA marker; ribosomal protein; transcription/translation related protein; DNA repair protein; protein of other function
Reference: distance between the genome tree and 100 random trees (average ± standard deviation)
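The slide does not define SPLIT distance. A common reading, assumed here, is the fraction of bipartitions (splits) found in only one of the two trees, in the spirit of the Robinson-Foulds metric. The sketch below uses nested tuples as a stand-in for real tree files, and all function names are hypothetical.

```python
def leaf_set(tree):
    """All leaf names under a nested-tuple tree; a leaf is a plain string."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Non-trivial bipartitions of an unrooted tree, each stored as the
    side that does not contain the alphabetically first leaf."""
    all_leaves = leaf_set(tree)
    anchor = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        clade = leaf_set(node)
        side = all_leaves - clade if anchor in clade else clade
        # Keep only splits with at least two leaves on each side.
        if len(side) >= 2 and len(all_leaves) - len(side) >= 2:
            found.add(side)
        for child in node:
            visit(child)

    visit(tree)
    return found


def split_distance(tree_a, tree_b):
    """Fraction of splits present in only one tree (0.0 = same topology)."""
    if leaf_set(tree_a) != leaf_set(tree_b):
        raise ValueError("trees must share the same leaves")
    a, b = splits(tree_a), splits(tree_b)
    total = len(a) + len(b)
    return len(a ^ b) / total if total else 0.0


genome_tree = ((("A", "B"), "C"), ("D", "E"))
gene_tree = ((("A", "C"), "B"), ("D", "E"))
d = split_distance(genome_tree, gene_tree)
```

In this toy case the two trees share the split separating D and E but disagree on the other one, so half of their splits differ.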
Identifying new phylogenetic markers within phyla
• Take all genomes within a phylum
• All vs. all search
• Identify protein families
• For each family measure
  – Evenness in copy number
  – Universality
  – Phylogenetic congruence with WGT
  – Monophyly for superfamilies
Keep only the families with:
Universality * Evenness * Monophyly >= 90*90*90

      Phylogenetic group      Genome Number   Gene Number   Marker Candidates
      Archaea                 62              145415        102
      Actinobacteria          63              267783        136
      Alphaproteobacteria     94              347287        142
      Betaproteobacteria      56              266362        294
      Gammaproteobacteria     126             483632        141
      Deltaproteobacteria     25              102115        44
      Epsilonproteobacteria   18              33416         446
      Bacteroidetes           25              71531         179
      Chlamydiae              13              13823         561
      Chloroflexi             10              33577         140
      Cyanobacteria           36              124080        532
      Firmicutes              106             312309        80
      Spirochaetes            18              38832         72
      Thermi                  5               14160         727
      Thermotogae             9               17037         646
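The cutoff rule can be written out directly. This sketch assumes the three scores are percentages from 0 to 100, which the 90*90*90 form suggests but the slide does not state. The family names and scores are made up.

```python
THRESHOLD = 90 * 90 * 90  # 729,000, the cutoff given on the slide


def is_marker_candidate(universality, evenness, monophyly):
    """True when the product of the three percentage scores meets the cutoff."""
    return universality * evenness * monophyly >= THRESHOLD


# Hypothetical families with (universality, evenness, monophyly) scores.
families = {
    "famA": (100, 95, 100),
    "famB": (90, 90, 90),
    "famC": (100, 100, 70),
    "famD": (95, 85, 90),
}
candidates = [name for name, scores in families.items()
              if is_marker_candidate(*scores)]
```

Because the rule uses a product, a family can score slightly below 90 on one measure and still pass if the other two are high enough.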
AMPHORA - each read on its own tree
Other needs?
[Figure: Phylogenetic Binning Using AMPHORA. Bar chart (axis ticks 0 to 0.7) with one series per marker gene (31 genes, frr through smpB), grouped by taxon (Alphaproteobacteria through Unclassified Bacteria).]
Other Ways to Make Better Use of GEBA Data
• Rebuild protein family models
• Experiments from across the tree needed
• Need better phylogenies, including HGT
• Improved tools for using distantly related
  genomes in metagenomic analysis
• Better recording and sharing of metadata
  about organisms
GEBA Future 2

The dark matter of the biological universe
rRNA Tree of Life
[Figure: rRNA tree of life with three labeled domains: Bacteria, Archaea, Eukaryotes.]
Figure from Barton, Eisen et al., “Evolution”, CSHL Press. Based on tree from Pace NR, 2003.
Phylogenetic Diversity: Sequenced Bacteria & Archaea
          From Wu et al. 2009
Phylogenetic Diversity with GEBA
          From Wu et al. 2009
Phylogenetic Diversity: Isolates
          From Wu et al. 2009
Phylogenetic Diversity: All
          From Wu et al. 2009
Fantasy analysis of # PFAMs
GEBA Genomes
• PD/Genome: ~0.1
• PFAMs/Genome: ~1000
• PFAMs/PD: ~10,000
• Total PFAMs: ~10,000,000
From Wu et al. 2009
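The arithmetic behind these back-of-the-envelope figures can be checked in a few lines. The three inputs come from the slide. The total phylogenetic diversity at the end is inferred from them and is not stated on the slide. Variable names are illustrative.

```python
pd_per_genome = 0.1        # from the slide
pfams_per_genome = 1000    # from the slide
total_pfams = 10_000_000   # from the slide

# PFAMs per unit of phylogenetic diversity: about 10,000, matching the slide.
pfams_per_pd = pfams_per_genome / pd_per_genome

# Inferred, not on the slide: the total PD that yields the quoted total.
implied_total_pd = total_pfams / pfams_per_pd
```

Under these numbers, the projected total of about 10,000,000 PFAMs corresponds to roughly 1,000 units of PD.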
Conclusions

• Sequencing phylogenetically novel genomes
  has many benefits
• To obtain the most benefits, we need to change
  and adapt: computationally and
  experimentally
• Most of the phylogenetic diversity of microbes
  remains to be sampled
• Long live the Lake Arrowhead Microbial
  Genomes meeting
MICROBES
• GEBA
• A genomic encyclopedia of bacteria and archaea
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Solution: Really Fill in the Tree
[Figure: tree of bacterial phyla, labeled Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine Group A, WS3, Gemmatimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11.]
Eisen & Ward, PIs
Thanks
• Institutions: JGI etc, UC Davis, DSMZ, TIGR
• $$$$: DOE, NSF, GBMF
• People: Dongying Wu, Phil Hugenholtz, Nikos Kyrpides, Hans-Peter Klenk, Eddy Rubin
Figure from Barton, Eisen et al., “Evolution”, CSHL Press. Based on tree from Pace NR, 2003.

Eisen.lake arrowhead2010c

  • 1. The Importance of History (and other obsessions) Jonathan A. Eisen UC Davis Talk for Lake Arrowhead Microbial Genomes 2010 (#LAMG10)
  • 5. Evolution of Lake Arrowhead
  • 10. Homework • Do blastp search with other famous people associated with Lake Arrowhead Meeting • JEFFREYHMILLER • SARAHPALIN and her relationship to fungi B. fuckeliana • see http://phylogenomics.blogspot.com/2008/09/tracing-evolutionary-history-of-sarah.html
  • 11. 2010
  • 12. 2008
  • 13. 2006
  • 14. 2004
  • 17. 2002
  • 19. Quotes 2004 • Space-time continuum of genes and genomes • Gene sequences are the wormhole that allows one to tunnel into the past • The human mind can conceive of things with no basis in physical reality • Thoughts can go faster than the speed of light
  • 21. Quotes 2006 • The human guts are a real milieu of stuff • You better kiss everybody • Microbes not only have a lot of sex, they have a lot of weird sex • This is how you do metagenomics on 50 dollars, and that’s Canadian dollars
  • 22. Quotes 2008 • Antibiotics do not kill things, they corrupt them • There comes a point in life when you have to bring chemists into the picture • The rectal swabs are here in tan color • And there's Jeffrey Dahmer • We are the environment. We live the phenotype. • If I have time I will tell you about a dream • A paper came out next year
  • 23. Quotes 2010 • We have been using this word for many years without actually realizing it was correct • "Another thing you need to know" [pause] "Actually you don't NEED to know any of this" • I have been influenced by Fisher Price throughout my life • Don't take that away from us • It takes 1000 nanobiologists to make one microbiologist • I am going to wrap up as I hear the crickets chirping • And we will bring out the unused cheese from yesterday • In an engineering sense, the vagina is a simple plug flow reactor • This is going to be ironic coming from someone who studies circumcision • A little bit about time, but I am going to spend a lot less time on time than on space
  • 24. Keywords I remember from 2010 • Penis • Vagina • Anthrax • Acne • Ulcer (multiple kinds) • Global warming • Antibiotic resistance • Virulence
  • 27. rRNA Tree of Life Bacteria Archaea Eukaryotes Figure from Barton, Eisen et al. “Evolution”, CSHL Press. Based on tree from Pace NR, 2003.
  • 28. 2002 • At least 40 phyla of bacteria [Figure: tree of bacterial phyla with tip labels Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine GroupA, WS3, Gemmimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11] Based on Hugenholtz, 2002
  • 29. 2002 • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla [Figure: same phylum tree as slide 28] Based on Hugenholtz, 2002
  • 30. 2002 • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled [Figure: same phylum tree as slide 28] Based on Hugenholtz, 2002
  • 31. 2002 • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled [Figure: same phylum tree as slide 28] Based on Hugenholtz, 2002
  • 32. Why Increase Phylogenetic Coverage? • Common approach within some eukaryotic groups (FGP, NHGRI, etc) • Many successful small projects to fill in bacterial or archaeal gaps • Phylogenetic gaps in bacterial and archaeal projects commonly lamented in literature • Many potential benefits
  • 33. NSF-funded Tree of Life Project • A genome from each of eight phyla • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Solution I: sequence more phyla [Figure: same phylum tree as slide 28] Eisen & Ward, PIs
  • 35. NSF-funded Tree of Life Project • A genome from each of eight phyla • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Still highly biased in terms of the tree [Figure: same phylum tree as slide 28] Eisen & Ward, PIs
  • 36. Major Lineages of Actinobacteria
      2.5 Actinobacteria
        2.5.1 Acidimicrobidae
          2.5.1.1 Unclassified
          2.5.1.2 "Microthrixineae"
          2.5.1.3 Acidimicrobineae
            2.5.1.3.1 Unclassified
            2.5.1.3.2 Acidimicrobiaceae
          2.5.1.4 BD2-10
          2.5.1.5 EB1017
        2.5.2 Actinobacteridae
          2.5.2.1 Unclassified
          2.5.2.10 Ellin306/WR160
          2.5.2.11 Ellin5012
          2.5.2.12 Ellin5034
          2.5.2.13 Frankineae
            2.5.2.13.1 Unclassified
            2.5.2.13.2 Acidothermaceae
            2.5.2.13.3 Ellin6090
            2.5.2.13.4 Frankiaceae
            2.5.2.13.5 Geodermatophilaceae
            2.5.2.13.6 Microsphaeraceae
            2.5.2.13.7 Sporichthyaceae
          2.5.2.14 Glycomyces
          2.5.2.15 Intrasporangiaceae
            2.5.2.15.1 Unclassified
            2.5.2.15.2 Dermacoccus
            2.5.2.15.3 Intrasporangiaceae
          2.5.2.16 Kineosporiaceae
          2.5.2.17 Microbacteriaceae
            2.5.2.17.1 Unclassified
            2.5.2.17.2 Agrococcus
            2.5.2.17.3 Agromyces
          2.5.2.18 Micrococcaceae
          2.5.2.19 Micromonosporaceae
          2.5.2.2 Actinomyces
          2.5.2.20 Propionibacterineae
            2.5.2.20.1 Unclassified
            2.5.2.20.2 Kribbella
            2.5.2.20.3 Nocardioidaceae
            2.5.2.20.4 Propionibacteriaceae
          2.5.2.21 Pseudonocardiaceae
          2.5.2.22 Streptomycineae
            2.5.2.22.1 Unclassified
            2.5.2.22.2 Kitasatospora
            2.5.2.22.3 Streptacidiphilus
          2.5.2.23 Streptosporangineae
            2.5.2.23.1 Unclassified
            2.5.2.23.2 Ellin5129
            2.5.2.23.3 Nocardiopsaceae
            2.5.2.23.4 Streptosporangiaceae
            2.5.2.23.5 Thermomonosporaceae
          2.5.2.3 Actinomycineae
          2.5.2.4 Actinosynnemataceae
          2.5.2.5 Bifidobacteriaceae
          2.5.2.6 Brevibacteriaceae
          2.5.2.7 Cellulomonadaceae
          2.5.2.8 Corynebacterineae
            2.5.2.8.1 Unclassified
            2.5.2.8.2 Corynebacteriaceae
            2.5.2.8.3 Dietziaceae
            2.5.2.8.4 Gordoniaceae
            2.5.2.8.5 Mycobacteriaceae
            2.5.2.8.6 Rhodococcus
            2.5.2.8.7 Rhodococcus
            2.5.2.8.8 Rhodococcus
          2.5.2.9 Dermabacteraceae
            2.5.2.9.1 Unclassified
            2.5.2.9.2 Brachybacterium
            2.5.2.9.3 Dermabacter
        2.5.3 Coriobacteridae
          2.5.3.1 Unclassified
          2.5.3.2 Atopobiales
          2.5.3.3 Coriobacteriales
          2.5.3.4 Eggerthellales
        2.5.4 OPB41
        2.5.5 PK1
        2.5.6 Rubrobacteridae
          2.5.6.1 Unclassified
          2.5.6.2 "Thermoleiphilaceae"
            2.5.6.2.1 Unclassified
            2.5.6.2.2 Conexibacter
            2.5.6.2.3 XGE514
          2.5.6.3 MC47
          2.5.6.4 Rubrobacteraceae
  • 37. NSF-funded Tree of Life Project • A genome from each of eight phyla • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Same trend in Archaea [Figure: same phylum tree as slide 28] Eisen & Ward, PIs
  • 38. NSF-funded Tree of Life Project • A genome from each of eight phyla • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Same trend in Eukaryotes [Figure: same phylum tree as slide 28] Eisen & Ward, PIs
  • 39. NSF-funded Tree of Life Project • A genome from each of eight phyla • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Same trend in Viruses [Figure: same phylum tree as slide 28] Eisen & Ward, PIs
  • 40. GEBA • A genomic encyclopedia of bacteria and archaea • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Solution: Really Fill in the Tree [Figure: same phylum tree as slide 28] Eisen & Ward, PIs
  • 41. GEBA Pilot Project Overview • Identify major branches in rRNA tree for which no genomes are available • Identify those with a cultured representative in DSMZ • DSMZ grew > 200 of these and prepped DNA • Sequence and finish 100+ (covering breadth of bacterial/archaeal diversity) • Annotate, analyze, release data • Assess benefits of tree-guided sequencing • 1st paper: Wu et al. in Nature, Dec 2009
  • 42. GEBA Pilot Project: Components • Project overview (Phil Hugenholtz, Nikos Kyrpides, Jonathan Eisen, Eddy Rubin, Jim Bristow, Tanya Woyke) • Project management (David Bruce, Eileen Dalin, Lynne Goodwin) • Culture collection and DNA prep (DSMZ, Hans-Peter Klenk) • Sequencing and closure (Eileen Dalin, Susan Lucas, Alla Lapidus, Mat Nolan, Alex Copeland, Cliff Han, Feng Chen, Jan-Fang Cheng) • Annotation and data release (Nikos Kyrpides, Victor Markowitz, et al) • Analysis (Dongying Wu, Kostas Mavrommatis, Martin Wu, Victor Kunin, Neil Rawlings, Ian Paulsen, Patrick Chain, Patrik D’Haeseleer, Sean Hooper, Iain Anderson, Amrita Pati, Natalia N. Ivanova, Athanasios Lykidis, Adam Zemla) • Adopt a microbe education project (Cheryl Kerfeld) • Outreach (David Gilbert) • $$$ (DOE, DSMZ, GBMF)
  • 43. GEBA and Openness • All data released as quickly as possible w/ no restrictions to IMG-GEBA, GenBank, etc. • Data also available in Biotorrents (http://biotorrents.net) • Individual genome reports published in OA “Standards in Genomic Sciences (SIGS)” • 1st GEBA paper in Nature freely available and published using Creative Commons License
  • 44. GEBA Lesson 1 rRNA Tree is Useful for Identifying Phylogenetically Novel Organisms
  • 45. rRNA Tree of Life Bacteria Archaea Eukaryotes Figure from Barton, Eisen et al. “Evolution”, CSHL Press. Based on tree from Pace NR, 2003.
  • 46. Network of Life? Bacteria Archaea Eukaryotes Figure from Barton, Eisen et al. “Evolution”, CSHL Press. Based on tree from Pace NR, 2003.
  • 47. Compare PD (phylogenetic diversity) in rRNA tree and WGT (whole-genome tree)
  • 48. PD of rRNA, Genome Trees Similar From Wu et al. 2009 Nature 462, 1056-1060
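The PD comparison on these slides can be illustrated with a small sketch. It assumes PD means the sum of branch lengths on the paths connecting a set of taxa to the root (one common definition); the slides do not spell out which definition was used. The tree encoding, function name, and toy branch lengths are all hypothetical.

```python
from typing import Dict, Iterable, Tuple


def phylogenetic_diversity(parent: Dict[str, Tuple[str, float]],
                           tips: Iterable[str]) -> float:
    """Sum the branch lengths on the union of tip-to-root paths.

    `parent` maps each non-root node to (parent_node, branch_length).
    This rooted definition of PD is an assumption made for illustration.
    """
    counted = set()  # nodes whose branch to the parent is already summed
    total = 0.0
    for tip in tips:
        node = tip
        # Walk up until the root or an already-counted branch is reached.
        while node in parent and node not in counted:
            up, length = parent[node]
            total += length
            counted.add(node)
            node = up
    return total


# Toy tree (hypothetical): root -> X -> {A, B}, root -> C
toy_tree = {
    "A": ("X", 1.0),
    "B": ("X", 2.0),
    "X": ("root", 0.5),
    "C": ("root", 3.0),
}
pd_close_pair = phylogenetic_diversity(toy_tree, ["A", "B"])
pd_with_distant = phylogenetic_diversity(toy_tree, ["A", "C"])
```

In the toy tree, adding the distant taxon C to A contributes more PD than adding A's close relative B, which is the intuition behind selecting genomes from unsampled branches.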
  • 49. GEBA Lesson 2 Phylogeny-driven genome selection helps discover new genetic diversity
  • 50. Network of Life? Bacteria Archaea Eukaryotes Figure from Barton, Eisen et al. “Evolution”, CSHL Press. Based on tree from Pace NR, 2003.
  • 51. Protein Family Rarefaction Curves • Take data set of multiple complete genomes • Identify all protein families using MCL • Plot # of genomes vs. # of protein families
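The three steps on this slide can be sketched as follows. This is a minimal illustration, assuming the protein families have already been identified (for example by MCL) and that each genome is given as a set of family IDs. The function name, the averaging over random genome orders, and the toy data are assumptions, not the procedure used in the GEBA paper.

```python
import random
from typing import Dict, List, Set


def rarefaction_curve(genome_families: Dict[str, Set[str]],
                      permutations: int = 100,
                      seed: int = 0) -> List[float]:
    """Mean number of distinct protein families after adding 1..N genomes.

    Genomes are added in random order and the counts are averaged over
    several random orders to smooth the curve.
    """
    rng = random.Random(seed)
    genomes = sorted(genome_families)
    totals = [0.0] * len(genomes)
    for _ in range(permutations):
        order = genomes[:]
        rng.shuffle(order)
        seen: Set[str] = set()
        for i, genome in enumerate(order):
            seen |= genome_families[genome]
            totals[i] += len(seen)
    return [t / permutations for t in totals]


# Toy input: hypothetical genomes and family IDs, for illustration only.
toy_genomes = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam2", "fam3", "fam4"},
    "genome_C": {"fam5"},
}
curve = rarefaction_curve(toy_genomes)
```

Plotting `curve` against the number of genomes gives the rarefaction curve; a curve that keeps rising indicates that each added genome still contributes new families.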
  • 52.
  • 53.
  • 54.
  • 55.
  • 56.
  • 58. Phylogenetic Distribution Novelty: Bacterial Actin Related Protein [Figure: phylogenetic tree of actin-related proteins, including Haliangium ochraceum DSM 14365; taxon labels not recoverable] Patrik D’Haeseleer, Adam Zemla, Victor Kunin. See also Guljamow et al. 2007 Current Biology.
  • 59. GEBA Lesson 3 Phylogeny-driven genome selection improves genome annotation
  • 60. Most/All Functional Prediction Improves w/ Better Phylogenetic Sampling • Took 56 GEBA genomes and compared results vs. 56 randomly sampled new genomes • Better definition of protein family sequence “patterns” • Greatly improves “comparative” and “evolutionary” based predictions • Conversion of hypothetical into conserved hypotheticals • Linking distantly related members of protein families • Improved non-homology prediction • Kostas Mavrommatis, Natalia Ivanova, Thanos Lykidis, Nikos Kyrpides, Iain Anderson
  • 61. GEBA Lesson 4 Metadata and individual genome papers important
  • 63. GEBA Lesson 5 Phylogeny-driven genome selection improves analysis of metagenome data
  • 64. Uses of phylogenetic classification in metagenomics • Assigning reads to phylogenetic groups using multiple genes • Phylogenetic binning • Phylogenetic ecology - especially important if no reference genomes [Figures: Sargasso Phylotypes, weighted % of clones by major phylogenetic group; classification by phylogenetic group for each marker gene: frr, tsf, pgk, rplL, rplF, rplP, rplT, rplE, infC, rpsI, rplS, rplA, rplB, rplK, rplC, rpsJ, rplN, rplD, rplM, rpsE, rpsS, rpsB, rpsK, rpsC, rpoB, rpsM, pyrG, nusA, dnaG, rpmA, smpB]
  • 65. Uses of phylogenetic classification in metagenomics • Assigning reads to phylogenetic groups using multiple genes • Phylogenetic binning • Phylogenetic ecology - especially important if no reference genomes • Limited by poor genomic sampling in past [Figures: Sargasso Phylotypes, weighted % of clones by major phylogenetic group; classification by phylogenetic group for each marker gene: frr, tsf, pgk, rplL, rplF, rplP, rplT, rplE, infC, rpsI, rplS, rplA, rplB, rplK, rplC, rpsJ, rplN, rplD, rplM, rpsE, rpsS, rpsB, rpsK, rpsC, rpoB, rpsM, pyrG, nusA, dnaG, rpmA, smpB]
  • 66. Metagenomic Analysis Improves w/ Phylogenetic Sampling • Small but real improvements in –Gene identification / confirmation –Functional prediction –Binning –Phylogenetic classification
  • 67. Metagenomic Analysis Improves w/ Phylogenetic Sampling • Small but real improvements in –Gene identification / confirmation –Functional prediction –Binning –Phylogenetic classification • But not a lot ...
  • 68. GEBA Future 1 Need to adapt genomic and metagenomic methods to make use of GEBA data
  • 69. Phylogenetic Binning Using AMPHORA • AMPHORA - each read on its own tree • Improves with better phylogenetic methods [Figure: classification by phylogenetic group (Alphaproteobacteria through Unclassified Bacteria), scale 0 to 0.7, for each marker gene: frr, tsf, pgk, rplL, rplF, rplP, rplT, rplE, infC, rpsI, rplS, rplA, rplB, rplK, rplC, rpsJ, rplN, rplD, rplM, rpsE, rpsS, rpsB, rpsK, rpsC, rpoB, rpsM, pyrG, nusA, dnaG, rpmA, smpB]
  • 70. Improving Phylogeny for Metagenomic Reads • Examples using reference trees – AMPHORA (Wu and Eisen) – pplacer (Erick Matsen) – FastTree (Morgan Price) • Variants – Use concatenated alignment of markers, not just individual genes (Steven Kembel) – Apply to OTU identification, not just classification (Thomas Sharpton) – CoBinning: look for linkage among fragments/genes (Aaron Darling)
  • 71. Phylogenetic Binning Using AMPHORA • AMPHORA - each read on its own tree • Improves with more gene families [Figure: classification by phylogenetic group (Alphaproteobacteria through Unclassified Bacteria), scale 0 to 0.7, for each marker gene: frr, tsf, pgk, rplL, rplF, rplP, rplT, rplE, infC, rpsI, rplS, rplA, rplB, rplK, rplC, rpsJ, rplN, rplD, rplM, rpsE, rpsS, rpsB, rpsK, rpsC, rpoB, rpsM, pyrG, nusA, dnaG, rpmA, smpB]
  • 72. Identifying new markers • Take all genomes • All vs. all search • Identify protein families • For each family measure –Evenness in copy number –Universality –Phylogenetic congruence with WGT –Monophyly for superfamilies
  • 73. Distances between gene trees and the AMPHORA concatenated genome tree [Figure: two panels of per-gene distances (ribosomal protein genes, 16S rRNA, and others); axes: NODAL distance (0 to 6) and SPLIT distance (0 to 0.9); legend: AMPHORA marker, Ribosomal protein, Transcription/translation related protein, DNA repair protein, Protein of other function; reference: distance between the genome tree and 100 random trees (average ± standard deviation)]
  • 74. Identifying new phylogenetic markers within phyla • Take all genomes within a phylum • All vs. all search • Identify protein families • For each family measure –Evenness in copy number –Universality –Phylogenetic congruence with WGT –Monophyly for superfamilies
  • 75. Keep only the families with: Universality * Evenness * Monophyly >= 90*90*90
      Phylogenetic group      Genome Number   Gene Number   Marker Candidates
      Archaea                            62        145415                 102
      Actinobacteria                     63        267783                 136
      Alphaproteobacteria                94        347287                 142
      Betaproteobacteria                 56        266362                 294
      Gammaproteobacteria               126        483632                 141
      Deltaproteobacteria                25        102115                  44
      Epsilonproteobacteria              18         33416                 446
      Bacteroides                        25         71531                 179
      Chlamydiae                         13         13823                 561
      Chloroflexi                        10         33577                 140
      Cyanobacteria                      36        124080                 532
      Firmicutes                        106        312309                  80
      Spirochaetes                       18         38832                  72
      Thermi                              5         14160                 727
      Thermotogae                         9         17037                 646
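The filter on this slide can be sketched as below. The slides give only the threshold, so the score definitions here are assumptions: universality is taken as the percentage of genomes that contain the family, evenness as the percentage of those genomes holding exactly one copy, and monophyly as a precomputed percentage supplied by the caller. All function names and the toy copy counts are hypothetical.

```python
from typing import Dict

# Threshold from the slide; scores are assumed to be percentages (0-100).
THRESHOLD = 90 * 90 * 90


def universality(copies: Dict[str, int], n_genomes: int) -> float:
    """Percent of genomes in the group that contain the family (assumed definition)."""
    present = sum(1 for c in copies.values() if c > 0)
    return 100.0 * present / n_genomes


def evenness(copies: Dict[str, int]) -> float:
    """Percent of family-containing genomes with exactly one copy (assumed definition)."""
    present = [c for c in copies.values() if c > 0]
    if not present:
        return 0.0
    single = sum(1 for c in present if c == 1)
    return 100.0 * single / len(present)


def is_marker_candidate(copies: Dict[str, int], n_genomes: int,
                        monophyly_pct: float) -> bool:
    """Apply the slide's product threshold to the three scores."""
    score = universality(copies, n_genomes) * evenness(copies) * monophyly_pct
    return score >= THRESHOLD


# Toy data: 20 hypothetical genomes, copy number of one family per genome.
single_copy_family = {f"g{i}": 1 for i in range(19)}   # missing from one genome
duplicated_family = {f"g{i}": (2 if i < 8 else 1) for i in range(20)}

keeps_single = is_marker_candidate(single_copy_family, 20, monophyly_pct=100.0)
keeps_duplicated = is_marker_candidate(duplicated_family, 20, monophyly_pct=100.0)
```

Using a product rather than three separate cutoffs lets a family that is slightly weak on one score pass if it is strong on the others.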
  • 76. Phylogenetic Binning Using AMPHORA • AMPHORA - each read on its own tree • Other needs? [Figure: classification by phylogenetic group (Alphaproteobacteria through Unclassified Bacteria), scale 0 to 0.7, for each marker gene: frr, tsf, pgk, rplL, rplF, rplP, rplT, rplE, infC, rpsI, rplS, rplA, rplB, rplK, rplC, rpsJ, rplN, rplD, rplM, rpsE, rpsS, rpsB, rpsK, rpsC, rpoB, rpsM, pyrG, nusA, dnaG, rpmA, smpB]
  • 77. Other Ways to Make Better Use of GEBA Data • Rebuild protein family models • Experiments from across the tree needed • Need better phylogenies, including HGT • Improved tools for using distantly related genomes in metagenomic analysis • Better recording and sharing of metadata about organisms
  • 78. GEBA Future 2 The dark matter of the biological universe
  • 79. rRNA Tree of Life Bacteria Archaea Eukaryotes Figure from Barton, Eisen et al. “Evolution”, CSHL Press. Based on tree from Pace NR, 2003.
  • 80. Phylogenetic Diversity: Sequenced Bacteria & Archaea From Wu et al. 2009
  • 81. Phylogenetic Diversity with GEBA From Wu et al. 2009
  • 82. Phylogenetic Diversity: Isolates From Wu et al. 2009
  • 83. Phylogenetic Diversity: All From Wu et al. 2009
  • 84. Fantasy analysis of # PFAMs • GEBA Genomes: PD/Genome ~0.1; PFAMs/Genome ~1000; PFAMs/PD ~10000 • Total PFAMs ~10,000,000 • From Wu et al. 2009
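The back-of-envelope arithmetic on this slide can be checked with a short sketch. The three per-genome and total figures are taken from the slide. The implied total PD and implied genome count are derived here and are not stated on the slide, so treat them as an assumption about how the estimate was reached.

```python
# Figures from the slide (all approximate).
pd_per_genome = 0.1        # phylogenetic diversity added per GEBA genome
pfams_per_genome = 1000    # PFAMs found per genome
total_pfams = 10_000_000   # the slide's "fantasy" total

# PFAMs per unit of PD follows from the first two figures (about 10,000).
pfams_per_pd = pfams_per_genome / pd_per_genome

# Derived, not stated on the slide: the total PD the estimate implies
# (about 1,000) and how many genomes at ~0.1 PD each would be needed
# to cover that much PD (about 10,000).
implied_total_pd = total_pfams / pfams_per_pd
implied_genomes = implied_total_pd / pd_per_genome
```

The estimate is a linear extrapolation, so it assumes new PFAMs keep accumulating at a constant rate per unit of PD, which the slide itself flags as fantasy.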
  • 85. Conclusions • Sequencing phylogenetically novel genomes has many benefits • To obtain the most benefits, we need to change and adapt: computationally and experimentally • Most of the phylogenetic diversity of microbes remains to be sampled • Long live the Lake Arrowhead Microbial Genomes meeting
  • 86.
  • 88. GEBA: A genomic encyclopedia of bacteria and archaea • At least 40 phyla of bacteria • Genome sequences are mostly from three phyla • Some other phyla are only sparsely sampled • Solution: Really Fill in the Tree • Eisen & Ward, PIs [Figure: tree of bacterial phyla: Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine GroupA, WS3, Gemmimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11]
  • 89. Thanks • Institutions: JGI, UC Davis, DSMZ, TIGR, etc • $$$$: DOE, NSF, GBMF • People: Dongying Wu, Phil Hugenholtz, Nikos Kyrpides, Hans-Peter Klenk, Eddy Rubin • Figure from Barton, Eisen et al. “Evolution”, CSHL Press. Based on tree from Pace NR, 2003.

Editor's notes

  1. Gets better with more markers, but we do not have lots of sequences for these markers. We can get them from genomes. The more diverse the genomes, the better the marker set will be.