5. As of 2002
• At least 40 phyla of bacteria
[Figure: rRNA tree of bacterial phyla, based on Hugenholtz, 2002. Labeled lineages: Proteobacteria, TM6, OS-K, Acidobacteria, Termite Group, OP8, Nitrospira, Bacteroides, Chlorobi, Fibrobacteres, Marine Group A, WS3, Gemmimonas, Firmicutes, Fusobacteria, Actinobacteria, OP9, Cyanobacteria, Synergistes, Deferribacteres, Chrysiogenetes, NKB19, Verrucomicrobia, Chlamydia, OP3, Planctomycetes, Spirochaetes, Coprothermobacter, OP10, Thermomicrobia, Chloroflexi, TM7, Deinococcus-Thermus, Dictyoglomus, Aquificae, Thermodesulfobacteria, Thermotogae, OP1, OP11]
6. As of 2002
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
[Figure: rRNA tree of bacterial phyla, based on Hugenholtz, 2002; same labeled lineages as slide 5]
7. As of 2002
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
[Figure: rRNA tree of bacterial phyla, based on Hugenholtz, 2002; same labeled lineages as slide 5]
8. As of 2002
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Some other phyla are only sparsely sampled
• Same trend in Archaea
[Figure: rRNA tree of bacterial phyla, based on Hugenholtz, 2002; same labeled lineages as slide 5]
9. Need for Tree Guidance Well Established
• Common approach within some eukaryotic groups
• Many small projects funded to fill in some bacterial or archaeal gaps
• Phylogenetic gaps in bacterial and archaeal projects commonly lamented in literature
10.
• At least 100 phyla of bacteria
• Genome sequences are mostly from three phyla
• Most phyla with cultured species are sparsely sampled
• Lineages with no cultured taxa even more poorly sampled
• Solution: use tree to really fill gaps
[Figure: tree of bacterial phyla with the same labeled lineages as slide 5. Legend: Well sampled phyla]
12. GEBA Pilot Project Overview
• Identify major branches in rRNA tree for which no genomes are available
• Identify a cultured representative for each group
• Grow > 200 of these and prepare DNA
• Sequence and finish 100
• Annotate, analyze, release data
• Assess benefits of tree-guided sequencing
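The first two steps above (find branches with no genome, then pick a cultured representative) can be illustrated with a small greedy selection. This is only a sketch under stated assumptions, not the procedure the GEBA project actually used: each organism is described by a hypothetical lineage tuple (root to tip) standing in for its position in the rRNA tree, and the names `shared_depth`, `pick_targets`, and the `cand_*` labels are invented for illustration.

```python
def shared_depth(lineage_a, lineage_b):
    """Number of leading clades two lineages have in common."""
    depth = 0
    for clade_a, clade_b in zip(lineage_a, lineage_b):
        if clade_a != clade_b:
            break
        depth += 1
    return depth


def pick_targets(candidates, sequenced, k):
    """Greedily pick up to k candidates that branch off closest to the root
    relative to everything already covered (sequenced or already picked).

    candidates: dict mapping a candidate name to its lineage tuple
    sequenced:  list of lineage tuples for genomes that already exist
    """
    covered = list(sequenced)
    remaining = dict(candidates)
    picks = []
    while remaining and len(picks) < k:
        def closeness(name):
            # Deepest split shared with any covered lineage; lower = more novel.
            return max(
                (shared_depth(remaining[name], c) for c in covered),
                default=0,
            )
        # Sorting first makes tie-breaking deterministic (alphabetical).
        best = min(sorted(remaining), key=closeness)
        picks.append(best)
        covered.append(remaining.pop(best))
    return picks


if __name__ == "__main__":
    # Hypothetical toy data: one sequenced genome and four cultured candidates.
    sequenced = [("Bacteria", "Proteobacteria", "class_g")]
    candidates = {
        "cand_a": ("Bacteria", "Proteobacteria", "class_a"),
        "cand_b": ("Bacteria", "Chloroflexi", "class_x"),
        "cand_c": ("Bacteria", "Chloroflexi", "class_y"),
        "cand_d": ("Bacteria", "Thermotogae", "class_z"),
    }
    print(pick_targets(candidates, sequenced, 2))
```

Because each pick is added to the covered set, the second pick avoids a phylum that the first pick has just filled. A real analysis would use branch lengths from the tree rather than shared clade counts.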
13. GEBA Pilot Project: Components
• Project overview (Phil Hugenholtz, Nikos Kyrpides, Jonathan Eisen, Eddy Rubin, Jim Bristow)
• Project management (David Bruce, Eileen Dalin, Lynne Goodwin)
• Culture collection and DNA prep (DSMZ, Hans-Peter Klenk)
• Sequencing and closure (Eileen Dalin, Susan Lucas, Alla Lapidus, Mat Nolan, Alex Copeland, Cliff Han, Feng Chen, Jan-Fang Cheng)
• Annotation and data release (Nikos Kyrpides, Victor Markowitz, et al.)
• Analysis (Dongying Wu, Kostas Mavrommatis, Martin Wu, Victor Kunin, Neil Rawlings, Ian Paulsen, Patrick Chain, Patrik D'haeseleer, Sean Hooper, Iain Anderson, Amrita Pati, Natalia N. Ivanova, Athanasios Lykidis, Adam Zemla)
• Adopt a microbe education project (Cheryl Kerfeld)
• Outreach (David Gilbert)
• $$$ (DOE, Eddy Rubin, Jim Bristow)
18. Most/All Functional Prediction Improves w/ Better Phylogenetic Sampling
• Better definition of protein family sequence "patterns"
• Greatly improves "comparative" and "evolutionary" based predictions
• Conversion of hypotheticals into conserved hypotheticals
• Linking distantly related members of protein families
• Improved non-homology prediction
Credits: Kostas Mavrommatis, Natalia Ivanova, Thanos Lykidis, Nikos Kyrpides, Iain Anderson
22. Metagenomic Analysis Improves
• Small but real improvement in metagenomic annotation and analysis
Credits: Sean Hooper, Amrita Pati
23. GEBA Lesson 4
We have still only scratched the
surface of microbial diversity
24. Protein Family Rarefaction Curves
• Take data set of multiple complete genomes
• Identify all protein families using MCL
• Plot # of genomes vs. # of protein families
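The plotting step above can be sketched as follows, assuming the MCL clustering has already been run and reduced to a mapping from each genome to the set of protein family identifiers it contains. The function name `rarefaction_curve`, the input layout, and the toy data are assumptions made for illustration; the slides do not specify how the curves were computed or how genome order was handled, so averaging over random orderings is one common choice rather than the authors' stated method.

```python
import random


def rarefaction_curve(genome_families, n_permutations=100, seed=0):
    """Mean number of distinct protein families seen after adding 1..N genomes.

    genome_families: dict mapping a genome name to a set of family IDs
                     (e.g. derived from MCL cluster membership).
    Returns a list whose entry i is the average family count over random
    genome orderings after i + 1 genomes have been added.
    """
    rng = random.Random(seed)
    genomes = list(genome_families)
    totals = [0] * len(genomes)
    for _ in range(n_permutations):
        rng.shuffle(genomes)
        seen = set()
        for i, genome in enumerate(genomes):
            seen |= genome_families[genome]  # accumulate families seen so far
            totals[i] += len(seen)
    return [total / n_permutations for total in totals]


if __name__ == "__main__":
    # Hypothetical toy input: three genomes and four families in total.
    toy = {
        "genome_1": {"famA", "famB"},
        "genome_2": {"famB", "famC"},
        "genome_3": {"famD"},
    }
    for n_genomes, n_families in enumerate(rarefaction_curve(toy), start=1):
        print(n_genomes, n_families)
```

The resulting (number of genomes, number of families) pairs can be passed to any plotting tool. A curve that keeps climbing instead of flattening indicates that additional genomes are still contributing new families.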
30. Phylogenetic Distribution Novelty: 1st Bacterial Actin Related Protein
Haliangium ochraceum DSM 14365
Credits: Victor Kunin, Patrik D'haeseleer, Adam Zemla
34.
• At least 40 phyla of bacteria
• Genome sequences are mostly from three phyla
• Most phyla with cultured species are sparsely sampled
• Lineages with no cultured taxa even more poorly sampled
[Figure: tree of bacterial phyla with the same labeled lineages as slide 5. Legend: Well sampled phyla; Poorly sampled; No cultured taxa]
35. Uncultured Lineages: Technical Approaches
• Get into culture
• Enrichment cultures
• If abundant in low diversity ecosystems
• Flow sorting
• Microbeads
• Microfluidic sorting
• Single cell amplification
42. SIGS
• The Genomic Standards Consortium
• The GSC is an open-membership working body which formed in September 2005.
• The goal of this international community is to promote mechanisms that standardize the description of genomes and the exchange and integration of genomic data.
• See http://gensc.org/gc_wiki/index.php/Main_Page