2. An introduction to the Coffea Genus
The Coffea genus belongs to the Rubiaceae family
Fourth-largest angiosperm family: 650 genera and 13,000 species
Other known genera in the family: Gardenia, Cinchona (quinine), Rubia (madder), Ixora
3. An introduction to the Coffea Genus
The Coffea genus has recently been enlarged by the addition of the former genus Psilanthus. This "new" enlarged genus contains 124 described species originating from Africa, Madagascar and other Indian Ocean islands, Asia and Australia.
From Davis et al. Botanical Journal of the Linnean Society, 2011, 167: 357–377.
4. An introduction to the Coffea Genus
The Coffea genus is very diverse: it includes the former subgenus Coffea, the Baracoffea alliance and the former genus Psilanthus, which was itself divided into 2 subgenera.
The subgenus Coffea is divided into 3 botanical sections:
The Eucoffea, found in West and Central Africa
The Mozambicoffea, found in East Africa
The Mascarocoffea, found in Madagascar and some Indian Ocean islands.
[Map labels: Mozambicoffea, Eucoffea, Mascarocoffea]
5. An introduction to the Coffea Genus
The Baracoffea alliance is found exclusively in western Madagascar.
The ex-Psilanthus species are more widely distributed: they originate from Africa, Madagascar, Asia and Australia.
6. An introduction to the Coffea Genus
C. arabica is the sole tetraploid (2n=4x=44) of the genus and one of the rare self-fertile species. All the others are diploid (2n=2x=22) and almost all are allogamous.
C. arabica is an allotetraploid resulting from a spontaneous hybridization between C. canephora and a wild East African species, C. eugenioides. It is a recent event (< 0.6 Mya).
C. canephora ♂ x C. eugenioides ♀ -> C. arabica
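The chromosome counts quoted on this slide follow from a base number of x = 11. The short sketch below only restates that arithmetic; the constant and function names are illustrative and do not come from the slides.

```python
# Base chromosome number x, implied by 2n = 2x = 22 and 2n = 4x = 44.
BASE_CHROMOSOME_NUMBER = 11


def somatic_chromosome_count(ploidy: int, base: int = BASE_CHROMOSOME_NUMBER) -> int:
    """Return 2n for a plant carrying `ploidy` copies of the base genome."""
    return ploidy * base


diploid = somatic_chromosome_count(2)     # e.g. C. canephora, C. eugenioides
tetraploid = somatic_chromosome_count(4)  # C. arabica

# An allotetraploid carries one full diploid set from each parent species.
allotetraploid = diploid + diploid

print(diploid, tetraploid, allotetraploid == tetraploid)
```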
7. An introduction to the Coffea Genus
The Coffea genus has a large phenotypic diversity
[Photo labels, species (region code):]
C. macrocarpa (Mas)
C. pterocarpa (Mad)
C. liberica (WA)
C. brevipes (W/CA)
C. congensis (W/CA)
C. eugenioides (EA)
C. millotii (Mad)
C. racemosa (EA)
C. kapakata (W/S)
C. pseudozanguebariae (EA)
C. arabica (EA)
C. liberica var. Koto (W/CA)
8. Coffee economic importance
Out of the 124 species, only 2 are widely cultivated: C. arabica (Arabica) and C. canephora (Robusta), at 65-70% and 35-30% respectively.
Second most important trade product exported by Southern countries (after oil).
400 billion cups of coffee are drunk every year; about 12,000 each second.
Grown all over the world in intertropical regions
[Map legend: Robusta, Both, Arabica]
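The per-second figure can be cross-checked from the yearly one. This is a quick arithmetic sketch, assuming a 365-day year; it gives a value slightly above the rounded 12,000 quoted on the slide.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a non-leap year

cups_per_year = 400e9  # yearly figure quoted on the slide
cups_per_second = cups_per_year / SECONDS_PER_YEAR

print(f"{cups_per_second:,.0f} cups per second")
```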
10. Molecular markers
Molecular markers are used for:
Identifying the genetic diversity of populations / species
Establishing the genetic structure of populations / species
Identifying species / individuals (fingerprinting, barcoding)
Establishing genetic maps
The most used markers nowadays are:
SSR: Simple Sequence Repeats = microsatellites
SNP: Single Nucleotide Polymorphism
Both have known sequences, are numerous in any genome and are co-dominant.
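Co-dominance matters because heterozygotes can be told apart from homozygotes, so allele frequencies and heterozygosity can be counted directly. The sketch below shows one common way to summarize diversity at a single SSR locus (expected and observed heterozygosity). The genotypes are hypothetical and the function names are illustrative; nothing here is taken from the Coffea data sets.

```python
from collections import Counter

# Hypothetical SSR genotypes at one locus: each tuple holds the two allele
# sizes (in bp) scored for one diploid individual.
genotypes = [(150, 150), (150, 154), (154, 158), (150, 158)]


def allele_frequencies(genotypes):
    """Count every allele copy and convert the counts to frequencies."""
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def expected_heterozygosity(genotypes):
    """Gene diversity: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in allele_frequencies(genotypes).values())


def observed_heterozygosity(genotypes):
    """Share of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


he = expected_heterozygosity(genotypes)
ho = observed_heterozygosity(genotypes)
print(f"He = {he:.3f}, Ho = {ho:.3f}")
```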
11. Molecular markers
A large set of molecular markers, SSRs and SNPs, has been established.
These markers are compiled in two public databases: MoccaDB and SGN
Plechakova et al. BMC Plant Biology, 2009; 9: 123.*
Mueller et al. Pl. Physiol. 2005 138: 1310-1317
12. Genetic maps
A C. canephora saturated genetic map.
SSRs, SNPs and BACs were used to construct this map.
The present international map contains ≈3000 markers, mainly SNPs
No saturated C. arabica genetic map is available yet.
From: de Kochko et al. Advances in Botanical Research; 2010, 53: 23-63.*
13. Genome size
Coffea genome sizes vary up to twofold.
From: Cross et al. Can. J. Bot. 73: 14-20; Noirot et al. Ann Bot 92: 709-714*; Razafinarivo et al. TGG in press (December 2012 issue)*
14. Chromosome organization
From: Hamon et al. Chr. Res. 2009, 17: 291-304*
Schematic representation of chromosomes in different Coffea species. 5S rDNA is shown in green, 18S rDNA in red.
West and Central African species, like the Malagasy ones, have 1 satellite chromosome, while East African species have two.
The genus presents a differential chromosome structural organization
15. Genome size and structure
There is a geographically related divergence in genome size and chromosome organization
[Figure labels, species code (value): DEW (1.41), LIB (1.41), HUM (1.76), EUG (1.36), HET (1.74), CAN (1.45), PSE (1.13), RAC (1.03), MIL (1.32), TET (1.07)]
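The spread among the ten values shown on this slide can be checked in a few lines. The codes and numbers are copied from the slide; the unit is not stated there, so only the ratio is computed. For these ten species the ratio comes out at about 1.7.

```python
# Values as printed on the slide; the unit is not stated in the source.
genome_size = {
    "DEW": 1.41, "LIB": 1.41, "HUM": 1.76, "EUG": 1.36, "HET": 1.74,
    "CAN": 1.45, "PSE": 1.13, "RAC": 1.03, "MIL": 1.32, "TET": 1.07,
}

smallest = min(genome_size, key=genome_size.get)
largest = max(genome_size, key=genome_size.get)
ratio = genome_size[largest] / genome_size[smallest]

print(f"{largest} / {smallest} = {ratio:.2f}")
```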
16. EST and RNASeq
Publicly available ESTs: 254,474 Sanger ESTs in total
Mainly originating from the two cultivated species:
174,275 ESTs for C. arabica, from different organs and tissues and from rust-infected leaves
69,066 ESTs for C. canephora, also from different organs and tissues
10,838 ESTs for C. racemosa, a drought-tolerant wild East African species
295 are from other sources: hybrid plants, and only 18 from C. eugenioides, a putative parent of C. arabica
Non-publicly available and NGS cDNA sequences are much more numerous, e.g. the C. canephora sequencing consortium project produced 130 x 10^6 Illumina reads.
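The per-species counts add up to the stated total, which can be verified directly. The sketch assumes that the 18 C. eugenioides ESTs are counted within the 295 "other sources" entry, since only that reading reproduces the total; the dictionary keys are illustrative labels.

```python
public_ests = {
    "C. arabica": 174_275,
    "C. canephora": 69_066,
    "C. racemosa": 10_838,
    "other sources": 295,  # assumed to include the 18 C. eugenioides ESTs
}

total = sum(public_ests.values())
shares = {name: 100 * n / total for name, n in public_ests.items()}

print(total)
for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```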
17. BAC libraries
For C. canephora:
One BAC library from the genotype 126, an improved cultivar. DNA digested with HindIII
Two libraries from the genotype HD200-94, a doubled haploid used for genome sequencing. DNA digested with HindIII and BstYI.
Leroy et al. 2005; TAG. 111: 1032-1041; de Kochko et al. 2010; Ad. Bot. Res. 53: 23-63*
For C. arabica:
One library from the variety IAPAR59, an improved variety. DNA digested with HindIII
One library from the Mokka variety. DNA digested with HindIII
Noir et al. 2004; Theor. Appl. Genet. 109: 225-230; Jones et al. 2005; 21st ASIC conference
BAC libraries have been built exclusively for the two cultivated species
19. General structure of Class II elements - DNA transposons
Transposable elements
ITR = Inverted Terminal Repeat
[Diagram: ITR, Transposase, ITR; terminal sequences CAGC... ...GCTG on one strand and GTCG... ...CGAC on the other]
[Diagram labels: MITE; autonomous copy; trans ORF]
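The terminal sequences in the diagram illustrate what "inverted" means: read on the same strand, the right end (GCTG) is the reverse complement of the left end (CAGC). A minimal sketch of that check follows. The toy element is hypothetical apart from its two 4-bp ends, and the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def has_terminal_inverted_repeat(element: str, length: int) -> bool:
    """True if the last `length` bases are the reverse complement of the first."""
    element = element.upper()
    return element[-length:] == reverse_complement(element[:length])


# Hypothetical element: only the 4-bp ends (CAGC ... GCTG) come from the slide.
toy_element = "CAGC" + "ATGGCATTACGA" + "GCTG"

print(reverse_complement("CAGC"), has_terminal_inverted_repeat(toy_element, 4))
```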
20. Class I transposable elements: Retrotransposons
Transposable elements
Structure of an LTR retrotransposon
[Diagram: LTR 5', UTR, gag, pol, UTR, LTR 3']
gag = capsid protein (group-specific antigen)
pol = polyprotein; contains all the functions for the element replication (polymerase)
UTR = untranslated region
21. The other Class I elements: LINEs and SINEs (retroposons or non-LTR retroelements)
Transposable elements
[Diagram labels: SINEs, gag, LINEs, pol]
22. Coffea Transposable elements
Identification and use of transposable elements in Coffea have been initiated only recently.
Identification of TE cassettes in ESTs and unigenes. Lopes et al. 2008, Mol. Genet. Geno. 279: 385-401
Identification of a MITE inserted in an intron and its use for diversity study. Guyot et al. 2009, BMC Pl. Biol. 9:22*; Dubreuil-Tranchant et al. Int. J. Evol. Biol. 2011 ID 358412*
Identification and use of full-length LTR retrotransposons for diversity study. Hamon et al. 2011, Mol. Genet. Geno. 285: 447-460*
Identification of full-length transposable elements in BAC clones. Cenci et al. 2012, Pl. Mol. Biol. 78: 135-145
Identification of LTR retrotransposons in BAC-ends and NGS reads. Dubreuil-Tranchant et al. 2012, 2nd ICTE*; Dias et al. 2013 21st PAG*
23. Coffea Transposable elements
How to use transposable elements for diversity studies
Multi-locus approaches for analyzing transposon insertions
RBIP: Retrotransposon-Based Insertional Polymorphism
REMAP: REtrotransposon-Microsatellite Amplified Polymorphism
S-SAP: Sequence-Specific Amplified Polymorphism
[Diagram labels: LTR-retrotransposon, microsatellite repeats, restriction site]
24. Coffea Transposable elements
Using a MITE for polymorphism survey
From: Guyot et al. 2009 BMC Pl. Biol. 9:22*
From: Dubreuil-Tranchant et al. Int. J. Evol. Biol. 2011 ID 358412*
Intra-C. canephora Alex-1 polymorphism at the g3 locus:
25. Coffea Transposable elements
First full-length LTR retrotransposons identified in Coffea
Divo: 4396 bp, LTR pair identity 94.5%
Nana: 5749 bp, LTR pair identity 90.5%
Hamon et al. 2011, Mol. Genet. Geno. 285: 447-460*
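LTR pair identity is often read as a rough proxy for insertion age, because the two LTRs of an element are identical when it inserts and then diverge independently. The sketch below shows one common way to turn identity into a relative age: a Jukes-Cantor distance divided by twice a substitution rate. This is not stated to be the method used in the cited paper, and the rate is a placeholder assumption, so only the ordering of the two elements should be read from it, not the absolute values.

```python
import math


def jukes_cantor_distance(identity: float) -> float:
    """Substitutions per site, corrected for multiple hits (Jukes-Cantor)."""
    p = 1.0 - identity  # observed proportion of differing sites
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


def insertion_age_years(identity: float, rate_per_site_per_year: float) -> float:
    """Age = K / (2r): both LTRs accumulate substitutions independently."""
    return jukes_cantor_distance(identity) / (2.0 * rate_per_site_per_year)


ASSUMED_RATE = 1.3e-8  # placeholder substitution rate, NOT from the source

ltr_identity = {"Divo": 0.945, "Nana": 0.905}  # values quoted on the slide
ages = {name: insertion_age_years(i, ASSUMED_RATE) for name, i in ltr_identity.items()}

for name, age in ages.items():
    print(f"{name}: about {age / 1e6:.1f} million years under the assumed rate")
```

Under any fixed rate, the lower identity of Nana implies an older insertion than Divo.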
26. Coffea Transposable elements
Diversity of insertion patterns:
resolves Coffea species lineages
reveals intra-LIB and intra-CAN differentiation
Hamon et al. 2011, Mol. Genet. Geno. 285: 447-460*
28. Synteny studies: at the micro level
From: Guyot et al. 2009 BMC Pl. Biol. 9:22*
29. Synteny studies: at the micro level
At the micro level, both studies show good conservation of synteny, despite and independently of the divergence time between species
From: Guyot et al. 2012 BMC Genomics 13:103*
From: Guyot et al. 2009 BMC Pl. Biol. 9:22*
30. Synteny studies: at the macro level
Macrosyntenic relationships between each of the 11 coffee Linkage Groups and the 19 grape Linkage Groups based on mapped coffee COSII loci.
From: Guyot et al. 2012 BMC Genomics 13:103*
Thanks to a set of 867 COSII markers, macrosynteny was detected between coffee, tomato and grapevine. While coffee and tomato genomes share 318 orthologous markers and 27 conserved syntenic segments (CSSs), coffee and grapevine share 299 syntenic markers and 29 CSSs.
31. Synteny studies: at the macro level
Macrosyntenic relationships between each of the 11 coffee Linkage Groups and the 12 tomato Linkage Groups based on mapped coffee COSII loci.
From: Guyot et al. 2012 BMC Genomics 13:103*
32. Synteny studies: Conclusion
Significant conservation is found between distantly related species from the Asterid and Rosid clades, at the genome macrostructure and microstructure levels.
Time alone doesn't explain the observed divergences
Synteny analyses between supposedly remote species are very useful for isolating agronomically important genes.
From: Guyot et al. 2009 BMC Pl. Biol. 9:22*; Guyot et al. 2012 BMC Genomics 13:103*
34. Phylogenetic analyses of Coffea
From: Davis et al. Bot. J. Linnean Soc. 2011, 167, 357–377.
Combined plastid–ITS Bayesian majority rule consensus phylogenetic tree
35. Phylogenetic analyses of Coffea
Combined plastid–ITS maximum likelihood phylogenetic tree
Whatever the method of analysis, these results do not allow conclusions to be drawn on Coffea evolution, because the different clades are not hierarchically resolved.
[Tree labels: AW, Ex-PSI, AC, AE, MAS, MAD]
36. Phylogenetic analyses of Coffea
20 COS partially sequenced (exons + intron)
72 Coffea species
1st divergence: ex-Psilanthus
2nd divergence: 3 non-hierarchized clades: Baracoffea / Mascarocoffea / Africa.
[Tree labels: Psilanthus; Baracoffea; Madagascar; Mascarene; Madagascar and Comoros; East Africa; East Africa; West and Central Africa]
37. Phylogenetic analyses of Coffea
A hypothesis on Coffea origin and evolution:
[Diagram labels: Psi-Coffea common ancestor; Psi; Bar; Coffea]
39. Genome sequencing
Plant Material:
The sequenced genotype belongs to the C. canephora species. C. canephora was chosen because it is diploid, unlike C. arabica, which is an allotetraploid.
The sequenced plant is a doubled haploid (mixoploid) plant produced by IRD from a haploid embryo and maintained in tropical greenhouses in Montpellier (France).
40. Genome sequencing
Sequencing Strategy:
Two steps:
Produce a first assembly with:
454 reads, single-end and mate-pair (8 and 20 kb span)
Sanger sequencing of BAC ends
Correct the assembly with Illumina sequencing, single and paired-end reads
41. Genome sequencing
Assembly results:
13,345 scaffolds, largest scaffold 9.Mb
N50 = 1.2 Mb
N80 = 65 kb
Coverage reached: 28.8 X 454; 69.7 X Illumina; 0.3 X Sanger; Total = 98.8 X
Total length assembled: 568.6 Mb (80% of the 710 Mb estimated size)
[Diagram labels: Contigs; Reads of different origin; Consensus; Paired-end / mate-pair reads; Gaps = span of paired-end or mate-pair fragments; Scaffolds]
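For readers unfamiliar with N50 and N80, the sketch below shows how such statistics are usually computed, and re-derives the coverage total and assembled fraction from the numbers above. The scaffold lengths are hypothetical toy values, not the real assembly, and the function name `nx` is illustrative.

```python
def nx(lengths, fraction):
    """Length L such that scaffolds of length >= L hold `fraction` of the total."""
    threshold = fraction * sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical scaffold lengths in Mb, for illustration only.
toy_scaffolds_mb = [9, 5, 4, 3, 2, 1]
n50 = nx(toy_scaffolds_mb, 0.5)
n80 = nx(toy_scaffolds_mb, 0.8)

# Figures quoted on the slide.
coverage = {"454": 28.8, "Illumina": 69.7, "Sanger": 0.3}
total_coverage = sum(coverage.values())
assembled_fraction = 568.6 / 710  # assembled Mb / estimated genome size in Mb

print(n50, n80, f"{total_coverage:.1f}X", f"{assembled_fraction:.0%}")
```

N80 is always less than or equal to N50, which matches the 65 kb and 1.2 Mb values reported for the assembly.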
42. Genome sequencing
Automatic annotation results:
Number of genes: 25574
Number of genes without intron: 5004
Size in nt. (mean : med.): 3684.33 : 2788
Exon number / gene (mean : med.): 5.10 : 4
CDS size in nt. (mean : med.): 1205.55 : 1002
Coding coverage: 30,830,841 (5.4%)
Intron number: 104,944
Intron size in nt. (mean : med.): 483.20 : 208
% contigs with at least one gene (% in bases): 16.6% (82.3%)
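Several of these annotation figures can be cross-checked against each other. The sketch below assumes that the coding coverage percentage is taken relative to the 568.6 Mb assembled length reported in the assembly results; the variable names are illustrative.

```python
genes = 25_574
mean_cds_nt = 1205.55
coding_nt = 30_830_841
introns = 104_944
mean_exons_per_gene = 5.10
assembled_nt = 568.6e6  # assumed denominator: total assembled length

# Gene count times mean CDS size should reproduce the coding coverage.
coding_from_means = genes * mean_cds_nt

# Coding coverage as a percentage of the assembly.
coding_share = 100 * coding_nt / assembled_nt

# A gene with k exons has k - 1 introns, so the means should differ by one.
introns_per_gene = introns / genes

print(f"{coding_from_means:,.0f} nt, {coding_share:.1f}%, {introns_per_gene:.2f} introns per gene")
```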
43. Genome sequencing
Further steps:
Anchor the physical map (assembly) to the international genetic map (≈3000 SNP markers)
Manually annotate some genes from particular Coffea pathways (Caffeine…)
Comparative genomics
Many other possible analyses
Publish!
Shulaev V. et al (2011) Nature Genet. 43: 109–116
Coffea canephora
44. Acknowledgements
Evodyn Team members: Perla HAMON, Romain GUYOT, Christine DUBREUIL-TRANCHANT, Valérie PONCET, Serge HAMON
Students, trainees and visitors, among them: N. Razafinarivo, P.O. Duroy, C. Duret, A. Guellim, M. de la Mare, S. Akaffou, P. Mafra Almeida da Costa, C. Gomez, Elaine Dias …
Collaborators: Dominique CROUZILLAT, Michel RIGOREAU, Emmanuel COUTURON, Claudia CARARETO, Spencer BROWN, Michael BOURGE, Vincent LEFORT, Olivier GASCUEL, Olivier CORITON, Sonja SILJAK-YAKOVLEV, Odile ROBIN, Saranya SRISUWAN, Aaron DAVIS, Philippe BARRE, and many more
The International Coffee sequencing consortium: Victor A. ALBERT (USA), Alan C. ANDRADE (BRE), Xavier ARGOUT (FR), Benoit BERTRAND (FR), Alexandre de KOCHKO (FR), Giovanni GIULIANO (ITA), Giorgio GRAZIOSI (ITA), Robert HENRY (AUS), JAYARAMA (IND), Philippe LASHERMES (FR), Ray MING (USA), Chifumi NAGAI (USA), Steve ROUNSLEY (USA), David SANKOFF (CAN), Patrick WINCKER (FR)