Evolutionary Genetics of Complex Genome
1. Evolutionary Genetics of a
Complex Plant Genome
Jeffrey Ross-Ibarra
@jrossibarra • www.rilab.org
Dept. Plant Sciences • Center for Population Biology • Genome Center
University of California Davis
21. Ne individuals, µ beneficial mutation rate per trait
bigger genome, larger mutation target, higher µ
predict that larger genomes adapt via
standing variation, noncoding variants
selection from standing variation when 2Neµ > 1
larger % of µ should be noncoding
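The rule of thumb on this slide can be sketched in a few lines of Python. This is only an illustration of the inequality 2Neµ > 1; the helper `trait_mutation_rate`, which scales µ linearly with the size of the mutational target, and all numeric values are assumptions made for the example, not estimates from the talk.

```python
def scaled_mutation_rate(ne: float, mu: float) -> float:
    """Population-scaled beneficial mutation rate, 2 * Ne * mu."""
    return 2.0 * ne * mu


def favors_standing_variation(ne: float, mu: float) -> bool:
    """Slide's rule of thumb: selection from standing variation when 2*Ne*mu > 1."""
    return scaled_mutation_rate(ne, mu) > 1.0


def trait_mutation_rate(per_bp_rate: float, target_bp: float) -> float:
    """Hypothetical helper: mu grows linearly with the mutational target size."""
    return per_bp_rate * target_bp


if __name__ == "__main__":
    # Illustrative values only, not estimates for any species.
    ne = 1e4
    for target_bp in (1e3, 1e5):
        mu = trait_mutation_rate(1e-8, target_bp)
        print(target_bp, scaled_mutation_rate(ne, mu), favors_standing_variation(ne, mu))
```

With everything else held fixed, a larger target pushes 2Neµ across the threshold, which is the intuition behind the prediction for larger genomes.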
24. tb1
[Figure: chromosome map (1-10) with tb1 marked. Inset: constructs and corresponding normalized luciferase expression (relative expression, 0 to 2.0). Transient assays were performed in maize leaf protoplast. The construct backbone consists of the minimal promoter from the cauliflower mosaic virus (mpCaMV, gray box), the luciferase ORF (luc, white box) and the nopaline synthase terminator. Portions of the proximal and distal components of the control region (hatched boxes) from maize and teosinte (M-dist, T-prox, M-prox) were cloned upstream of the minimal promoter; the ∆M-dist and ∆M-prox constructs lack either the Tourist or Hopscotch element. Horizontal green bars show the normalized mean with s.e.m. Caption partly truncated in the source.]
Studer et al. 2011 Nature Genetics; Vann et al. 2015 PeerJ
25. tb1
[Figure: chromosome map (1-10) with tb1 marked.]
Figure 2 Map of parviglumis populations and Hopscotch allele frequency. Map showing the frequency of the Hopscotch allele in populations of parviglumis where we sampled more than 6 individuals. Size of circles reflects number of individuals sampled. The Balsas River is shown, as the Balsas River Basin is believed to be the center of domestication of maize.

… as our independent trait for phenotyping analyses. SAS code used for analysis is available at http://dx.doi.org/10.6084/m9.figshare.1166630.

RESULTS
Genotyping for the Hopscotch insertion
The genotype at the Hopscotch insertion was confirmed with two PCRs for 837 individuals of the 1,100 screened (Table S1 and Table S2). Among the 247 maize landrace accessions genotyped, all but eight were homozygous for the presence of the insertion. Within our parviglumis and mexicana samples we found the Hopscotch insertion segregating in 37 (n = 86) and four (n = 17) populations, respectively, and at highest frequency within populations in the states of Jalisco, Colima, and Michoacán in central-western Mexico (Fig. 2). Using our Hopscotch genotyping, we calculated differentiation between populations (FST) and subspecies (FCT) for populations in which we sampled sixteen or more chromosomes. We found that FCT = 0, and levels of FST among populations within each subspecies (0.22) and among all populations (0.23) (Table 1) are similar to genome-wide estimates from previous studies (Pyhäjärvi, Hufford & Ross-Ibarra, 2013). Although we found large variation in Hopscotch allele frequency among our populations, BayEnv analysis did not indicate a correlation between the Hopscotch insertion and environmental variables (all Bayes Factors < 1).
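As a rough illustration of what an FST value such as 0.22 summarizes, the sketch below computes FST for a single biallelic locus from per-population allele frequencies using the heterozygosity form (HT - HS) / HT with equal population weights. The excerpt does not say which estimator was used, so this simple form and the function name `fst_from_frequencies` are assumptions for the example.

```python
from statistics import mean


def fst_from_frequencies(freqs):
    """FST = (HT - HS) / HT for one biallelic locus, equal population weights.

    freqs: allele frequency of one allele (e.g. an insertion) in each population.
    """
    freqs = list(freqs)
    if not freqs:
        raise ValueError("need at least one population frequency")
    # Mean expected heterozygosity within populations.
    h_s = mean(2.0 * p * (1.0 - p) for p in freqs)
    # Expected heterozygosity of the pooled population.
    p_bar = mean(freqs)
    h_t = 2.0 * p_bar * (1.0 - p_bar)
    if h_t == 0.0:
        return 0.0  # locus is monomorphic overall, no differentiation to measure
    return (h_t - h_s) / h_t
```

Identical frequencies across populations give 0, and populations fixed for alternative alleles give 1. Sample-size corrections and the hierarchical FCT level are deliberately left out of this sketch.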
Studer et al. 2011 Nature Genetics.; Vann et al. 2015 PeerJ
26. tga1, tb1
[Figure: chromosome map (1-10) with tga1 and tb1 marked. Inset: Figure 1, Phenotypes (caption lines truncated in the source): a. Maize ear showing the cob (cb) exposed at top. b. Teosinte e… internode (in) and glume (gl) labeled. c. Teosinte ear from a plant with a m… introgressed into it. d. Close-up of a single teosinte fruitcase. e. Close-up o… teosinte plant with a maize allele of tga1 introgressed into it. f. Ear of maiz… (Tga1-maize allele) with the cob exposed showing the small white glumes a… of maize inbred W22:tga1 which carries the teosinte allele, showing enlarge… h. Ear of maize inbred W22 carrying the tga1-ems1 allele, showing enlarged g…]
Wang et al. 2005 Nature
Wang et al 2015 Genetics
30. gt1, tga1, tb1
[Figure: chromosome map (1-10) with gt1, tga1 and tb1 marked. Inset: panels A and B comparing genotype classes T/T, M/T and M/M for the 5’ control region and the 3’ UTR.]
Wills et al. 2013 PLoS Genetics
31. hard sweep
M T N P H R L
GGTCGA ATG ACT GAT CCA CAT CGA CTG TAG
tga1
gt1
tb1
Multiple
Mutations
Standing
Variation
M T G P H R L
GGTAAA ATG ACT GGT CCA CAT CGA CTG TAG
33. genomes: 13 teosinte, 23 maize
5-10% selected regions intergenic
Hufford et al. 2012 Nat. Gen.
Chia et al. 2012 Nat. Gen.
34. [Figure: Expression and Genealogy panels for teosinte and maize. The accompanying caption (Venn diagram of genes in regions with evidence for selective sweeps, coexpression networks for three genes, and a selective sweep on chromosome 9) is truncated in the source.]
• ~500 selected regions
• 11M shared vs 3000 fixed SNPs
• Candidates differentially expressed, decreased expression variation
selection on regulatory sequence, standing variation
Hufford et al. 2012 Nat. Gen.
Swanson-Wagner et al. 2012 PNAS
40. [Figure: Ear Height, Plant Height, Tassel Br. Number and Days to Anthesis for Lowland and Highland populations, SA and MEX.]
Beissinger et al. Unpublished
41. 95 samples, ~100K SNPs
[Figure 2: Demographic models of maize low- and highland populations (Model IA, Model IB, Model II; populations: Mexico Lowland, Mexico Highland, mexicana, SA Lowland, SA Highland). Parameters in bold were estimated in this study. See text for details.]

A HWE cut-off of P < 0.005 was used for each subpopulation due to our under-calling of heterozygotes. In total, we included 18,745 silent SNPs for the Mexican populations in Models IA and IB, 14,508 for the S. American populations in Model I and 11,305 for the Mexican lowland population and the S. American populations in Model II. We obtained similar results under more or less stringent thresholds for significance (P < 0.05 ∼ 0.0005; data not shown), though the number of SNPs was very small at P < 0.005. Demographic parameters were inferred with the software ∂a∂i (Gutenkunst et al. 2009), …

Table 2 Inference of demographic parameters
Mexico            Model I     Model II
Likelihood        5592.80     4654.79
α                 0.92        1.5
(label lost)      0.38        0.76
(label lost)      1           1
South America     Model I     Model III
Likelihood        3855.28     8044.71
α                 0.52        1.0
(remaining rows not recoverable from the source: "0.97 1 0.64", "88 2 0.95", "54")

[Figure: Population structure, panels A and B. Observed and expected density plots (density scale 10^-4 to 10^-1), lowlands vs highlands, for Mexico and South America.]
Takuno et al. 2015 Genetics
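The HWE cut-off quoted in the excerpt above could be applied per SNP along the following lines. This is a minimal sketch that assumes a one-degree-of-freedom chi-square goodness-of-fit test and assumes that SNPs with P below the cut-off are dropped; the excerpt does not state which test was used, and the function names are hypothetical.

```python
import math


def hwe_chi2_pvalue(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P-value of a 1-df chi-square test for Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic site: nothing to test
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_hwe_filter(n_aa: int, n_ab: int, n_bb: int, cutoff: float = 0.005) -> bool:
    """Assumed filter direction: keep the SNP unless P falls below the cut-off."""
    return hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= cutoff
```

A site with a strong heterozygote deficit, which is what under-calling of heterozygotes would produce, fails the filter, while a site in exact Hardy-Weinberg proportions passes.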
47. hard sweeps in genes play minor role in Zea
[Figure: nucleotide diversity vs distance to nearest substitution (cM)]
Beissinger et al. 2016 Nature Plants (pending rev)
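For readers unfamiliar with the quantity plotted here, nucleotide diversity can be computed from aligned sequences as the average number of pairwise differences per site. The sketch below is a generic textbook calculation, not the pipeline behind the cited figure; the function name and the toy input format (equal-length strings) are assumptions.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site across aligned, equal-length sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty and equal in length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching sites for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)
```

Computing this statistic in windows ordered by genetic distance from the nearest substitution would give a curve of the kind sketched on the slide.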
51. GWAS candidate SNPs; Variance Partitioning
Wallace et al. 2014 PLoS Genetics
Rodgers-Melnick et al. 2016 PNAS
52. how to adapt: Zea mays
M T G P H R L
GGTAAA ATG ACT GGT CCA CAT CGA CTG TAG
noncoding/regulatory variation
multiple
mutations
“soft” sweeps
standing
variation
56. [Figure: diversity vs distance from substitution]
20% nonsyn. adaptive    10% nonsyn. adaptive
50% nonsyn. adaptive    40% nonsyn. adaptive
Sattath et al. 2011 PLoS Gen.
Williamson et al. 2014 PLoS Gen.
Hernandez et al. 2011 Science
Ross-Ibarra et al. 2009 Genetics
57. [Figure: diversity vs distance from substitution]
µ ∝ 2,500 Mbp    µ ∝ 3,100 Mbp
µ ∝ 130 Mbp    µ ∝ 220 Mbp
Sattath et al. 2011 PLoS Gen.
Williamson et al. 2014 PLoS Gen.
Hernandez et al. 2011 Science
Ross-Ibarra et al. 2009 Genetics
58. larger genomes enriched in noncoding adaptive variants
[Figure: enrichment (no <-> yes) for intergenic, synonymous and nonsynonymous sites; enrichment (intergenic <-> coding)]
Pyhäjärvi et al. GBE 2013
Hancock et al 2011 Science
Fraser et al. 2013 Gen. Research
59. larger genomes enriched in noncoding adaptive variants
[Figure: excess adaptive SNPs; enrichment (intergenic <-> coding)]
Pyhäjärvi et al. GBE 2013
Hancock et al 2011 Science
Fraser et al. 2013 Gen. Research
63. Mu
KNOTTED1
kn1
Greene, et al., 1994
http://pmb.berkeley.edu/sites/default/files/users/Knotted1%20mutant.jpg
Doebley 2004, Studer et al., 2011
tb1
Hopscotch
ZmCCT
CACTA
Yang et al., 2013
71. Inv4n, Inv9d, Inv9e, Inv1n
[Figure: inversion frequency (0.0 to 0.8) against elevation (0 to 2000 m). Figure S4: LD in chromosome 9 among mexicana populations based on SNPs with minor allele frequency >0.1.]
Fang et al. Genetics 2012
Pyhäjärvi et al. GBE 2013
72. Lauter et al. 2004 Genetics
Inv4n
mexicana parviglumis
Fang et al. Genetics 2012
Hufford et al. PLoS Genetics 2013
culm diameter
macrohairs, anthocyanin
Inv1n
78. [Figure: 1C genome size (Gb), 2.50 to 3.75, for MH, ML, SAH, SAL, mexicana and parviglumis; altitude legend: highland, lowland.]
Bilinski et al. In Prep
81. mixed model for selection on genome size
genome size = mean + slope (selection) × altitude + kinship + error
β1 < 0
11 Mb decrease per 100 meters gained
Bilinski et al. In Prep
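The slope reported on this slide lends itself to a back-of-the-envelope calculation. The sketch below simply extrapolates the stated 11 Mb decrease per 100 m linearly; it ignores the kinship and error terms of the mixed model, and both the function name and the assumption that the trend holds across the whole altitude range are illustrative only.

```python
# Slope taken from the slide: 11 Mb decrease per 100 m of altitude gained.
SLOPE_MB_PER_100_M = -11.0


def expected_genome_size_change_mb(delta_altitude_m: float,
                                   slope_mb_per_100m: float = SLOPE_MB_PER_100_M) -> float:
    """Linear extrapolation of the altitude effect on genome size, in Mb."""
    return slope_mb_per_100m * delta_altitude_m / 100.0


if __name__ == "__main__":
    # Hypothetical altitude differences, chosen only to show the scaling.
    for delta in (100, 1000, 2000):
        print(delta, expected_genome_size_change_mb(delta))
```

Under these assumptions a 2,000 m gain in altitude corresponds to a difference of about 220 Mb, small relative to the 2.5 to 3.75 Gb range of 1C values shown for these taxa.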
85. smaller genome, faster development?
[Figure: late flowering vs early flowering; remaining axis labels truncated in the source.]
[Excerpt from the cited cell cycle study, body text truncated in the source. Recoverable captions: Fig. 2, DNA C-value (pg) and cell cycle time (h) in the root apical meristem of a range of (A) eudicots and monocots (n = 110), and (B) eudicots (n = 60). Fig. 3, DNA C-value (pg) and cell cycle time (h) in the root apical meristem of a range of diploid and polyploid angiosperms. Table 2, regression analyses of all data presented in Figs. 2 to 4.]
Rayburn et al. 1994 Plant Breeding
Francis et al. 2008. Ann. Bot.
87. [Figure: 1C genome size (Gb), 2.50 to 3.75, for MH, ML, SAH, SAL, mexicana and parviglumis; altitude legend: highland, lowland.]
Bilinski et al. In Prep
90. Evolutionary Genetics in a Complex Genome
• Adaptation in maize occurs from standing variation and targets regulatory variants
• Large genomes may have more targets, more standing variation, and more regulatory adaptation
• Adaptation in complex plant genomes likely involves many kinds of variation including transposable elements, inversions, copy number variation, and even genome size?
Kew C-Value Database
91. photo by lady_lbrty
Acknowledgments
Maize Diversity Group
Peter Bradbury
Ed Buckler
John Doebley
Theresa Fulton
Sherry Flint-Garcia
Jim Holland
Sharon Mitchell
Qi Sun
Doreen Ware
Collaborators
CSI Davis
Nathan Springer
Lab Alumni
Tim Beissinger (USDA-ARS, Mizzou)
Kate Crosby (Monsanto)
Matt Hufford (Iowa State)
Tanja Pyhäjärvi (Oulu)
Shohei Takuno (Sokendai)
Joost van Heerwaarden (Wageningen)