Genetic Linkage Analysis

Lecture 12

Genetic Linkage Analysis and
Map Construction

Experiments with Plant Hybrids (1866)
Seed shape: 5474 round vs 1850 wrinkled
Cotyledon color: 6022 yellow vs 2001 green
Seed coat color: 705 grey-brown vs 224 white
Pod shape: 882 inflated vs 299 constricted
Unripe pod color: 428 green vs 152 yellow
Flower position: 651 axial vs 207 terminal
Stem length: 787 long (185-230 cm) vs 277 short (20-50 cm)
Rediscovered in 1900

Ear length of maize (East 1911)

P1: 7cm; P2: 17cm
One locus
a=(17-7)/2=5; F2: 1/4 aa (7) + 2/4 Aa (12) + 1/4 AA (17)
Two loci
a=(17-7)/4=2.5
F2: 1/16 (7) + 4/16 (9.5) + 6/16 (12) + 4/16 (14.5) +1/16 (17)
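
The F2 classes above follow a binomial pattern. A minimal sketch, assuming k unlinked loci with equal, purely additive effects (the function name `f2_distribution` is hypothetical, not from the lecture):

```python
from fractions import Fraction
from math import comb

def f2_distribution(p1, p2, k):
    """F2 classes for k unlinked loci with equal, purely additive effects.

    Returns (frequency, phenotype) pairs, one per count of 'increasing' alleles.
    """
    a = Fraction(p2 - p1, 2 * k)          # effect of one allele substitution
    return [(Fraction(comb(2 * k, i), 4 ** k), float(p1 + i * a))
            for i in range(2 * k + 1)]

for freq, value in f2_distribution(7, 17, 2):
    print(freq, value)
```

With k = 1 the same call gives the three one-locus classes (7, 12, 17).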

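$$P_1 = \mu - ka, \qquad P_2 = \mu + ka$$
$$k = \frac{(P_1 - P_2)^2}{8\left[V_{F_2} - \frac{1}{2}(V_{P_1} + V_{P_2})\right]}$$
$$V_A = \sum \frac{1}{2}a^2 = \frac{1}{2}ka^2$$
$$k = \frac{(P_1 - P_2)^2}{8V_A}$$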

Mendel and Fisher
Annals of Science 1:115-
Fisher: the observed ratios are so close to the values that Mendel
expected under his theory that there must have been some
manipulation, or omission, of data
Dominant trait: 1/3 AA + 2/3 Aa
Family size: 10
Non-segregating (AA) :
Segregating (Aa) = 1:2 (Mendel)
Fisher: Pr{Aa family classified as AA} = 0.75^10 = 0.0563
Pr{family classified as segregating (Aa)} = 2/3 × (1 - 0.0563) = 0.6291
Expected Non-segregating (AA) : Segregating (Aa) = 0.3709 : 0.6291
= 1 : 1.6961
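
Fisher's correction can be recomputed in a few lines. This is a sketch of the arithmetic on this slide only; the function name and its default arguments are assumptions made for illustration:

```python
def classification_ratio(family_size=10, p_aa=1 / 3):
    """Expected classification of dominant F2 plants from progeny tests.

    An Aa parent is misread as AA when none of its `family_size` offspring
    shows the recessive phenotype (probability 0.75 per offspring).
    """
    miss = 0.75 ** family_size                 # Aa family looks non-segregating
    segregating = (1 - p_aa) * (1 - miss)      # Aa parents correctly detected
    non_segregating = 1 - segregating          # true AA plus misclassified Aa
    return miss, non_segregating, segregating

miss, non_seg, seg = classification_ratio()
print(f"{miss:.4f}  {non_seg:.4f} : {seg:.4f}  = 1 : {seg / non_seg:.4f}")
```

Larger families shrink the misclassification term, so the expected ratio moves back toward 1 : 2.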

Genetic populations and pair-
wise linkage analysis


Populations handled in QTL IciMapping
Twenty bi-parental populations derived from Parent P1 and Parent P2
(diagram legend: hybridization, selfing, repeated selfing, doubled haploids)
1. P1BC1F1     2. P2BC1F1     3. F1DH        4. F1RIL
5. P1BC1RIL    6. P2BC1RIL    7. F2          8. F3
9. P1BC2F1     10. P2BC2F1    11. P1BC2RIL   12. P2BC2RIL
13. P1BC1F2    14. P2BC1F2    15. P1BC2F2    16. P2BC2F2
17. P1BC1DH    18. P2BC1DH    19. P1BC2DH    20. P2BC2DH
Further backcrossing (BC3F1, BC4F1 etc.) with marker-assisted selection:
CSS lines or introgression lines
NAM: P1 × CP, P2 × CP, P3 × CP, ..., Pn × CP (CP = common parent) give
RIL family 1, RIL family 2, ..., RIL family n, which form one NAM population

Example: 10 RILs in a rice population
(Linkage map of Chr. 5)

Marker         C263  R830  R3166  XNpb387  R569  R1553  C128  C1402  XNpb81  C246  R2953  C1447  Grain width (mm)
Position (cM)  0.0   3.5   8.5    19.5     32.0  66.6   74.1  78.6   81.8    91.9  92.7   96.8
RIL1           0     0     0      0        0     0      0     0      0       0     0      0      2.33
RIL2           2     2     2      2        2     0      0     0      0       2     2      2      1.99
RIL3           0     2     2      2        2     2      2     2      2       2     2      2      2.24
RIL4           0     0     0      0        0     0      2     2      2       2     2      2      1.94
RIL5           0     0     0      0        0     2      2     0      0       0     0      0      2.76
RIL6           0     0     0      2        2     2      2     2      2       2     2      2      2.32
RIL7           0     0     0      0        0     0      0     0      0       0     0      0      2.32
RIL8           2     2     0      2        2     0      0     0      0       2     2      2      2.08
RIL9           0     0     0      0        2     2      0     0      0       0     0      0      2.24
RIL10          0     0     0      0        2     2      0     0      0       0     0      0      2.45

Genetic markers in linkage analysis
Morphological traits

hybridization experiments
Cytogenetic and biochemical
markers (e.g. isozymes)
DNA molecular markers
RFLP, SSR, SNP etc.

The four gametes (haplotypes) of an F1
P1: AABB (AB/AB)    P2: aabb (ab/ab)
F1: AaBb (AB/ab)
Gametes produced by meiosis in the F1:

Gamete     AB             Ab                aB                ab
Frequency  (1-r)/2        r/2               r/2               (1-r)/2
Type       Parental type  Recombinant type  Recombinant type  Parental type

Expected genotypic frequency in backcross
and DH populations
P1: AABB; P2: aabb


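MLE of recombination frequency
Likelihood function
$$L = \frac{n!}{n_1!\,n_2!\,n_3!\,n_4!}\left[\tfrac{1}{2}(1-r)\right]^{n_1}\left[\tfrac{1}{2}r\right]^{n_2}\left[\tfrac{1}{2}r\right]^{n_3}\left[\tfrac{1}{2}(1-r)\right]^{n_4} = C\,(1-r)^{n_1+n_4}\,r^{n_2+n_3}$$
Logarithm of likelihood
$$\ln L = \ln C + (n_1+n_4)\ln(1-r) + (n_2+n_3)\ln r$$
MLE of r
$$\hat{r} = \frac{n_2+n_3}{n_1+n_2+n_3+n_4} = \frac{n_2+n_3}{n}$$
Fisher information
$$I = -E\left(\frac{d^2\ln L}{dr^2}\right) = E\left[\frac{n_1+n_4}{(1-r)^2} + \frac{n_2+n_3}{r^2}\right] = \frac{n}{r(1-r)}$$
Variance of estimated r
$$V_{\hat{r}} = \frac{1}{I} = \frac{r(1-r)}{n}$$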

Significance test of linkage
Null hypothesis H0: r = 0.5 (no genetic linkage, or
locus A-a and B-b are independent)
Alternative hypothesis HA: r ≠ 0.5

Likelihood ratio test (LRT) or LOD score
$$\mathrm{LRT} = -2\ln\left[\frac{L(r=0.5)}{L(\hat{r})}\right] \sim \chi^2(df=1)$$
$$\mathrm{LOD} = \log_{10}\left[\frac{L(\hat{r})}{L(r=0.5)}\right]$$

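An example P1BC1 population
Genotypes of two inbred parents P1 and P2
are AABB and aabb
Observed samples of the four genotypes in
P1BC1: AABB 162, AABb 40, AaBB 41, AaBb 158
$$\hat{r} = \frac{40+41}{162+40+41+158} = \frac{81}{401} = 20.20\%$$
$$V_{\hat{r}} = \frac{\hat{r}(1-\hat{r})}{n} = 4.02\times10^{-4}$$

Test of linkage
Null hypothesis H0: r = 0.5
Alternative hypothesis HA: r ≠ 0.5

The constant C and the factor (1/2)^n are common to both likelihoods, so
$$\frac{L(\hat{r})}{L(r=0.5)} = \frac{(1-\hat{r})^{n_1+n_4}\,\hat{r}^{\,n_2+n_3}}{(1/2)^{n_1+n_2+n_3+n_4}} = 1.2\times10^{33}$$

Likelihood ratio test (LRT) (P<0.0001) and LOD
score
$$\mathrm{LRT} = 2\ln\left[\frac{L(\hat{r})}{L(r=0.5)}\right] = 152.37$$
$$\mathrm{LOD} = \log_{10}\left[\frac{L(\hat{r})}{L(r=0.5)}\right] = 33.09$$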
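
The backcross calculation can be checked numerically. The sketch below is an illustration, not the QTL IciMapping implementation; the function names are hypothetical. It keeps the 1/2 gamete factors in both likelihoods, so the ratio reduces to (1-r)^(n1+n4) r^(n2+n3) divided by (1/2)^n:

```python
from math import log

def xlogy(x, y):
    """x * ln(y), defined as 0 when x == 0 so that r = 0 or 1 is handled."""
    return x * log(y) if x > 0 else 0.0

def backcross_linkage(n1, n2, n3, n4):
    """MLE of r, its variance, LRT and LOD from the four BC/DH genotype counts.

    n1, n4 are the parental classes; n2, n3 are the recombinant classes.
    """
    n = n1 + n2 + n3 + n4
    rec = n2 + n3
    r = rec / n
    var = r * (1 - r) / n
    ln_alt = xlogy(n - rec, 1 - r) + xlogy(rec, r)   # ln L(r_hat), common terms dropped
    ln_null = n * log(0.5)                           # ln L(0.5), same terms dropped
    lrt = 2 * (ln_alt - ln_null)
    lod = lrt / (2 * log(10))
    return r, var, lrt, lod

r, var, lrt, lod = backcross_linkage(162, 40, 41, 158)
print(f"r = {r:.4f}, Vr = {var:.2e}, LRT = {lrt:.2f}, LOD = {lod:.2f}")
```

With the counts of this example it gives r of about 0.202, LRT of about 152.4 and LOD of about 33.1. As a sanity check, LOD can never exceed n × log10(2), which is about 120.7 for n = 401.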

Genotypic frequencies in RIL
populations, compared with DH

Genotype  DH theoretical frequency  RIL theoretical frequency
AABB      f1 = (1-r)/2              f1 = (1-R)/2
AAbb      f2 = r/2                  f2 = R/2
aaBB      f3 = r/2                  f3 = R/2
aabb      f4 = (1-r)/2              f4 = (1-R)/2

where R = 2r/(1+2r)

RIL     Marker 1 (C263)  Marker 2 (XNpb387)  Parent type or recombinant
RIL1    0 or A           0 or A              P1 type
RIL2    2 or B           2 or B              P2 type
RIL3    0 or A           2 or B              Recombinant
RIL4    0 or A           0 or A              P1 type
RIL5    0 or A           0 or A              P1 type
RIL6    0 or A           2 or B              Recombinant
RIL7    0 or A           0 or A              P1 type
RIL8    2 or B           2 or B              P2 type
RIL9    0 or A           0 or A              P1 type
RIL10   0 or A           0 or A              P1 type

n1 = 6, n2 = 2, n3 = 0, n4 = 2
R = 2/10 = 0.2
r = R/(2 - 2R) = 0.125
LRT = 3.85 (P ≈ 0.05)
LOD = 0.84
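
The conversion between the recombinant proportion R observed in RILs and the one-meiosis recombination frequency r follows from R = 2r/(1+2r). A small sketch, with hypothetical function names:

```python
def meiotic_to_ril(r):
    """Proportion of recombinant RILs expected for recombination frequency r."""
    return 2 * r / (1 + 2 * r)

def ril_to_meiotic(R):
    """Inverse of R = 2r/(1+2r): recombination frequency per meiosis."""
    return R / (2 - 2 * R)

R_hat = 2 / 10                     # two recombinant lines among the 10 RILs
print(ril_to_meiotic(R_hat))
```

Repeated selfing gives several generations in which crossovers can accumulate, so R is always at least as large as r.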

Expected genotypic
frequencies in F2 populations

MLE of r in F2: dominant markers
Logarithm of the likelihood, with $k = (1-r)^2$
$$\ln L = C + n_1\ln(3-2r+r^2) + (n_3+n_7)\ln(2r-r^2) + n_9\ln(1-2r+r^2)$$
$$\ln L = C + n_1\ln(2+k) + (n_3+n_7)\ln(1-k) + n_9\ln k$$
MLE of r
$$\hat{k} = (1-\hat{r})^2 = \frac{-(2n-3n_1-n_9) + \sqrt{(2n-3n_1-n_9)^2 + 8\,n\,n_9}}{2n}$$
Variance of the estimated r
$$V_{\hat{r}} = \frac{(1-k)(2+k)}{2n(1+2k)} = \frac{(2r-r^2)(3-2r+r^2)}{2n(3-4r+2r^2)}$$
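
A numerical sketch of the dominant-marker estimator. It assumes coupling phase and that n1, n3, n7 and n9 count the phenotype classes A_B_, A_bb, aaB_ and aabb, with expected frequencies (2+k)/4, (1-k)/4, (1-k)/4 and k/4; the estimate of k is the positive root of n k^2 + (2n - 3n1 - n9) k - 2 n9 = 0. The counts in the usage line are hypothetical.

```python
from math import sqrt

def f2_dominant_mle(n1, n3, n7, n9):
    """MLE of r and its variance for two dominant markers in an F2 (coupling)."""
    n = n1 + n3 + n7 + n9
    b = 2 * n - 3 * n1 - n9
    k = (-b + sqrt(b * b + 8 * n * n9)) / (2 * n)   # k = (1 - r)^2
    r = 1 - sqrt(k)
    var = (1 - k) * (2 + k) / (2 * n * (1 + 2 * k))
    return r, var

# Hypothetical counts equal to the expectations for r = 0.2 and n = 100
print(f2_dominant_mle(66, 9, 9, 16))
```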

MLE of r in F2: co-dominant markers
(Newton-Raphson algorithm)
Log-likelihood function
$$\ln L = \ln C + (2n_1+2n_9+n_2+n_4+n_6+n_8)\ln(1-r) + (n_2+n_4+n_6+n_8+2n_3+2n_7)\ln r + n_5\ln(1-2r+2r^2)$$

The first-order derivative of LogL
$$f'(r) = \frac{d\ln L}{dr} = -\frac{2n_1+2n_9+n_2+n_4+n_6+n_8}{1-r} + \frac{n_2+n_4+n_6+n_8+2n_3+2n_7}{r} - \frac{n_5(2-4r)}{1-2r+2r^2}$$

The second-order derivative of LogL
$$f''(r) = \frac{d^2\ln L}{dr^2} = -\frac{2n_1+2n_9+n_2+n_4+n_6+n_8}{(r-1)^2} - \frac{n_2+n_4+n_6+n_8+2n_3+2n_7}{r^2} + \frac{2n_5(4r-4r^2)}{(1-2r+2r^2)^2}$$

The iteration algorithm:
$$r_{i+1} = r_i - \frac{f'(r_i)}{f''(r_i)}$$
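
The Newton-Raphson iteration can be sketched as follows. The class order n1..n9 is assumed to be AABB, AABb, AAbb, AaBB, AaBb, Aabb, aaBB, aaBb, aabb; the starting value, the clamping to (0, 0.5] and the example counts are choices made for this illustration only.

```python
def f2_codominant_newton(counts, r0=0.25, tol=1e-10, max_iter=50):
    """Newton-Raphson MLE of r from the nine co-dominant F2 genotype counts."""
    n1, n2, n3, n4, n5, n6, n7, n8, n9 = counts
    a = 2 * n1 + 2 * n9 + n2 + n4 + n6 + n8      # exponent of (1 - r)
    b = n2 + n4 + n6 + n8 + 2 * n3 + 2 * n7      # exponent of r
    r = r0
    for _ in range(max_iter):
        q = 1 - 2 * r + 2 * r * r
        d1 = -a / (1 - r) + b / r - n5 * (2 - 4 * r) / q
        d2 = -a / (1 - r) ** 2 - b / r ** 2 + 2 * n5 * (4 * r - 4 * r * r) / q ** 2
        step = d1 / d2
        r = min(max(r - step, 1e-6), 0.5)        # keep r inside its valid range
        if abs(step) < tol:
            break
    return r

# Hypothetical counts equal to the expectations for r = 0.2 and n = 100
counts = [16, 8, 1, 8, 34, 8, 1, 8, 16]
print(f2_codominant_newton(counts))
```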

MLE of r in F2: co-dominant
markers (EM algorithm)
EM stands for expectation and maximization

E-step: for an initial r0, calculate the probability of
crossover in each marker type
M-step: update r, and repeat from the E-step
$$r' = \frac{1}{n}\sum_{k} n_k\,P_k(R \mid G)$$

Expected probability of crossover

r = [n1 × 0 + n2 × 0.5 + n3 × 1 + … + n8 × 0.5 + n9 × 0]/n

Estimated r after 3 EM iterations (r0 = 0.5)
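
A minimal EM sketch for the same nine-class data layout (class order AABB, AABb, AAbb, AaBB, AaBb, Aabb, aaBB, aaBb, aabb, which is an assumption of this example). The expected proportion of recombinant gametes is 0, 0.5 or 1 for eight of the classes; only the double heterozygote AaBb depends on the current r, through r^2 / (r^2 + (1-r)^2). The counts are hypothetical.

```python
def f2_codominant_em(counts, r0=0.5, tol=1e-12, max_iter=1000):
    """EM estimate of r from the nine co-dominant F2 genotype counts."""
    n1, n2, n3, n4, n5, n6, n7, n8, n9 = counts
    n = sum(counts)
    r = r0
    for _ in range(max_iter):
        # E-step: probability that an AaBb individual carries two recombinant gametes
        p5 = r * r / (r * r + (1 - r) ** 2)
        # M-step: expected proportion of recombinant gametes over all individuals
        r_new = (0.5 * (n2 + n4 + n6 + n8) + (n3 + n7) + n5 * p5) / n
        if abs(r_new - r) < tol:
            return r_new
        r = r_new
    return r

counts = [16, 8, 1, 8, 34, 8, 1, 8, 16]   # expectations for r = 0.2, n = 100
print(f2_codominant_em(counts))
```

EM needs no derivatives and never leaves the valid range of r, at the cost of slower (linear) convergence than Newton-Raphson.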

Co-dominant markers in other
populations

R=2r/(1+2r)

More populations (e.g. BC1F2, F3 etc):
Generation transition matrix of

Distortion has little effect on
linkage analysis!

DH pop  Theo. freq.    Distortion   Freq. in distortion
AABB    f1 = (1-r)/2   (1-r)/2      (1-r)/(1+s)
AAbb    f2 = r/2       r/2          r/(1+s)
aaBB    f3 = r/2       s r/2        r s/(1+s)
aabb    f4 = (1-r)/2   s (1-r)/2    (1-r) s/(1+s)
Sum     1              (1+s)/2      1

$$\hat{r} = \frac{r}{1+s} + \frac{rs}{1+s} = \frac{r(1+s)}{1+s} = r$$

Three-point analysis and linkage
map construction


Linkage analysis of three markers
$$r_{13} = r_{12} + r_{23} - 2(1-\delta)\,r_{12}r_{23}$$
When $\delta = 0$ (no interference),
$$(1-r_{13}) = (1-r_{12})(1-r_{23}) + r_{12}r_{23}$$
$$r_{13} = r_{12}(1-r_{23}) + (1-r_{12})r_{23} = r_{12} + r_{23} - 2r_{12}r_{23}$$
When $\delta = 1$ (complete interference),
$$r_{13} = r_{12} + r_{23}$$
The order of the three loci can be determined after
linkage analysis (3!/2=3 potential orders):
1-2-3, or 1-3-2, or 2-1-3

Mapping distance and
recombination frequency
Mapping distance is additive: $m_{13} = m_{12} + m_{23}$
Unit of mapping distance
M (Morgan) or cM (centi-Morgan), 1M=100cM
The function of mapping distance on
recombination frequency (Mapping
function):
$$m = f(r)$$

Common mapping functions
Morgan function (complete interference)
In M: m = r (M)
In cM: m = r × 100 (cM)

Haldane function (no interference)
In M: $m = f(r) = -\tfrac{1}{2}\ln(1-2r)$, $r = \tfrac{1}{2}(1-e^{-2m})$
In cM: $m = f(r) = -50\ln(1-2r)$, $r = \tfrac{1}{2}(1-e^{-m/50})$

Kosambi function (interference depends on length of interval)
In M: $m = \tfrac{1}{4}\ln\dfrac{1+2r}{1-2r}$, $r = \tfrac{1}{2}\,\dfrac{e^{4m}-1}{e^{4m}+1}$
In cM: $m = 25\ln\dfrac{1+2r}{1-2r}$, $r = \tfrac{1}{2}\,\dfrac{e^{m/25}-1}{e^{m/25}+1}$
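
The mapping functions and their inverses translate directly into code. A short sketch in cM units (function names are chosen for this example):

```python
from math import exp, log

def morgan_cm(r):
    """Morgan function: complete interference."""
    return 100.0 * r

def haldane_cm(r):
    """Haldane function: no interference."""
    return -50.0 * log(1 - 2 * r)

def haldane_r(m):
    return 0.5 * (1 - exp(-m / 50.0))

def kosambi_cm(r):
    """Kosambi function: interference decreases as the interval gets longer."""
    return 25.0 * log((1 + 2 * r) / (1 - 2 * r))

def kosambi_r(m):
    e = exp(m / 25.0)
    return 0.5 * (e - 1) / (e + 1)

r = 0.2
print(morgan_cm(r), haldane_cm(r), kosambi_cm(r))
```

For the same r below 0.5, the Morgan distance is the shortest, the Haldane distance the longest, and the Kosambi distance lies in between.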

Comparison of the three functions
[Figure: mapping distance (cM) plotted against recombination frequency for the three functions]

Three steps in linkage map construction
Step 1: Grouping. Grouping can be based on
(i) a threshold of LOD score
(ii) a threshold of marker distance (cM)
(iii) anchor information
Step 2: Ordering. Three ordering algorithms are
(i) SER: SERiation (Buetow and Chakravarti, 1987. Am J Hum
Genet 41:180-188)
(ii) RECORD: REcombination Counting and ORDering (Van Os
et al., 2005. Theor Appl Genet 112:30-40)
(iii) nnTwoOpt: nearest neighbor is used for tour construction,
and two-opt for tour improvement, as in the Travelling
Salesman Problem (TSP) (Lin and Kernighan, 1973. Oper. Res.
21:498-516)

Three steps in linkage map construction
Because the number of markers (n) is large, it is impossible
to compare all possible orders (for n = 50 the number of possible
orders is n!/2 = 1.52 × 10^64). Orders from the above
algorithms are therefore local optima.
Step 3: Rippling. Five rippling criteria are
(i) SARF (Sum of Adjacent Recombination Frequencies)
(ii) SAD (Sum of Adjacent Distances)
(iii) SALOD (Sum of Adjacent LOD scores)
(iv) COUNT (number of recombination events)
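
The nearest-neighbor plus two-opt idea from Step 2 can be sketched as below. This is an illustration and not the QTL IciMapping code: using SARF as the tour length, trying every marker as a starting point, and the example positions (six of the Chr. 5 map positions, deliberately shuffled) are all assumptions of this sketch.

```python
from math import exp

def sarf(order, r):
    """Sum of adjacent recombination frequencies along an order."""
    return sum(r[a][b] for a, b in zip(order, order[1:]))

def nearest_neighbor(start, r):
    """Build an order by always moving to the closest unplaced marker."""
    order, left = [start], set(range(len(r))) - {start}
    while left:
        nxt = min(left, key=lambda j: (r[order[-1]][j], j))
        order.append(nxt)
        left.remove(nxt)
    return order

def two_opt(order, r):
    """Reverse segments while doing so shortens the order."""
    best, improved = order[:], True
    while improved:
        improved = False
        for i in range(len(best) - 1):
            for j in range(i + 1, len(best)):
                cand = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if sarf(cand, r) < sarf(best, r) - 1e-12:
                    best, improved = cand, True
    return best

def order_markers(r):
    starts = [nearest_neighbor(s, r) for s in range(len(r))]
    return two_opt(min(starts, key=lambda o: sarf(o, r)), r)

positions_cm = [32.0, 0.0, 19.5, 66.6, 3.5, 8.5]          # shuffled map positions
r = [[0.5 * (1 - exp(-abs(p - q) / 50.0)) for q in positions_cm]
     for p in positions_cm]                                 # Haldane r from distance
print(order_markers(r))
```

An order and its reverse are equivalent, which is why only n!/2 orders are distinct.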

The MAP functionality in QTL
IciMapping


Interface of the MAP functionality

A. Map of one chromosome B. Map of all chromosomes

Map outputs:
Linkage map for each
chromosome (A) or all
chromosomes (B)

An example map of seven
chromosomes or groups


Linkage map and physical map
Species      Size of haploid genome (kb)  Size of linkage map (cM)  kb/cM
Yeast        2.2 × 10^4                   3700                      6
Neurospora   4.2 × 10^4                   500                       80
Arabidopsis  7.0 × 10^4                   500                       140
Drosophila   2.0 × 10^5                   290                       700
Tomato       7.2 × 10^5                   1400                      510
Human        3.0 × 10^6                   2710                      1110
Wheat        1.6 × 10^7                   2575                      6214
Rice         4.4 × 10^5                   1575                      279
Corn         3.0 × 10^6                   1400                      2140

What is QTL Mapping?
The procedure to map individual genetic factors
with small effects on the quantitative traits, to
specific chromosomal segments in the genome
The key questions in QTL mapping studies are:
How many QTL are there?
Where are they in the marker map?
How large an influence does each of them
have on the trait of interest?


Bi-parental mapping populations (linkage
mapping)
Temporary population: F2 and BC
Permanent population: RIL, DH, CSSL
Secondary population
Association mapping
Natural populations: human and animals

Single marker analysis (Sax 1923; Soller et al. 1976)
The single marker analysis identifies QTLs based on the difference
between the mean phenotypes for different marker groups, but cannot
separate the estimates of recombination fraction and QTL effect.
Interval mapping (IM) (Lander and Botstein 1989)
IM is based on maximum likelihood parameter estimation and provides
a likelihood ratio test for QTL position and effect. The major
disadvantage of IM is that the estimates of locations and effects of QTLs
may be biased when QTLs are linked.
Regression interval mapping (RIM)
(Haley and Knott 1992; Martinez and Curnow 1992 )
RIM was proposed to approximate maximum likelihood interval mapping
to save computation time at one or multiple genomic positions.

Composite interval mapping (CIM) (Zeng 1994)
CIM combines IM with multiple marker regression analysis,
which controls for the effects of QTLs in other intervals or
on other chromosomes when a given position is tested, and thus
increases the precision of QTL detection.

Multiple interval mapping (MIM) (Kao et al. 1999)
MIM is a state-of-the-art gene mapping procedure. But
implementation of the multiple-QTL model is difficult, since the
number of QTL defines the dimension of the model which is
also an unknown parameter of interest.

Bayesian model (Sillanpää and Corander 2002)
In any Bayesian model, a prior distribution has to be
considered. Based on the prior, Bayesian statistics derives the
posterior, and then conducts inference based on the posterior
distribution. However, Bayesian models have not been widely
used in practice, partially due to the complexity of
computation and the lack of user-friendly software.

[Figure: phenotype distributions of the marker classes mm, Mm and MM relative to a QTL, panels A and B]

Backcrosses (P1BC1 and P2BC1)
of P1: MMQQ and P2: mmqq

BC1 (P1BC1)
Genotype  Frequency   Genotypic value
MMQQ      (1/2)(1-r)  m+a
MMQq      (1/2)r      m+d
MmQQ      (1/2)r      m+a
MmQq      (1/2)(1-r)  m+d

BC2 (P2BC1)
Genotype  Frequency   Genotypic value
MmQq      (1/2)(1-r)  m+d
Mmqq      (1/2)r      m-a
mmQq      (1/2)r      m+d
mmqq      (1/2)(1-r)  m-a

Two marker types:
MM = (1-r) MMQQ + r MMQq
$$\mu_{MM} = (1-r)(m+a) + r(m+d) = m + (1-r)a + rd$$
Mm = r MmQQ + (1-r) MmQq
$$\mu_{Mm} = r(m+a) + (1-r)(m+d) = m + ra + (1-r)d$$
Difference in phenotype between the two types
$$\mu_{MM} - \mu_{Mm} = (1-2r)(a-d)$$
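
Single marker analysis tests exactly this kind of contrast between marker-class means. As an illustration only, the sketch below computes the contrast at marker XNpb387 in the 10-RIL grain-width data shown earlier in this lecture (RILs rather than a backcross, and far too few lines for a real test); the function name is hypothetical.

```python
from statistics import mean

# Genotypes at XNpb387 (0 = P1 allele, 2 = P2 allele) and grain width (mm), RIL1..RIL10
genotype = [0, 2, 2, 0, 0, 2, 0, 2, 0, 0]
width = [2.33, 1.99, 2.24, 1.94, 2.76, 2.32, 2.32, 2.08, 2.24, 2.45]

def marker_contrast(genotype, phenotype, class_a=0, class_b=2):
    """Means of the two marker classes and their difference."""
    ya = [y for g, y in zip(genotype, phenotype) if g == class_a]
    yb = [y for g, y in zip(genotype, phenotype) if g == class_b]
    return mean(ya), mean(yb), mean(ya) - mean(yb)

mean_p1, mean_p2, diff = marker_contrast(genotype, width)
print(f"P1-type mean {mean_p1:.4f}, P2-type mean {mean_p2:.4f}, difference {diff:.4f}")
```

Because the expected contrast is shrunk by a factor involving r, a small difference can mean either a small QTL effect or a QTL far from the marker, which is why single marker analysis cannot separate the two.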

Linear model (j = 1, 2, …, n)
$$y_j = b_0 + b^{*}x_j^{*} + e_j$$
$b^{*}$ represents the QTL effect; $x_j^{*}$ is the indicator
variable (0 or 1) for QTL genotype
Likelihood profile
Support interval: One-LOD interval

[Figure: interval mapping in a P1 backcross. P1 (Mi Q Mi+1 / Mi Q Mi+1) × P2 (mi q mi+1 / mi q mi+1) gives the F1 (Mi Q Mi+1 / mi q mi+1); crossing the F1 back to P1 gives four classes at the flanking markers Mi and Mi+1, labelled 1 to 4, with the QTL (Q or q) located between the two markers]

Assumption: No more than one QTL
per chromosome or linkage group

Large confidence interval
Biased effect estimation

Composite interval mapping (CIM)
(Zeng 1994)

In the CIM algorithm, the QTL effect at the
current testing position and the regression coefficients
of the marker variables used to control the genetic
background are estimated simultaneously in an
expectation and maximization (EM) algorithm.
The algorithm therefore cannot fully ensure
that the effect of the QTL in the current testing interval
is not absorbed by the background marker
variables, which may bias the
estimate of the QTL effect.

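Theoretical basis of ICIM
$$G = \mu + \sum_{j=1}^{m} a_j g_j + \sum_{j<k} aa_{jk}\,g_j g_k$$
$$E(g_j \mid X) = \lambda_j x_j + \rho_j x_{j+1}$$
$$E(g_j g_k \mid X) = \lambda_j\lambda_k x_j x_k + \lambda_j\rho_k x_j x_{k+1} + \rho_j\lambda_k x_{j+1}x_k + \rho_j\rho_k x_{j+1}x_{k+1}$$
$$y_i = b_0 + \sum_{j=1}^{m+1} b_j x_{ij} + \sum_{j<k} b_{jk}\,x_{ij}x_{ik} + e_i$$

One-dimensional scanning (interval mapping)
$$\Delta y_i = y_i - \sum_{j \ne k,\,k+1} \hat{b}_j x_{ij}$$

Two-dimensional scanning (interval mapping)
$$\Delta y_i = y_i - \sum_{r \ne j,\,j+1,\,k,\,k+1} \hat{b}_r x_{ir} - \sum_{r \ne j,\,j+1;\ s \ne k,\,k+1} \hat{b}_{rs}\,x_{ir}x_{is}$$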

[Figures: three pairs of panels showing LOD score (left) and estimated effect (right) against scanning position along the genome, chromosomes 1 to 4]

Detecting epistasis where the interacting QTL have no
significant additive effects

One-locus model in F2
One-locus model: $G = \mu + aw + dv$
where $\mu$ is the mean of the two homozygous
genotypes QQ and qq, a is the additive
effect, and d is the dominance effect. w and
v are the indicators for genotypes at the
QTL, valued at 1 and 0 for QQ, 0 and 1
for Qq, and -1 and 0 for qq, respectively

The expected genotypic value of an
individual with known marker types
$$E(G \mid x_1, x_2, y_1, y_2) = \mu + a\,E(w \mid x_1, x_2, y_1, y_2) + d\,E(v \mid x_1, x_2, y_1, y_2)$$

Probability of the three QTL
genotypes under given marker types

Left    Right   QQ (w=1, v=0)            Qq (w=0, v=1)                                     qq (w=-1, v=0)
marker  marker  (m+a)                    (m+d)                                             (m-a)
AA      BB      (1/4)(1-r1)^2 (1-r2)^2   (1/2) r1(1-r1) r2(1-r2)                           (1/4) r1^2 r2^2
AA      Bb      (1/2)(1-r1)^2 r2(1-r2)   (1/2) r1(1-r1)(1-r2)^2 + (1/2) r1(1-r1) r2^2      (1/2) r1^2 r2(1-r2)
AA      bb      (1/4)(1-r1)^2 r2^2       (1/2) r1(1-r1) r2(1-r2)                           (1/4) r1^2 (1-r2)^2

Estimation of marker class mean

Marker  n   Frequency    Indicator for marker class  E(w | x1,x2,y1,y2)  E(v | x1,x2,y1,y2)  Genetic mean
class                    x1   x2   y1   y2                                                   of the class
AABB    n1  (1/4)(1-r)^2  1    1    0    0           f1                  g1                  f1 a + g1 d
AABb    n2  (1/2)r(1-r)   1    0    0    1           f2                  g2                  f2 a + g2 d
AAbb    n3  (1/4)r^2      1   -1    0    0           f3                  g3                  f3 a + g3 d

$$f_1 = 1 - \frac{2r_1r_2}{1-r}, \qquad g_1 = \frac{2r_1(1-r_1)r_2(1-r_2)}{(1-r)^2}$$
$$f_2 = \frac{(1-2r_1)\,r_2(1-r_2)}{r-r^2}, \qquad g_2 = \frac{r_1(1-r_1)(1-2r_2+2r_2^2)}{r-r^2}$$
$$f_3 = \frac{r_2-r_1}{r}, \qquad g_3 = \frac{2r_1(1-r_1)r_2(1-r_2)}{r^2}$$
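
The conditional expectations of w and v can be checked by brute force. The sketch below enumerates the eight gametes of an F1 with haplotypes AQB/aqb and combines them pairwise into F2 genotypes. It assumes no crossover interference, so that r = r1 + r2 - 2 r1 r2; the function names and the dictionary layout are choices made for this example.

```python
from itertools import product

def gamete_freqs(r1, r2):
    """Frequencies of the 8 gametes (marker A, QTL, marker B); 1 = P1 allele."""
    freqs = {}
    for a, q, b in product((1, 0), repeat=3):
        p = 0.5
        p *= r1 if a != q else 1 - r1      # crossover between A and the QTL
        p *= r2 if q != b else 1 - r2      # crossover between the QTL and B
        freqs[(a, q, b)] = p
    return freqs

def conditional_expectations(r1, r2):
    """Map (copies of A, copies of B) -> (frequency, E(w | class), E(v | class))."""
    g = gamete_freqs(r1, r2)
    acc = {}
    for (g1, p1), (g2, p2) in product(g.items(), repeat=2):
        key = (g1[0] + g2[0], g1[2] + g2[2])
        nq = g1[1] + g2[1]                 # number of Q alleles: 2, 1 or 0
        w, v = nq - 1, (1 if nq == 1 else 0)
        f, ew, ev = acc.get(key, (0.0, 0.0, 0.0))
        acc[key] = (f + p1 * p2, ew + p1 * p2 * w, ev + p1 * p2 * v)
    return {k: (f, ew / f, ev / f) for k, (f, ew, ev) in acc.items()}

res = conditional_expectations(0.10, 0.15)
print(res[(2, 2)])   # class AABB: frequency, f1, g1
```

The key (2, 2) is the class AABB, (2, 1) is AABb and (2, 0) is AAbb, so the three rows of the table can be compared with the closed-form expressions for f and g.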

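Relationship between marker
class mean and marker effect
(including marker interactions)
Columns of the design matrix: intercept, x1, x2, y1, y2, x1x2, x1y2, y1x2, y1y2; rows: AABB, AABb, AAbb, AaBB, AaBb, Aabb, aaBB, aaBb, aabb
$$\begin{pmatrix} f_1a+g_1d \\ f_2a+g_2d \\ f_3a+g_3d \\ f_4a+g_4d \\ g_5d \\ -f_4a+g_4d \\ -f_3a+g_3d \\ -f_2a+g_2d \\ -f_1a+g_1d \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 & 0 & 0 & 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \\ 1 & 1 & -1 & 0 & 0 & -1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 & 1 & 0 & 0 & 0 & 1 \\ 1 & 0 & -1 & 1 & 0 & 0 & 0 & -1 & 0 \\ 1 & -1 & 1 & 0 & 0 & -1 & 0 & 0 & 0 \\ 1 & -1 & 0 & 0 & 1 & 0 & -1 & 0 & 0 \\ 1 & -1 & -1 & 0 & 0 & 1 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} \mu^{(d)} \\ A_1^{(a)} \\ A_2^{(a)} \\ D_1^{(d)} \\ D_2^{(d)} \\ AA_{12}^{(d)} \\ AD_{12} \\ DA_{12} \\ DD_{12}^{(d)} \end{pmatrix}$$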

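Relationship between marker
effects and QTL effects
$$\mu^{(d)} = \tfrac{1}{2}(g_1+g_3)\,d$$
$$A_1^{(a)} = f_2\,a$$
$$A_2^{(a)} = \tfrac{1}{2}(f_1-f_3)\,a$$
$$D_1^{(d)} = \left(-\tfrac{1}{2}g_1 - \tfrac{1}{2}g_3 + g_4\right)d$$
$$D_2^{(d)} = \left(-\tfrac{1}{2}g_1 + g_2 - \tfrac{1}{2}g_3\right)d$$
$$AA_{12}^{(d)} = \tfrac{1}{2}(g_1-g_3)\,d$$
$$AD_{12} = 0$$
$$DA_{12} = 0$$
$$DD_{12}^{(d)} = \left(\tfrac{1}{2}g_1 - g_2 + \tfrac{1}{2}g_3 - g_4 + g_5\right)d$$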

The linear model of genotypic
values on markers in F2
$$E(w \mid x_1, x_2, y_1, y_2) = \lambda_1 x_1 + \lambda_2 x_2$$
$$E(v \mid x_1, x_2, y_1, y_2) = \rho + \rho_1 y_1 + \rho_2 y_2 + \gamma_{12}\,x_1x_2 + \delta_{12}\,y_1y_2$$
$$E(G \mid x_1, x_2, y_1, y_2) = \mu + \mu^{(d)} + A_1^{(a)}x_1 + D_1^{(d)}y_1 + A_2^{(a)}x_2 + D_2^{(d)}y_2 + AA_{12}^{(d)}x_1x_2 + DD_{12}^{(d)}y_1y_2$$

Properties of the linear model in F2
The additive and dominance effects of the
flanked QTL are completely absorbed by the
six variables in the model above.
Interactions between marker variables may be
declared as interaction between QTL by
mistake when using ANOVA.
But from our analysis, interactions between
marker variables can be caused simply by
dominance effects of QTL .

Multiple QTL model in F2
For multiple QTL, assume there are m
QTL located on m intervals defined by
m+1 markers on one chromosome, then
the genotypic value of an F2 individual is
defined as:
$$G = \mu + \sum_{j=1}^{m}\left[a_j w_j + d_j v_j\right]$$

The linear model in F2 under
multiple QTL
The genotypic value of an F2
individual with known marker types
can be re-organized as:
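$$E(G) = \beta_0 + \sum_{j=1}^{m+1}\alpha_j x_j + \sum_{j=1}^{m+1}\beta_j y_j + \sum_{j=1}^{m}\gamma_{j,j+1}\,x_jx_{j+1} + \sum_{j=1}^{m}\delta_{j,j+1}\,y_jy_{j+1}$$
where $\alpha_j$ and $\beta_j$ are the additive and dominance coefficients of marker j, and $\gamma_{j,j+1}$ and $\delta_{j,j+1}$ are the coefficients of the interactions between adjacent markers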
j 1 j 1

The linear model for QTL
mapping in F2

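$$P = E(G) + \varepsilon = \beta_0 + \sum_{j=1}^{m+1}\alpha_j x_j + \sum_{j=1}^{m+1}\beta_j y_j + \sum_{j=1}^{m}\gamma_{j,j+1}\,x_jx_{j+1} + \sum_{j=1}^{m}\delta_{j,j+1}\,y_jy_{j+1} + \varepsilon$$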

Property of the linear model
for QTL mapping in F2

ICIM (Inclusive Composite
Interval Mapping) in F2

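$$\Delta P_i = P_i - \sum_{j \ne k,\,k+1}\left[\hat{\alpha}_j x_{ij} + \hat{\beta}_j y_{ij}\right] - \sum_{j \ne k}\left[\hat{\gamma}_{j,j+1}\,x_{ij}x_{i,j+1} + \hat{\delta}_{j,j+1}\,y_{ij}y_{i,j+1}\right]$$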

Hypothesis test of QTL
mapping in F2
The two hypotheses used to test the existence
of QTL at the scanning position are:
$H_0: \mu_1 = \mu_2 = \mu_3$ vs.
$H_A$: at least two of $\mu_1$, $\mu_2$ and $\mu_3$ are not equal
The logarithm likelihood under HA is
$$L_A = \sum_{j=1}^{9}\sum_{i \in S_j}\log\left[\sum_{k=1}^{3}\pi_{jk}\,f(\Delta P_i;\ \mu_k, \sigma^2)\right]$$
where $S_j$ denotes the individuals belonging to the j-th marker class (j = 1, …, 9),
$\pi_{jk}$ (k = 1, 2, 3) is the proportion of the k-th QTL genotype in
the j-th class, and $f(\cdot;\ \mu_k, \sigma^2)$ is the density function of the normal
distribution $N(\mu_k, \sigma^2)$.

EM algorithm of QTL mapping
in F2
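Use the EM algorithm to estimate
$\mu_1$, $\mu_2$ and $\mu_3$

The genetic effects in $G = \mu + aw + dv$
are then estimated by
$$\hat{\mu} = \tfrac{1}{2}(\hat{\mu}_1 + \hat{\mu}_3), \qquad \hat{a} = \tfrac{1}{2}(\hat{\mu}_1 - \hat{\mu}_3), \qquad \hat{d} = \hat{\mu}_2 - \hat{\mu}$$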

EM algorithm of QTL mapping in F2

Parameters under H0 are calculated as:
$$\hat{\mu}_0 = \frac{1}{n}\sum_{i=1}^{n}\Delta P_i, \qquad \hat{\sigma}_0^2 = \frac{1}{n}\sum_{i=1}^{n}(\Delta P_i - \hat{\mu}_0)^2$$

From which the maximum likelihood
under H0, and the LOD score between HA
and H0 can be calculated.

QTL distribution models in
simulation

F2 populations were simulated by
the genetics and breeding
simulation tool of QuLine.
QTL mapping using ICIM was
implemented by the software QTL
IciMapping.

Theoretical marker effects in the
genetic model used in simulation
The expected additive, dominance,
additive by additive, and dominance by
dominance effects of the two flanking
markers associated with each QTL are
shown in the following table.
The table indicates that the dominance of a QTL
can complicate the coefficients of the
two markers flanking the QTL, and cause
interactions between the markers.

The expected marker effects in
simulation

QTL    μ(d)    A1(a)  A2(a)  D1(d)   D2(d)   AA12(d)  DD12(d)  Interaction variation (%)
QTL1   0.000   0.498  0.498  0.000   0.000   0.000    0.000    0.0
QTL2   0.253   0.000  0.000  0.248   0.248   -0.248   0.243    21.8
QTL3   0.253   0.498  0.498  0.248   0.248   -0.248   0.243    5.7
QTL4   -0.253  0.498  0.498  -0.248  -0.248  0.248    -0.243   5.7
QTL5   0.379   0.498  0.499  0.371   0.371   -0.371   0.364    9.6
QTL6   -0.379  0.498  0.498  -0.371  -0.371  0.371    -0.364   9.6

QTL mapping in simulated F2
populations

QTL distribution model I
QTL   LOD score  PVE (%)  True position (cM)  Est. position (cM)  True add. effect  Est. add. effect  True dom. effect  Est. dom. effect
QTL1  16.52      6.67     25                  28                  1                 0.88              0                 -0.11
QTL2  7.67       3.27     55                  53                  0                 0.03              1                 0.85
QTL3  25.11      11.28    25                  24                  1                 0.86              1                 1.08
QTL4  35.46      16.43    55                  57                  1                 0.74              -1                -1.58
QTL5  37.12      16.74    25                  26                  1                 1.05              1.5               1.38
QTL6  28.44      13.16    55                  55                  1                 0.84              -1.5              -1.22

180 individuals
The cross was made in Chengdu, China,
in July 2002 between the indica rice
variety and Nipponbare.
137 SSR markers.
The whole genome was of 2046.2 cM, and
the average marker distance was 17.1 cM.
A number of agronomic traits were
investigated in the field.

QTL mapping in the actual F2
population

QTL distribution by absolute degree of dominance (|d/a|)
Trait  R2 of additive (%)  R2 of additive and dominance (%)  <=0.25  (0.25, 0.75]  (0.75, 1.25]  >1.25  Total
PH     25.84               51.56                             2       1             1             5      9
HD     16.12               41.37                             1       1             1             3      6
PL     25.58               61.26                             5       3             1             8      17
FL     20.86               40.00                             0       2             0             3      5
SPK    25.64               27.09                             1       1             1             1      4
TKW    20.11               20.11                             2       0             2             1      5
DP     19.45               24.87                             1       1             0             1      3
GL     30.69               41.96                             1       1             0             0      2
GW     26.63               26.63                             2       2             0             0      4
RLW    37.63               45.70                             1       3             1             1      6
Total                                                        16      15            7             23     61

PVE distribution
[Figure: histogram of frequency across traits (0 to 20) against phenotypic variation explained (%)]

Trait              QTL     Chr  Distance to left marker  Add    Dom    LOD    PVE(%)
Plant height (Ph)  QPh1-1  1    12                       -0.57  -7.98  8.04   12.03
                   QPh1-2  1    19.5                     -8.59  0.59   15.54  25.57
                   QPh3-1  3    16.9                     4.35   -4.86  6.51   13.30
                   QPh3-2  3    11.4                     -4.69  -1.00  5.04   6.84
                   QPh4    4    13.7                     -3.56  -2.09  4.61   5.53
                   QPh5    5    13                       -0.44  -4.48  3.13   3.86
                   QPh6    6    6.2                      -0.79  -5.05  3.17   4.96
                   QPh7    7    7                        0.26   6.48   5.27   7.56
                   QPh12   12   2.4                      -1.66  3.93   3.98   5.44
Heading date (Hd)  QHd1    1    22.1                     1.74   -0.30  3.65   7.27
                   QHd3    3    19.9                     0.88   -3.70  6.04   21.09
                   QHd4    4    0.2                      -0.77  1.85   3.58   5.24
                   QHd8    8    5.7                      -1.41  -1.46  4.79   8.20
                   QHd10   10   0.3                      -1.78  -0.80  4.85   7.21
                   QHd11   11   6.2                      0.15   -3.03  5.71   11.70

Conclusions

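$$P = E(G) + \varepsilon = \beta_0 + \sum_{j=1}^{m+1}\alpha_j x_j + \sum_{j=1}^{m+1}\beta_j y_j + \sum_{j=1}^{m}\gamma_{j,j+1}\,x_jx_{j+1} + \sum_{j=1}^{m}\delta_{j,j+1}\,y_jy_{j+1} + \varepsilon$$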

Six methods in BIP
SMA: single marker analysis (Soller et al., 1976. Theor.
Appl. Genet. 47: 35-39)
IM-ADD: the conventional simple interval mapping
(Lander and Botstein, 1989. Genetics 121: 185-199)
ICIM-ADD: inclusive composite interval mapping of
additive (and dominant) QTL (Li et al., 2007. Genetics
175: 361-374. Zhang et al., 2008. Genetics 180: 1177-
1190)
IM-EPI: interval mapping of digenic epistatic QTL
ICIM-EPI: inclusive composite interval mapping of
digenic epistatic QTL (Li et al., 2008. Theor. Appl.
Genet. 116: 243-260)
SGM: selective genotyping mapping (Lebowitz et al.,
1987. Theor. Appl. Genet. 73: 556-562)

Interface of the BIP functionality

LOD profile of ICIM additive mapping
(ICIM-ADD)

Figures of interacting QTL from ICIM
epistatic mapping (ICIM-EPI)
