This document compares the accuracy of genomic selection prediction methods (BLUP and Bayesian) under different scenarios of marker density and number of quantitative trait loci (QTL). It simulated genomes with varying numbers of markers (100, 200, 500) and QTLs (4, 10, 20, 40) and different heritability levels (5%, 10%, 25%). The results showed that the Bayesian method had higher accuracy than BLUP in all scenarios. Accuracy generally increased with more markers and decreased slightly with more QTLs.
1. 73 Honarvar and Nooralvandi
Int. J. Biosci. 2014
RESEARCH PAPER OPEN ACCESS
Considering the accuracy of genomics selection by means of
Bayesian and BLUP methods
Mahmood Honarvar1*, Tohid Nooralvandi2
1 Department of Animal Science, Shahr-e-Qods Branch, Islamic Azad University, Tehran, Iran
2 Department of Agriculture, Shahr-e-Qods Branch, Islamic Azad University, Tehran, Iran
Key words: Accuracy, Bayesian, BLUP, genomic.
http://dx.doi.org/10.12692/ijb/5.3.73-81 Article published on August 02, 2014
Abstract
We compared the accuracy of two genomic selection prediction methods as affected by marker density and the number of quantitative trait loci (QTL). Genomic estimated breeding values (GEBV) were derived by best linear unbiased prediction (BLUP) and by a Bayesian method (Least Absolute Shrinkage and Selection Operator, LASSO). The simulated genome comprised one chromosome of 100 cM. Scenarios with 100, 200 and 500 markers, with 4, 10, 20 and 40 QTLs, and with heritabilities of 5, 10 and 25 percent were compared. In all scenarios the Bayesian method was more accurate than BLUP. Accuracy decreased slightly as the number of QTLs increased, and this reduction was greater at lower heritability.
* Corresponding Author: Mahmood Honarvar Honarvar.mahmood@gmail.com
International Journal of Biosciences | IJB |
ISSN: 2220-6655 (Print) 2222-5234 (Online)
http://www.innspub.net
Vol. 5, No. 3, p. 73-81, 2014
Introduction
Genetic evaluation and the estimation of breeding values are central to most animal breeding programs. The main aim of such programs is to improve the genetic merit of the population, and a key step in each program is to identify the animals with superior genetic merit so that they can be selected as parents of the next generation.
Selection for economically important quantitative traits has traditionally been based on the phenotypic records of an individual and its relatives. Breeding values (BV) are estimated from these phenotypic data, mostly by Best Linear Unbiased Prediction (BLUP), a method first introduced by Henderson (1984) (Meuwissen et al., 2001).
This method was gradually extended into different approaches to the genetic evaluation of animals: the properties of BLUP were established first, and univariate and multivariate animal models were developed afterwards. Random regression models were also proposed for analyzing recorded data.
With improvements in computational methods and computing power, most national genetic evaluation systems for the various species of domestic animals came to be based on animal models and random regression models with BLUP properties (Mrode, 2005).
Henderson set a landmark in animal breeding by introducing the mixed model equations, but selection on this basis is costly and time consuming.
Traditional methods of genetic evaluation depend on phenotypic and pedigree information. For instance, in most livestock species, such as dairy cattle, the breeding values of sires are estimated by progeny testing, based on the performance of the bulls' daughters; collecting the phenotypic data therefore takes time. This lengthens the generation interval on the one hand and reduces genetic gain on the other. In addition, the cost of proving bulls increases (Schaeffer, 2006). Moreover, many statistical and estimation models are based on the infinitesimal model. Its basic assumption is that the genetic variance of a quantitative trait is generated by an infinite number of discrete loci, each with a small effect. However, recent studies have shown that the total number of genes in a species is limited, between 20000 and 40000; the number of genes affecting a given trait is therefore smaller than this (Ewing and Green, 2000).
Major genes that explain part of the genetic variance of a quantitative trait have been reported in different animal species (Le Rey et al., 1990). On the other hand, the number of chromosomes in each species is fixed and limited, so most of the genes affecting a trait may be located on one or a few chromosomes. As a result, the assumption of the infinitesimal model, that there are many genes of small effect that recombine freely, is far from reality (Goddard and Hayes, 2001).
The development of molecular genetics made it possible to use DNA-level data to estimate breeding values more accurately and to speed up genetic improvement (Georges et al., 1995). One reason for using molecular genetics in animal and plant research is the belief that genetic improvement based on DNA data is faster than with traditional methods. Around 1990, breeding programs began to shift from quantitative genetics toward molecular genetics. This shift took place in two stages: first, the identification of markers linked to QTL, and second, the application of such markers in marker-assisted selection (MAS). This approach made it possible to determine genotypes without phenotypic records (Misztal, 2006).
When phenotypic records are used together with marker data to select animals in a breeding program, the selection is called marker-assisted selection (MAS) (Goddard, 2006). Even if the genes involved have not been identified, QTL information can improve selection and provide technical and economic opportunities for using MAS in the dairy cattle industry. MAS is particularly useful when recording a trait is difficult and costly (Boichard et al., 2006). The application of MAS has problems, however. For example, the markers linked to the identified QTL of a trait cannot explain all of the trait's genetic variance. The polygenic component must therefore always be included when evaluating an animal's genetic merit, and phenotypic information must always be collected to estimate this component (Grossman and Fernando, 1989).
Various studies have been carried out on MAS, but its application has met with limitations. Recent advances in the discovery of single nucleotide polymorphisms (SNP) and in high-throughput genotyping technology have made it possible to use high-density SNP markers to predict breeding values, and this approach led to genomic selection (GS) (Meuwissen, 2009).
The gains from genomic selection depend on the ability to predict GEBVs with high accuracy for several generations without collecting additional phenotypes. With advances in molecular genetics, the use of dense markers across the animal genome has become feasible and cost effective. If phenotypic information and dense marker data from several generations are combined and analyzed, the breeding value of an animal can then be estimated from dense marker data alone, without phenotypic information. Such a model can estimate breeding values with high accuracy (more than 85%) by estimating the values of genome segments and tracking them with the dense markers (Saatchi et al., 2009).
Moreover, breeding values estimated from whole-genome data (GEBV) can be obtained as early as birth, and the accuracy of breeding values estimated by this method is not correlated with the heritability of the trait (Kolbehdari et al., 2005).
Importantly, in dairy bulls, reaching such a level of accuracy takes 6 years and is very costly. The accuracy of breeding values for cattle gradually approaches 80%. Genomic selection can therefore speed up genetic improvement by shortening the generation interval and increasing the accuracy of genetic evaluation.
Using molecular marker information together with phenotypic records to estimate the BV of each small genome segment initially requires complex computation and relatively high costs. Nevertheless, the method is cost effective for estimating the BV of young bulls (Schaeffer, 2006). Genomic selection is being applied in at least 4 breeding programs worldwide. However, significant problems remain in applying the technology, such as integrating genomic data into national evaluation programs, genomic selection across breeds, the management of long-term genetic improvement, consistency control and computational problems, all of which can be subjects for further research.
Materials and methods
The required population was generated by stochastic simulation in Microsoft Visual Basic 2010. Genomes with a length of 100 cM (1 Morgan) were simulated. The genome characteristics, the marker density and the number of QTL differed among the scenarios considered. Using computer simulation of genomes, the present study examined the feasibility of evaluating and selecting animals on the basis of dense marker data and estimated breeding values.
To this end, a base population with an effective size of 1000 animals was simulated first. These animals were then mated at random for 1000 non-overlapping generations so that the population reached a mutation-drift balance. Mutations at the markers occurred at random with a rate of 2.5 × 10⁻³, and the mutation rate of alleles affecting the quantitative trait was 2.5 × 10⁻⁵. From generation 1001 onward the population was expanded, and for the next 7 generations its structure resembled that of a dairy cattle population. Generation 1001 was taken as the reference population. Because a granddaughter design was used in this study, 10 sires with 100 daughters each were simulated.
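
The random-mating and mutation steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' Visual Basic program: the function names, the biallelic 0/1 allele coding, the single crossover per meiosis, the absence of separate sexes and the small default sizes are all assumptions made to keep the example short. Only the marker mutation rate of 2.5 × 10⁻³ is taken from the text.

```python
import random


def mutate(haplotype, rate, rng):
    """Flip each biallelic allele (0/1) independently with probability `rate`."""
    return [1 - allele if rng.random() < rate else allele for allele in haplotype]


def make_gamete(parent, rate, rng):
    """Build one gamete from a diploid parent (a pair of haplotypes).

    Assumption: exactly one crossover at a uniformly chosen position,
    a simplification for a chromosome of about 1 Morgan.
    """
    first, second = parent
    if rng.random() < 0.5:  # pick the starting strand at random
        first, second = second, first
    cut = rng.randrange(len(first) + 1)
    return mutate(first[:cut] + second[cut:], rate, rng)


def next_generation(population, rate, rng):
    """Random mating with non-overlapping generations and constant size."""
    offspring = []
    for _ in range(len(population)):
        sire, dam = rng.sample(population, 2)  # two distinct random parents
        offspring.append((make_gamete(sire, rate, rng), make_gamete(dam, rate, rng)))
    return offspring


def simulate(n_animals=50, n_markers=100, n_generations=20, rate=2.5e-3, seed=1):
    """Start from a monomorphic population and let mutation create variation."""
    rng = random.Random(seed)
    population = [([0] * n_markers, [0] * n_markers) for _ in range(n_animals)]
    for _ in range(n_generations):
        population = next_generation(population, rate, rng)
    return population


population = simulate()
```

With the sizes given in the text (1000 animals, 1000 generations) the same loop applies, only with a correspondingly longer run time.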
In this generation, BVs were estimated from records and marker data by BLUP and by the Bayesian method separately, and the effect of each marker was estimated. The observations were daughter yield deviations (DYD) for each of the 100 daughters. The DYD was computed with the following equation:
$$\mathrm{DYD} = 0.5\,\mathrm{BV}_{\mathrm{sire}} + \dots$$
where BV_sire is the sire's breeding value, BV_dam is the dam's breeding value, MS is the Mendelian sampling effect, and E is the residual effect, which depends on the number of progeny of each daughter and is normally distributed with zero mean and a variance expressed in terms of the total additive genetic variance, the heritability (h²) and the total phenotypic variance.
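
For orientation, the three quantities named above are linked by the standard definition of heritability. The worked relation below is only that textbook identity, not the variance expression of the DYD equation itself; the symbols $\sigma_a^2$, $\sigma_p^2$ and $\sigma_e^2$ (additive genetic, phenotypic and residual variance) and the scaling $\sigma_p^2 = 1$ are assumptions introduced here, as is the simplification that the phenotypic variance has only an additive and a residual component.

```latex
h^2 = \frac{\sigma_a^2}{\sigma_p^2}, \qquad
\sigma_p^2 = \sigma_a^2 + \sigma_e^2
\;\;\Longrightarrow\;\;
\sigma_e^2 = (1 - h^2)\,\sigma_p^2 = \frac{1 - h^2}{h^2}\,\sigma_a^2 .

% With \sigma_p^2 = 1 (assumed scaling):
% h^2 = 0.05: \sigma_a^2 = 0.05, \sigma_e^2 = 0.95
% h^2 = 0.10: \sigma_a^2 = 0.10, \sigma_e^2 = 0.90
% h^2 = 0.25: \sigma_a^2 = 0.25, \sigma_e^2 = 0.75
```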
The estimated marker effects were used to compute the genomic breeding values of the following generations, called the target (goal) generations. The accuracy (the correlation between genomic estimated breeding values and true breeding values) was calculated and reported for each generation separately. In addition, the correlation between the genomic breeding values estimated by BLUP and by the Bayesian method was calculated for each generation separately.
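
The accuracy defined above can be illustrated with a minimal Python sketch. It assumes that a GEBV is the sum of an animal's marker allele counts weighted by the estimated marker effects and that accuracy is the Pearson correlation with the true breeding values; the function names and the toy numbers are hypothetical and do not come from the study.

```python
from math import sqrt


def genomic_breeding_values(genotypes, marker_effects):
    """GEBV of each animal: sum over markers of allele count times estimated effect."""
    return [sum(g * e for g, e in zip(animal, marker_effects)) for animal in genotypes]


def pearson(x, y):
    """Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical toy data: 4 animals, 3 markers coded as 0/1/2 allele copies.
genotypes = [[0, 1, 2], [2, 0, 1], [1, 1, 0], [2, 2, 2]]
estimated_effects = [0.5, -0.2, 0.1]  # e.g. from BLUP or the Bayesian method
true_bv = [0.1, 1.0, 0.2, 0.9]        # known only because the data are simulated

gebv = genomic_breeding_values(genotypes, estimated_effects)
accuracy = pearson(gebv, true_bv)
```

The same `pearson` function applied to two GEBV vectors gives the between-method correlation mentioned in the text.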
Results
Combining the numbers of markers (100, 200 and 500), the numbers of QTLs (4, 10, 20 and 40), the heritabilities (5, 10 and 25%) and the two methods (BLUP and Bayesian) gave 72 scenarios covering all possible combinations. Each scenario was replicated 20 times, and the reported genomic accuracy is the mean over replicates.
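
The count of 72 follows from the full factorial design (3 × 4 × 3 × 2). As a quick check, the combinations can be enumerated in Python; the variable names are illustrative, and counting each method as a separate evaluation of every replicate is an assumption about how the replicates were organized.

```python
from itertools import product

marker_counts = (100, 200, 500)
qtl_counts = (4, 10, 20, 40)
heritabilities = (0.05, 0.10, 0.25)
methods = ("BLUP", "Bayesian")
replicates = 20

# Every combination of markers, QTLs, heritability and method.
scenarios = list(product(marker_counts, qtl_counts, heritabilities, methods))

# Number of method-by-replicate evaluations if each scenario is replicated 20 times.
total_evaluations = len(scenarios) * replicates
```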
Table 1. Accuracy of genomic selection in the target generations for the BLUP and Bayesian methods, for different numbers of QTLs and markers at 5% heritability.
SNPs  QTLs  Method    Gen 1   Gen 2   Gen 3   Gen 4   Gen 5   Gen 6   Gen 7
100   4     BLUP      0.7899  0.7410  0.7119  0.6866  0.6741  0.6461  0.6287
100   4     Bayesian  0.8184  0.7933  0.7605  0.7434  0.7248  0.7064  0.6921
100   10    BLUP      0.8043  0.7542  0.7206  0.7058  0.6893  0.6716  0.6497
100   10    Bayesian  0.8130  0.7891  0.7646  0.7435  0.7274  0.7076  0.6913
100   20    BLUP      0.7627  0.7114  0.6740  0.6483  0.6228  0.6101  0.5958
100   20    Bayesian  0.8008  0.7736  0.7444  0.7238  0.7136  0.6927  0.6713
100   40    BLUP      0.7612  0.7095  0.6669  0.6451  0.6272  0.6092  0.5941
100   40    Bayesian  0.7835  0.7491  0.7135  0.6934  0.6721  0.6510  0.6412
200   4     BLUP      0.8279  0.7844  0.7461  0.7199  0.7066  0.6944  0.6972
200   4     Bayesian  0.8490  0.8223  0.7896  0.7655  0.7455  0.7364  0.7238
200   10    BLUP      0.8400  0.7943  0.7688  0.7493  0.7332  0.7139  0.7107
200   10    Bayesian  0.8424  0.8120  0.7893  0.7661  0.7513  0.7464  0.7449
200   20    BLUP      0.8216  0.7726  0.7394  0.7211  0.6956  0.6684  0.6645
200   20    Bayesian  0.8416  0.8159  0.7811  0.7653  0.7410  0.7266  0.7142
200   40    BLUP      0.8284  0.7733  0.7368  0.7160  0.6985  0.6740  0.6584
200   40    Bayesian  0.8455  0.8223  0.7980  0.7767  0.7601  0.7470  0.7377
500   4     BLUP      0.8389  0.7868  0.7597  0.7484  0.7264  0.7135  0.7019
500   4     Bayesian  0.8502  0.8274  0.8003  0.7801  0.7688  0.7611  0.7548
500   10    BLUP      0.8461  0.7950  0.7649  0.7475  0.7368  0.7177  0.7024
500   10    Bayesian  0.8534  0.8351  0.8128  0.7926  0.7721  0.7594  0.7518
500   20    BLUP      0.8480  0.7968  0.7795  0.7505  0.7353  0.7292  0.7142
500   20    Bayesian  0.8562  0.8289  0.8009  0.7855  0.7609  0.7483  0.7437
500   40    BLUP      0.8415  0.7862  0.7616  0.7476  0.7326  0.7162  0.7086
500   40    Bayesian  0.8589  0.8407  0.8181  0.8044  0.7900  0.7730  0.7710
Table 2. Accuracy of genomic selection in the target generations for the BLUP and Bayesian methods, for different numbers of QTLs and markers at 10% heritability.
SNPs  QTLs  Method    Gen 1   Gen 2   Gen 3   Gen 4   Gen 5   Gen 6   Gen 7
100   4     BLUP      0.8267  0.7744  0.7376  0.7227  0.6953  0.6805  0.6651
100   4     Bayesian  0.8430  0.8108  0.7788  0.7551  0.7396  0.7230  0.7006
100   10    BLUP      0.8398  0.7877  0.7538  0.7250  0.7129  0.6998  0.6789
100   10    Bayesian  0.8576  0.8246  0.7959  0.7773  0.7610  0.7453  0.7322
100   20    BLUP      0.8206  0.7637  0.7198  0.6975  0.6719  0.6524  0.6492
100   20    Bayesian  0.8427  0.8149  0.7856  0.7586  0.7498  0.7351  0.7234
100   40    BLUP      0.8091  0.7519  0.7168  0.6835  0.6700  0.6557  0.6479
100   40    Bayesian  0.8098  0.7811  0.7503  0.7246  0.7182  0.6967  0.6733
200   4     BLUP      0.8628  0.8146  0.7816  0.7649  0.7431  0.7333  0.7067
200   4     Bayesian  0.8739  0.8456  0.8171  0.7968  0.7829  0.7723  0.7618
200   10    BLUP      0.8678  0.8216  0.7911  0.7648  0.7475  0.7356  0.7211
200   10    Bayesian  0.8820  0.8610  0.8378  0.8121  0.8014  0.7974  0.7797
200   20    BLUP      0.8577  0.8091  0.7825  0.7615  0.7437  0.7359  0.7215
200   20    Bayesian  0.8856  0.8531  0.8257  0.8059  0.7933  0.7794  0.7707
200   40    BLUP      0.8477  0.8019  0.7782  0.7540  0.7403  0.7203  0.6983
200   40    Bayesian  0.8734  0.8497  0.8228  0.7999  0.7947  0.7865  0.7707
500   4     BLUP      0.8691  0.8206  0.7797  0.7582  0.7405  0.7231  0.7117
500   4     Bayesian  0.8788  0.8558  0.8326  0.8196  0.8059  0.7915  0.7795
500   10    BLUP      0.8758  0.8245  0.7994  0.7906  0.7773  0.7622  0.7440
500   10    Bayesian  0.9031  0.8823  0.8644  0.8500  0.8389  0.8263  0.8106
500   20    BLUP      0.8762  0.8363  0.8065  0.7845  0.7613  0.7532  0.7440
500   20    Bayesian  0.8919  0.8742  0.8526  0.8268  0.8115  0.7929  0.7805
500   40    BLUP      0.8637  0.8203  0.7917  0.7718  0.7623  0.7458  0.7349
500   40    Bayesian  0.8967  0.8758  0.8554  0.8367  0.8214  0.8090  0.8017
Tables 1 to 3 show the accuracy of genomic selection in the target generations (1st to 7th) for the BLUP and Bayesian methods at the different numbers of QTLs, numbers of markers and heritabilities.
Table 3. Accuracy of genomic selection in the target generations for the BLUP and Bayesian methods, for different numbers of QTLs and markers at 25% heritability.
SNPs  QTLs  Method    Gen 1   Gen 2   Gen 3   Gen 4   Gen 5   Gen 6   Gen 7
100   4     BLUP      0.8751  0.8194  0.7868  0.7596  0.7446  0.7223  0.7025
100   4     Bayesian  0.8892  0.8653  0.8453  0.8127  0.7888  0.7703  0.7548
100   10    BLUP      0.8779  0.8275  0.8005  0.7729  0.7480  0.7348  0.7241
100   10    Bayesian  0.9059  0.8808  0.8568  0.8355  0.8108  0.7944  0.7791
100   20    BLUP      0.8721  0.8084  0.7713  0.7484  0.7310  0.7111  0.6908
100   20    Bayesian  0.8956  0.8620  0.8333  0.8126  0.7926  0.7747  0.7639
100   40    BLUP      0.8590  0.8042  0.7719  0.7494  0.7343  0.7208  0.7063
100   40    Bayesian  0.8707  0.8465  0.8170  0.7930  0.7730  0.7539  0.7313
200   4     BLUP      0.9024  0.8476  0.8154  0.7956  0.7859  0.7673  0.7541
200   4     Bayesian  0.9184  0.8920  0.8718  0.8524  0.8330  0.8195  0.8060
200   10    BLUP      0.9091  0.8688  0.8408  0.8204  0.8057  0.7914  0.7826
200   10    Bayesian  0.9247  0.9012  0.8743  0.8589  0.8465  0.8315  0.8190
200   20    BLUP      0.9114  0.8737  0.8513  0.8239  0.8110  0.8019  0.7899
200   20    Bayesian  0.9261  0.9035  0.8865  0.8714  0.8563  0.8468  0.8303
200   40    BLUP      0.8994  0.8561  0.8269  0.8053  0.7913  0.7770  0.7582
200   40    Bayesian  0.9136  0.8926  0.8693  0.8458  0.8306  0.8272  0.8049
500   4     BLUP      0.9111  0.8597  0.8318  0.8165  0.7931  0.7832  0.7709
500   4     Bayesian  0.9299  0.9072  0.8893  0.8670  0.8598  0.8438  0.8383
500   10    BLUP      0.9179  0.8795  0.8522  0.8392  0.8242  0.8101  0.8074
500   10    Bayesian  0.9330  0.9121  0.8928  0.8776  0.8693  0.8540  0.8476
500   20    BLUP      0.9171  0.8787  0.8533  0.8325  0.8145  0.8028  0.7902
500   20    Bayesian  0.9376  0.9174  0.8978  0.8812  0.8737  0.8662  0.8615
500   40    BLUP      0.9205  0.8787  0.8539  0.8320  0.8180  0.8065  0.7895
500   40    Bayesian  0.9335  0.9159  0.8959  0.8824  0.8708  0.8578  0.8487
Discussion
The results in Tables 1-3 show that in all scenarios the accuracy of genomic selection was higher with the Bayesian method than with BLUP. To examine the other factors (number of markers, heritability and number of QTLs), graphs were drawn separately for each method of genomic evaluation. These graphs show the effect of each factor on the accuracy of genomic selection in the 1st generation of evaluation.
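
As an illustration of how the tabulated accuracies can be compared, the short Python sketch below takes one scenario from Table 1 (100 markers, 4 QTLs, 5% heritability) and computes the per-generation advantage of the Bayesian method and the loss of accuracy from the 1st to the 7th generation. The values are copied from the table; the variable names are illustrative, and a single scenario is of course not a summary of all 72.

```python
# Accuracies for generations 1-7, copied from Table 1 (100 SNPs, 4 QTLs, h2 = 5%).
blup = [0.7899, 0.7410, 0.7119, 0.6866, 0.6741, 0.6461, 0.6287]
bayes = [0.8184, 0.7933, 0.7605, 0.7434, 0.7248, 0.7064, 0.6921]

# Advantage of the Bayesian method in each generation.
gain = [round(b - a, 4) for a, b in zip(blup, bayes)]
bayes_always_higher = all(g > 0 for g in gain)

# Loss of accuracy between the 1st and the 7th generation for each method.
decay_blup = round(blup[0] - blup[-1], 4)
decay_bayes = round(bayes[0] - bayes[-1], 4)
```

In this scenario the Bayesian method is ahead in every generation, and its accuracy also declines less over the seven generations than that of BLUP.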
Fig. 1. The effect of number of markers and
heritability on accuracy of genomic selection by BLUP
method.
Effect of the number of markers, heritability and number of QTLs on the accuracy of genomic selection by BLUP
As the number of markers and the heritability increased, the accuracy of genomic selection increased nonlinearly. The change in accuracy with the number of markers was much larger at lower heritability than at higher heritability (Fig. 1). As the number of QTLs increased, the accuracy of genomic selection decreased slightly; for example, at a heritability of 0.25, increasing the number of QTLs from 4 to 40 reduced the accuracy from 0.883 to 0.865.
This reduction was larger at lower heritability. For instance, at a heritability of 0.05, the accuracy fell from 0.792 to 0.767 as the number of QTLs increased from 4 to 40 (Fig. 2). As the number of markers increased, the accuracy of genomic selection increased nonlinearly.
Fig. 2. The effect of heritability and number of QTLs
on accuracy of genomic selection by BLUP method.
As the number of QTLs increased, the accuracy decreased slightly. For example, with 100 markers, the accuracy decreased from 0.792 to 0.761 as the number of QTLs increased from 4 to 40. As the number of markers increased, the effect of the number of QTLs on the accuracy of genomic selection diminished (Fig. 3).
Fig. 3. The effect of number of markers and QTLs on
accuracy of genomic selection by BLUP method.
Effect of the number of markers, heritability and number of QTLs on the accuracy of genomic selection by the Bayesian method
Fig. 4. The effect of number of markers and heritability on accuracy of genomic selection by Bayesian method.
Overall, the pattern of change with the Bayesian method was similar to that with BLUP, but the accuracy of genomic selection was slightly higher with the Bayesian method. As the number of markers and the heritability increased, the accuracy of genomic selection increased nonlinearly. The change in accuracy with the number of markers was much larger at lower heritability than at higher heritability (Fig. 4). As the number of QTLs increased, the accuracy of genomic selection decreased, and this reduction was larger at lower heritability. As the number of markers increased, the accuracy increased nonlinearly and the effect of the number of QTLs on the accuracy of genomic selection decreased.
References
Boichard D, Fritz S, Rossignol MN, Guillaume F, Colleau JJ, Druet D. 2006. Implementation of Marker-Assisted Selection: Practical Lessons From Dairy Cattle. 8th World Congress on Genetics Applied to Livestock Production, August 13-18, Belo Horizonte, MG, Brasil.
Ewing B, Green P. 2000. Analysis of expressed sequence tags indicates 35000 human genes. Nature Genetics 25, 232-234.
http://dx.doi.org/10.1038/76115
Georges M, Nielsen D, Mackinnon M, Mishra A, Okimoto R. 1995. Mapping quantitative trait loci controlling milk production in dairy cattle by exploiting progeny testing. Genetics 139, 907-920.
Goddard ME, Hayes BJ. 2007. Genomic selection. J. Anim. Breed. Genet. 124, 323-330.
http://dx.doi.org/10.1017/S0016672300025179
Goddard ME. 1991. Mapping genes for quantitative traits using linkage disequilibrium. Genetics, Selection and Evolution 23, 131-134.
Goddard ME, Chamberlain AC, Hayes BJ. 2006. Can the same markers be used in multiple breeds? Proc. 8th World Congress on Genetics Applied to Livestock Production. Belo Horizonte, Brasil.
Kolbehdari D, Gerald JB, Schaeffer LR, Allen OB. 2005. Power of QTL detection by either fixed or random models in half-sib designs. Genet. Sel. Evol. 37, 601-614.
http://dx.doi.org/10.1051/gse:2005021
Le Rey P, Haveau J, Elsen JM, Sollier P. 1990. Evidence for a new major gene influencing meat quality in pigs. Genet. Res. 55, 33-40.
Meuwissen TH. 2009. Accuracy of breeding values of ‘unrelated’ individuals predicted by dense SNP genotyping. Department of Animal and Aquacultural Sciences, Norwegian University of Life Sciences, Box 1432, AS, Norway.
Meuwissen TH, Hayes BJ, Goddard ME. 2001. Prediction of total genetic value using genome-wide dense marker maps. Genetics 157, 1819-1829.
Meuwissen THE, Goddard ME. 1996. The use of marker haplotypes in animal breeding schemes. Genet. Sel. Evol. 28, 161-176.
http://dx.doi.org/10.1186/1297-9686-28-2-161
Misztal I. 2006. Challenges of application of marker assisted selection – a review. Animal Science Papers and Reports, Institute of Genetics and Animal Breeding, Jastrzebiec, Poland 24, 5-10.
Mrode R, Thompson R. 2005. Linear models for the prediction of animal breeding values. CABI.
Saatchi M, Miraei-Ashtiani SR, Nejati Javaremi A, Moradi-Shahrebabak M, Mehrabany-Yeghaneh H. 2009. The impact of information quantity and strength of relationship between training set and validation set on accuracy of genomic estimated breeding values. African Journal of Biotechnology 9(4), 438-442.
Schaeffer LR. 2006. Strategy for applying genome-wide selection in dairy cattle. Journal of Animal Breeding and Genetics 123, 218-223.
Solberg TR, Sonesson AK, Woolliams JA,
Meuwissen THE. 2008. Genomic selection using
different marker types and densities. Journal of
Animal Science 86, 2447-2454.
http://dx.doi.org/10.2527/jas.2007-0010.