Searching for traits in PGR collections
              using Focused Identification of
              Germplasm Strategy (FIGS)


                     Abdallah Bari, Kenneth Street, Michael Mackay, Eddy De Pauw,
                   Dag Endresen, Ahmed Amri, Kumarse Nazari and Ammor Yahiaoui



                                                                          CIAT
                                                             Palmira, Colombia
                                                                14 March 2012



Grain
Research &
Development
Corporation
Content
              • Background
                 – PGR - traits
                 – FIGS - traits
              • Objective
                 – Develop a priori information
                 – Develop best bet subset of accs with traits
              • Datasets
                 – Trait data
                 – Environmental data
              • Methodologies
                 – Data preparation
                 – Modeling techniques
              • Results/Discussion
                 – Sub-setting (accessions/variables)
                 – “Hot spots”
              • Conclusion
ICARDA

              ICARDA’s Worldwide
              presence
                                   International
                                   Center for
                                   Agricultural
                                   Research in the
                                   Dry
                                   Areas
                                   (ICARDA)




ICARDA

              PGR
              centers
              of origin
              and
              diversity




PGR contribution
              Traits of importance to agriculture

                  – phenological adaptation (short growth
                    duration),
                  – efficient use of water,
                  – resistance to biotic stresses (diseases
                    and insects),
                  – tolerance to abiotic stresses (such as
                    drought and salinity), and
                  – superior grain quality




                                                              plant pre-evaluation
PGR Challenges


              • 50,000 to 60,000 traits (loci)
              • 7 million accessions
              • 1,400 genebanks
                                            Seed samples




PGR Challenges

              A needle in a haystack

               PGR users want variation for
               specific traits and a hundred
               germplasm accessions to evaluate.




PGR Challenges and Concerns


                • Size of collections
                  – Addressed by Brown et al. 1999


                • Cost in evaluating accessions
                  lacking the desired trait
                  – Addressed by Gollin et al. 2000



Objective
               FIGS searches genetic resources data (germplasm collections) to detect
               trait-environment patterns or relationships, which serve as a priori
               information.

               This a priori information is then used to develop predictive models that
               identify novel genetic variation for the traits of interest and where it is
               most likely to occur.


               Quantification of trait-environment relationship → A priori information →
               Develop trait subsets → Utilization of genetic resources




Origin of FIGS approach

              Boron toxicity of wheat and barley – early FIGS examples




                                 Mediterranean Sea




              Wheat landraces from marine origin soils in the Mediterranean region provided
              all the genetic variation needed to produce boron tolerant varieties
                                                                            M.C. Mackay, 1995
FIGS approach
               “FIGS applies to plant genetic resources (stored collections)
                the same selection pressure exerted on plants by evolution.”




               PGR (Biodiversity) → Collection → sampling → core

               PGR (Biodiversity) → sampling → trait → user




FIGS approach
               FIGS has helped breeders identify
                 long sought-after plant traits such
                 as resistance to:

                  –   Net blotch (barley),
                  –   Powdery mildew,
                  –   Russian wheat aphid (RWA) and
                  –   Sunn pest.




                                  Braidotti, G. 2009. Keys to the gene bank, Biotechnology.
                                  Partners in Research for Development 16-17.
Sunn pest trait of resistance
              8 landrace accessions from
              Afghanistan and

              2 from Tajikistan identified as resistant
              at juvenile stage

              Now developing mapping populations




FIGS approach to Pm
                  16,000 wheat landraces
                       → FIGS applied
                  1,300 selected
                       → Phenotyping
                  211 accessions between R and IR
                       → Genotyping
                  7 new alleles

                  40% yielded accessions that were resistant to the isolates used
                  At least 2 have a new race specificity
                  100 years of classical genetics = 7 alleles

              Kaur K; Street K; Mackay M; Yahiaoui N; Keller B (2008). Allele mining and sequence
              diversity at the wheat powdery mildew resistance locus Pm3. 11th IWGS, 24-29 Aug.,
              Brisbane.
Locating new Pm3 alleles
              The distribution of the seven new functional alleles of Pm3
              Out of 96.2% of the total set screened

               Turkey
               Afghanistan
               Iran
               Pakistan and
               Armenia




The FIGS picture
              Genotypes x Environments x Time¹ = Genetic Variation


              Can we use the same evolutionary principles
              in reverse to identify the environments that
              ‘engender’ trait specific genetic variation?

              Environments x Traits x Time = Trait variation
              (ExT)?

              ¹ plus some selection
Examples of eco-geographic variation of traits linked to environmental influences

   Environment influence              Trait                               Species                 Reference
   Low altitudes, high winter temp.,  Cyanogenesis                        Trifolium repens        Pederson, Fairbrother et al. 1996
       low summer rain, spring
       cloudiness
   Aridity                            Seed dormancy, early                Annual legumes          Ehrman and Cocks 1996
                                      flowering, high seed to pod ratio
   Soil type                          Tolerance to boron toxicity         Bread wheat             Mackay (1990)
   Altitude, winter temp., RWA        Russian Wheat Aphid (RWA)           Bread wheat             Bohssini et al., accepted for
       distribution                   resistance                                                  publication 2008
   Temperature, aridity               Drought resistance                  Triticum dicoccoides    Peleg, Fahima et al. 2005
   Altitude                           Glume colour and beak length        Durum wheat             Bechere, Belay et al. 1996
   Climate, soil and water            Heading date, culm length,          Triticum dicoccoides    Beharav and Nevo 2004
       availability                   biomass, grain yield and its
                                      components
   Precipitation, minimum January     Glutenin diversity                  Durum wheat             Vanhintum and Elings 1991
       temperature, altitude
   Temperature, aridity               More efficient RUBISCO activity     Woody perennials        Galmes et al. 2005
   Water relations and temperature    Hordatine accumulation              Barley                  Batchu, Zimmermann et al. 2006
                                      (disease defence)

   After M. C. Mackay
FIGS system

   [Diagram: FIGS system. Elements: user defined needs, interface, PGR collections,
   database (type of material, evaluation data, collection site, other information,
   size limit: 250, 500, 750, 1500), filters, new subset]

   See www.figstraitmine.com
   After M. C. Mackay 1995
Mining natural variation
              By linking traits, environments (and associated selection pressures)
              with genebank accessions (e.g. landraces and crop relatives) we can
              ‘focus’ in on those accessions most likely to possess trait specific
              genetic variation.




              [Figure: plot of Latitude (0 to 60) against Longitude (0 to 150)]

                Environment               Trait                       FIGS subset


FIGS approach – summarized
                                    Focused Identification of
                                      Germplasm Strategy



              Environment (E)                                   Trait (T)


               Geo-referencing of                                 Evaluation
                collecting places                               (phenotyping)


                                          Accession
                                             (G)
Eco-climate data (X)
               ICARDA eco-climatic database, average:
               annual temperature (front), annual
               precipitation (middle), and winter
               precipitation (back) (De Pauw 2008)



              Climate data (X as independent variables)
              site_code1 prec01    prec02    prec03    prec04    prec05    …..   ari01      ari02      ari03      ari04      ari05
              ETH-S893          25        36        72    154.22    148.88            0.167      0.246      0.439      1.098      1.169
              ETH-S1222         29        44        92    167.46       168            0.223      0.344      0.646      1.354      1.612
              NS_339            44        67    130.43    177.96    185.74            0.351      0.552      0.949      1.457      1.751
              ETH-S1153         36        48        86    140.92    131.94             0.28       0.39      0.609      1.108      1.078
              NS_415            32     46.61     95.42     150.3       157            0.271      0.419      0.732      1.289      1.437
              NS_424         31.94        45        90    143.62       150            0.257       0.38      0.641      1.146      1.272
              ETH64:55          28     38.26        57     97.57        81            0.247      0.344       0.45      0.834      0.662
              NS_525            28        39        57     97.13     80.78            0.248      0.352      0.452      0.836      0.669
              NS_526            27        39        57     97.01     80.77            0.241      0.354      0.455      0.842       0.68
              NS_559            23        40     61.89    129.04       102            0.226      0.397      0.511      1.206      0.998
              ...
              Source: International Center for Agricultural Research in the Dry Areas (ICARDA)
Eco-climate data (X)
              Layers used in the stem rust studies:
              •   Precipitation (rainfall)
              •   Maximum temperatures
              •   Minimum temperatures
              + Derived GIS layers such as:
              •   Potential evapotranspiration (water-loss)
              •   Agro-climatic Zone (UNESCO classification)
              •   Moisture/Aridity index
                  (mean values for month and year)




Trait data set (Y)
              Trait data (Y as dependent variable)


                                                                                                                       http://www.news.cornell.edu/
              site_code1   R_state0       R_state1       R_state2       R_state3       R_state4       R_state5       R_state6       R_state7       R_state8       R_state9
              ETH-S893                0              0              0              0              0              0              0              0              1              0
              ETH-S1222               0              0              0              0              0              0              0              0              0              1
              NS_339                  0              0              0              0              0              0              1              0              1              0
              ETH-S1153               0              0              0              0              2              1              3              0              0              0
              NS_415                  0              0              0              0              0              0              1              0              0              0
              NS_424                  0              0              0              1              0              0              0              0              0              0
              ETH64:55                0              0              1              0              0              0              0              0              0              0
              NS_525                  0              0              0              0              0              0              1              0              0              0
              NS_526                  0              1              2              1              2              0              3              0              0              0
              NS_559                  2              5              1              0              0              2              0              0              0              0
              ETH64:53                0              0              1              0              0              0              0              0              0              0
              ...
                  Source: (USDA) National Genetic Resources Program (NGRP) GRIN database
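
A minimal sketch of how one row of this table could be condensed into a single response value per collecting site before modeling. It is written in plain Python for self-containment (the slides themselves name R as the modeling platform). The interpretation that each cell counts the accessions scored in that state, and the helper names `mean_state` and `dominant_state`, are assumptions for illustration and are not taken from the slides.

```python
# Rows copied from the trait table above. Assumption: each cell counts the
# accessions collected at that site that were scored in states 0..9.
trait_counts = {
    "ETH-S893":  [0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
    "ETH-S1153": [0, 0, 0, 0, 2, 1, 3, 0, 0, 0],
    "NS_526":    [0, 1, 2, 1, 2, 0, 3, 0, 0, 0],
    "NS_559":    [2, 5, 1, 0, 0, 2, 0, 0, 0, 0],
}


def mean_state(counts):
    """Count-weighted mean of the state indices (0..9)."""
    total = sum(counts)
    return sum(state * n for state, n in enumerate(counts)) / total


def dominant_state(counts):
    """State index with the highest count (the first one on ties)."""
    return max(range(len(counts)), key=counts.__getitem__)


# One (mean state, dominant state) pair per collecting site.
summary = {
    site: (round(mean_state(counts), 2), dominant_state(counts))
    for site, counts in trait_counts.items()
}

for site, (mean, mode) in summary.items():
    print(f"{site:10s} mean state {mean:.2f}  dominant state {mode}")
```

Either summary can serve as Y once it is joined to the climate table on `site_code1`; which summary (or which cut-off between resistant and susceptible) suits a given trait is a modeling decision the slides do not specify.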

Searching for stem rust trait of resistance -
              concerns
               Stem rust spreading to wheat production areas




                            http://www.news.cornell.edu/




Stem rust on wheat landraces – trait data




              Green dots indicate collecting sites for resistant wheat
              landraces and red dots collecting sites for susceptible
              landraces.

              USDA GRIN, trait data online:                              Field experiments made in
              http://www.ars-grin.gov/cgi-bin/npgs/html/desc.pl?65049    Minnesota by Don McVey
Data preparation
              Power relationship   ~   2(p) (spread)

              Climate data (X as independent variables)
              site_code      …..    ari02    …..
              ETH-S893              0.246
              ETH-S1222             0.344
              NS_339                0.552
              ETH-S1153             0.390
              NS_415                0.419
              NS_424                0.380
              ETH64:55              0.344
              NS_525                0.352
              NS_526                0.354
              NS_559                0.397

              [Figure: two histograms of the aridity or moisture index during February
              (y-axis: frequency); x-axis from 0 to 15 in the first panel and from -4 to 4
              in the second]
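
A minimal sketch of the kind of transformation this data preparation step suggests: a skewed climate variable is passed through a power transform and then standardized, so that its distribution becomes more symmetric and centered on zero. The slides do not state which transform was used; the Box-Cox style formula, the choice `p = 0` (logarithm), and the sample values below are assumptions for illustration only.

```python
import math
import statistics


def skewness(values):
    """Skewness as the mean cubed z-score (population form)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)


def power_transform(values, p):
    """Box-Cox style transform: (v**p - 1) / p, or log(v) when p == 0."""
    if p == 0:
        return [math.log(v) for v in values]
    return [(v ** p - 1) / p for v in values]


def standardize(values):
    """Center on the mean and scale to unit standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


# Hypothetical, right-skewed February aridity index values (not from the slides).
ari02 = [0.25, 0.34, 0.55, 0.39, 0.42, 0.38, 0.34, 0.35, 0.9, 1.8, 3.5, 7.2, 12.0]

transformed = standardize(power_transform(ari02, p=0))

print("skewness before:", round(skewness(ari02), 2))
print("skewness after: ", round(skewness(transformed), 2))
```

Other values of `p` can be tried in the same way; the one that brings the skewness closest to zero is a reasonable starting point.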




Platform

              R language (development of algorithms)      Geographical Information System (GIS)

              >   Data transformation ( )                  ArcGIS
              >   Model <- model(trait ~ climate)          Environmental data/layers (surfaces)
              >   Measuring accuracy metrics
              >   ….

              Modeling purpose                             Generation of environmental data

Modeling framework

                                  Trait data (Y)                Environmental data (X)



                                                     Y ~ f(X)


              First, a linear approach, irrespective of the underlying distributions describing the data

              Yi ~

              X is the set of explanatory variables or predictors (climate data), where X ∈ R^m.
              Y ∈ Y is either a categorical (label) or a numerical response (trait descriptor
              states).

              Yi ~
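
A minimal sketch of the simplest instance of Y ~ f(X): an ordinary least squares line with a single climate predictor. The data values and the helper name `fit_line` are hypothetical and only illustrate the form of the model; the slides do not give the fitted equations.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical values: one climate variable (X) and a numerical trait score (Y).
climate = [0.2, 0.3, 0.5, 0.4, 0.6]
trait = [2.6, 2.9, 3.5, 3.2, 3.8]

intercept, slope = fit_line(climate, trait)

# Predicted trait score for a new, hypothetical collecting site.
predicted = intercept + slope * 0.45
print(round(intercept, 3), round(slope, 3), round(predicted, 3))
```

With m predictors the same idea extends to a coefficient vector; a categorical Y would call for a classification model instead of a fitted line.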

Modeling framework

              •   Principal component analysis (PCA)
              •   Partial Least Square (PLS)
              •   Random Forest (RF)
              •   Support Vector Machines (SVM)
              •   Neural Networks (NN)


                  Bari A., Street K., Mackay M., Endresen D.T.F., De Pauw E. & Amri A.
                  (2011) Focused identification of germplasm strategy (FIGS) detects wheat
                  stem rust resistance linked to environmental variables.
                  Genetic Resources and Crop Evolution
                  http://www.springerlink.com/content/m7140x68v2065113/fulltext.pdf
Principal Component Analysis (PCA)

              B a matrix of coefficients.

              Prediction was initially carried out using the number of principal
              components (PCs) that account for 95% of the explained variance,
              followed by adding one component at a time until the error reached a
              minimum.
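
A minimal sketch of the starting rule described above: given the variance carried by each principal component, find the smallest number of components whose cumulative share reaches 95%. The component variances and the function name `components_for_variance` are hypothetical.

```python
def components_for_variance(variances, target=0.95):
    """Smallest number of leading components whose cumulative share of the
    total variance reaches the target fraction."""
    total = sum(variances)
    cumulative = 0.0
    for k, var in enumerate(sorted(variances, reverse=True), start=1):
        cumulative += var
        if cumulative / total >= target:
            return k
    return len(variances)


# Hypothetical component variances (eigenvalues), largest first.
eigenvalues = [5.0, 2.5, 1.5, 0.6, 0.3, 0.1]

n_start = components_for_variance(eigenvalues)
print(n_start)  # cumulative shares are 0.50, 0.75, 0.90, 0.96, so 4 components
```

From this starting point, components would be added one at a time while the prediction error keeps decreasing.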




Partial Least Square (PLS)

              PLS:
              A product of factors and their loadings (regression coefficients), in which
              both the environmental dataset and the trait dataset are decomposed simultaneously.

              Prediction was initially carried out using the number of components
              (PCs) that account for 95% of the explained variance,
              followed by adding one component at a time until the error reached a
              minimum.




Random Forest (RF)
                                          Data

                              Bootstrapping (with replacement)

                   Training (set)                                       Out-of-bag (OOB) set

              ntree 1           ntree 2           ...           ntree 1000
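
A minimal sketch of the bootstrapping step in the diagram: for each tree, rows are drawn with replacement to form the training set, and the rows never drawn form the out-of-bag (OOB) set. Only the sampling is shown, not the tree fitting; the number of rows and the function name `bootstrap_split` are hypothetical, while 1000 trees follows the slide.

```python
import random


def bootstrap_split(n_rows, rng):
    """Draw n_rows row indices with replacement; rows never drawn are out-of-bag."""
    in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
    oob = sorted(set(range(n_rows)) - set(in_bag))
    return in_bag, oob


rng = random.Random(42)   # fixed seed so the sketch is repeatable
n_rows = 500              # hypothetical number of collecting sites
n_trees = 1000            # ntree 1 ... ntree 1000, as on the slide

oob_fractions = []
for _ in range(n_trees):
    in_bag, oob = bootstrap_split(n_rows, rng)
    oob_fractions.append(len(oob) / n_rows)

mean_oob = sum(oob_fractions) / n_trees
print(f"mean out-of-bag fraction: {mean_oob:.3f}")
```

On average a bit more than a third of the rows end up out-of-bag for each tree (about 1/e), which is what allows the OOB set to act as a built-in test set.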

Support Vector Machines (SVM)
              SVM is a learning-based technique that maps input data to a
              high-dimensional space and optimally separates the mapped input
              into its respective classes.

              [Figure: points of two classes, marked v and (x), in the mapped space]

              The mapping goes from the l-dimensional space (input variable space)
              into a k-dimensional space, where k is higher than l.
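
A minimal sketch of why mapping to a higher-dimensional space helps, using an explicit map from l = 1 to k = 2 dimensions. The data, the map x -> (x, x^2), and the helper names are hypothetical; an actual SVM usually performs such a mapping implicitly through a kernel function and finds the separating boundary by optimization rather than by inspection.

```python
def feature_map(x):
    """Map a 1-D input (l = 1) to a 2-D point (k = 2): x -> (x, x^2)."""
    return (x, x * x)


def separable_by_threshold(a, b):
    """True if a single cut point puts all of a on one side and all of b on the other."""
    return max(a) < min(b) or max(b) < min(a)


# Hypothetical 1-D inputs: one class sits between the two halves of the other.
inner = [-0.8, -0.3, 0.0, 0.4, 0.9]
outer = [-3.0, -2.2, 2.1, 2.7, 3.5]

# Second coordinate of each mapped point.
inner_sq = [feature_map(x)[1] for x in inner]
outer_sq = [feature_map(x)[1] for x in outer]

# Cut placed midway between the closest mapped points of the two classes.
threshold = (max(inner_sq) + min(outer_sq)) / 2

print(separable_by_threshold(inner, outer))        # False: no single cut works in 1-D
print(separable_by_threshold(inner_sq, outer_sq))  # True: separable along x^2
```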
Neural Networks (NN)
                   Neural Networks (RBF)

              [Diagram: inputs x1, x2, ..., xp feeding a network with output F(x)]

              [Figure: error versus number of epochs, with curves for the training set
              and the test set]

Optimization/tuning
              [Figure: error versus number of PCs, LVs or epochs, with one curve for the training set and one for the test set]

              Trend of output error versus the number of components (PCs/LVs) or epochs (NN)
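The tuning step suggested by this trend can be sketched as follows: the model complexity (number of PCs, LVs or epochs) is chosen where the test-set error is lowest, since the training-set error alone keeps decreasing. The error values and the function name `best_complexity` are hypothetical and only illustrate the selection rule.

```python
def best_complexity(test_errors):
    """Return (setting, error) for the lowest test error; settings are counted from 1."""
    best_index = min(range(len(test_errors)), key=lambda i: test_errors[i])
    return best_index + 1, test_errors[best_index]

# Hypothetical error curves, not read from the slides.
# The training error keeps falling, the test error falls and then rises again.
train_errors = [0.46, 0.44, 0.42, 0.41, 0.40, 0.39, 0.38]
test_errors = [0.47, 0.45, 0.43, 0.42, 0.43, 0.44, 0.46]

n_best, err_best = best_complexity(test_errors)
print("selected setting:", n_best, "test error:", err_best)
```

Picking the minimum of the training curve instead would always select the most complex model, which is the overfitting that the test curve is meant to expose.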
Accuracy metrics
              Parameters that provide information on the specificity
              (“trait agro-climate”)

              Confusion matrix (2-by-2 contingency table)

                                          Observed
                                          Resistant      Susceptible
              Predicted   Resistant       a              b
                          Susceptible     c              d

              Sensitivity = a / (a + c)
              Specificity = d / (b + d)

              Sensitivity and specificity are indicators of the model's ability to correctly classify observations.
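As a worked example of the two formulas, the sketch below computes sensitivity and specificity from the four cells of the confusion matrix, using the same cell names a, b, c and d. The counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def sensitivity(a, c):
    """Share of observed resistant accessions that were predicted resistant: a / (a + c)."""
    return a / (a + c)

def specificity(b, d):
    """Share of observed susceptible accessions that were predicted susceptible: d / (b + d)."""
    return d / (b + d)

# Hypothetical counts (not from the slides):
# a = predicted resistant, observed resistant
# b = predicted resistant, observed susceptible
# c = predicted susceptible, observed resistant
# d = predicted susceptible, observed susceptible
a, b, c, d = 40, 10, 20, 130

sens = sensitivity(a, c)   # 40 / 60
spec = specificity(b, d)   # 130 / 140
print(round(sens, 3), round(spec, 3))
```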




Accuracy metrics
               Parameters that provide information on the specificity
               (“trait agro-climate”)

               High AUC (area under the ROC curve) values indicate a potential trait-environment relationship

               [Figure: the ROC curve and the resulting pdf's of trait distribution (trait states)]
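One way to see what the AUC measures is the sketch below. It computes the AUC as the probability that a randomly chosen resistant accession receives a higher model score than a randomly chosen susceptible one, which equals the area under the ROC curve. The scores and the function name `auc` are hypothetical; the slides do not say how the AUC was computed.

```python
def auc(scores_pos, scores_neg):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count one half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical model scores (not from the slides).
resistant = [0.9, 0.8, 0.6, 0.4]
susceptible = [0.7, 0.5, 0.3, 0.2, 0.1]

value = auc(resistant, susceptible)
print("AUC:", value)
```

When the two score distributions overlap completely the value approaches 0.5, which is the random case; the further apart the two pdf's of the trait states lie, the closer the AUC gets to 1.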
Accuracy metrics
               Randomness

               [Figure: ROC curve and pdf's of trait distribution in the random case]
Content
              • Background
                 – PGR traits
                 – FIGS
              • Objective
                 – Develop a priori information
                 – Develop best bet subset of accs with traits
              • Datasets
                 – Trait data
                 – Environmental data
              • Methodologies
                 – Data preparation
                 – Modeling techniques
              • Results/Discussion
                 – Sub-setting (accessions/variables)
                 – “Hot spots”
              • Conclusion
Data preparation - Raw data

              PCs = 42        AUC = 0.67        Kappa = 0.40

              [Figure: RMSE versus number of components; ROC curve (true positive rate versus false positive rate); density distribution by trait]
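The slides report Kappa without defining it. Assuming it is Cohen's kappa computed from the 2-by-2 confusion matrix, which is an assumption rather than something the source states, the value can be obtained as in the sketch below. The counts are hypothetical and the cell names follow the confusion matrix shown under Accuracy metrics.

```python
def cohen_kappa(a, b, c, d):
    """Cohen's kappa for a 2-by-2 table: agreement beyond what chance alone would give."""
    n = a + b + c + d
    observed = (a + d) / n                                      # observed agreement
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)  # chance agreement
    return (observed - expected) / (1.0 - expected)

# Hypothetical counts (not from the slides):
# a, d = correct predictions; b, c = incorrect predictions.
kappa = cohen_kappa(40, 10, 20, 130)
print("Kappa:", round(kappa, 3))
```

A kappa of 0 means agreement no better than chance and 1 means perfect agreement, so it complements the AUC when the two classes are unbalanced.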




Data preparation – Transformed data

              PCs = 42        AUC = 0.71        Kappa = 0.45

              [Figure: RMSE versus number of components; ROC curve (true positive rate versus false positive rate); density distribution by trait]
Data preparation - Raw data (PLS)

              LVs = 30        AUC = 0.70        Kappa = 0.43

              [Figure: RMSE versus number of components; ROC curve (true positive rate versus false positive rate); density distribution by trait]
Data preparation – Transformed data

              LVs = 22        AUC = 0.71        Kappa = 0.44

              [Figure: RMSE versus number of components; ROC curve (true positive rate versus false positive rate); density distribution by trait]
Optimization process

              [Figure: two panels titled R_CALC, each showing RMSEP versus number of components]

              Root mean square error of prediction (RMSEP) for PCA (left) and PLS (right) models. Arrows indicate
              the minimum errors, where the number of components (PCs and LVs) was selected for
              prediction (red/discontinuous line = test data, continuous line = training set)
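For reference, the error plotted on these curves can be computed as in the sketch below, which takes RMSEP as the square root of the mean squared difference between observed and predicted values on the test set. The 1/0 coding of the trait and the example values are hypothetical; the slides do not show how R_CALC is coded.

```python
import math

def rmsep(observed, predicted):
    """Root mean square error of prediction over paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical test-set values: 1 = resistant, 0 = susceptible (assumed coding).
observed = [1, 0, 1, 0]
predicted = [0.8, 0.4, 0.6, 0.2]

error = rmsep(observed, predicted)
print("RMSEP:", round(error, 4))
```

Repeating this calculation for each candidate number of components gives the curve from which the minimum is read.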

PCA
              PC2
              Few components ~ random

              [Figure: ROC curve (true positive rate versus false positive rate); density distribution per R_CALC for the Resistant and Susceptible classes]
PCA
              PC5

              [Figure: ROC curve (true positive rate versus false positive rate); density distribution per R_CALC for the Resistant and Susceptible classes]
PLS
              LV2
              2 latent variables of PLS are better than 2 PCs of PCA

              [Figure: ROC curve (true positive rate versus false positive rate); density distribution per R_CALC for the Resistant and Susceptible classes]
PLS
              LV10

              [Figure: ROC curve (true positive rate versus false positive rate); density distribution per R_CALC for the Resistant and Susceptible classes]
PCA (optimized)
                                                                             •   Principal component analysis (PCA)
                                                                             •   Partial Least Square (PLS)
                                                                             •   Random Forest (RF)
                                                                             •   Support Vector Machines (SVM)
                                                                             •   Neural Networks (NN)
              [Figure: ROC curve (true positive rate versus false positive rate); density of the predictions]
PLS (optimized)
                                                                             •   Principal component analysis (PCA)
                                                                             •   Partial Least Square (PLS)
                                                                             •   Random Forest (RF)
                                                                             •   Support Vector Machines (SVM)
                                                                             •   Neural Networks (NN)
              [Figure: ROC curve (true positive rate versus false positive rate); density of the predictions]
RF
                                                                             •   Principal component analysis (PCA)
                                                                             •   Partial Least Square (PLS)
                                                                             •   Random Forest (RF)
                                                                             •   Support Vector Machines (SVM)
                                                                             •   Neural Networks (NN)
              [Figure: ROC curve (true positive rate versus false positive rate); density of the predictions]
SVM
                                                                           •   Principal component analysis (PCA)
                                                                           •   Partial Least Square (PLS)
                                                                           •   Random Forest (RF)
                                                                           •   Support Vector Machines (SVM)
                                                                           •   Neural Networks (NN)
              [Figure: ROC curve (true positive rate versus false positive rate); density of the predictions]
NN
                                                                             •   Principal component analysis (PCA)
                                                                             •   Partial Least Square (PLS)
                                                                             •   Random Forest (RF)
                                                                             •   Support Vector Machines (SVM)
                                                                             •   Neural Networks (NN)
              [Figure: ROC curve (true positive rate versus false positive rate); density of the predictions]
Random (PCA)
              [Figure: ROC curves (true positive rate vs. false positive rate) alongside RMSEP vs. number of components (0-60).
               Top panels: completely random distribution of the stem rust resistance trait (AUC ~ 0.5).
               Bottom panels: partially random distribution of the stem rust resistance trait.]
Stem rust hot spots
              [Figure: map of stem rust hot spots; axes: longitude (0-150) and latitude (0-60)]
Stem rust hot spots
              Areas where resistance is likely to occur (longitude-wise)

              [Figure: two map panels; axes: longitude (0-150) and latitude (0-60)]
PLS (optimized)
              Areas where resistance is likely to occur (dark red)
              [Figure: contour map of predicted resistance (contour levels from -0.2 to 0.8); axes: longitude (0-120) and latitude (10-60).
               Inset: semivariance vs. distance.]
Random Forest (RF)
              Areas where resistance is likely to occur (dark red)
              [Figure: contour map of predicted resistance (contour levels from 0.0 to 0.8); axes: longitude (0-120) and latitude (10-60).
               Inset: semivariance vs. distance.]
Support Vector Machines (SVM)
              Areas where resistance is likely to occur (dark red)
              [Figure: contour map of predicted resistance (contour levels from 0.0 to 1.0); axes: longitude (0-120) and latitude (10-60)]
Results – stem rust on wheat
               Dataset (unit)           PPV                 LR+                 Estimated gain

               Stem rust (accession)    0.54 (0.50-0.59)    3.07 (2.66-3.54)    1.95 (1.79-2.09)
               Random                   0.29 (0.26-0.33)    1.04 (0.90-1.20)    1.03 (0.91-1.16)
               (28 % resistant samples)

               Stem rust (site)         0.50 (0.40-0.60)    4.00 (2.85-5.66)    2.51 (2.02-2.98)
               Random                   0.19 (0.13-0.26)    0.94 (0.63-1.39)    0.95 (0.66-1.33)
               (20 % resistant samples)

               PPV = Positive Predictive Value; LR+ = Positive Diagnostic Likelihood Ratio

               Endresen, D.T.F., K. Street, M. Mackay, A. Bari, E. De Pauw (2011). Predictive
               association between biotic stress traits and ecogeographic data for wheat and barley
               landraces. Crop Science 51: 2036-2055. DOI: 10.2135/cropsci2010.12.0717
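               The three metrics in the table can be sketched from a 2x2 confusion matrix. This is a minimal illustration, not the authors' code: the function name and the counts are hypothetical, and "estimated gain" is assumed here to be PPV divided by the fraction of resistant samples, which is consistent with the tabulated values (for example 0.50 / 0.20 is close to 2.51).

```python
def screening_metrics(tp, fp, fn, tn):
    """Return PPV, LR+ and estimated gain from confusion-matrix counts.

    tp/fp: accessions predicted resistant that are / are not resistant.
    fn/tn: accessions predicted susceptible that are / are not resistant.
    """
    total = tp + fp + fn + tn
    prevalence = (tp + fn) / total          # fraction of resistant samples
    ppv = tp / (tp + fp)                    # positive predictive value
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    lr_plus = sensitivity / (1.0 - specificity)  # positive likelihood ratio
    gain = ppv / prevalence                 # ASSUMPTION: gain = PPV / prevalence
    return {"prevalence": prevalence, "ppv": ppv, "lr_plus": lr_plus, "gain": gain}


# Hypothetical counts (not from the study), chosen so that 28 % of the
# samples are resistant and the PPV is 0.54.
metrics = screening_metrics(tp=54, fp=46, fn=86, tn=314)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

               A gain above 1 means the selected subset contains proportionally more resistant accessions than a random draw from the same set would.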
Results – stem rust on wheat
               Classifier method                        AUC                 Cohen’s Kappa

               Principal Component Regression (PCR)     0.69 (0.68-0.70)    0.40 (0.37-0.42)
               Partial Least Squares (PLS)              0.69 (0.68-0.70)    0.41 (0.39-0.43)
               Random Forest (RF)                       0.70 (0.69-0.71)    0.42 (0.40-0.44)
               Support Vector Machines (SVM)            0.71 (0.70-0.72)    0.44 (0.42-0.45)
               Artificial Neural Networks (ANN)         0.71 (0.70-0.72)    0.44 (0.42-0.46)

               AUC = Area Under the ROC Curve (ROC = Receiver Operating Characteristic)

               Bari, A., K. Street, M. Mackay, D.T.F. Endresen, E. De Pauw, and A. Amri (2011). Focused
               Identification of Germplasm Strategy (FIGS) detects wheat stem rust resistance linked to
               environment variables. Genetic Resources and Crop Evolution [online first].
               doi:10.1007/s10722-011-9775-5; Published online 3 Dec 2011.
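               For readers unfamiliar with the two measures in the table, the sketch below computes them on a toy example. It is an illustration only: the scores, labels and the 0.5 classification threshold are hypothetical and are not taken from the study.

```python
def auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cohen_kappa(predicted, observed):
    """Agreement between predicted and observed classes, corrected for chance."""
    n = len(observed)
    p_observed = sum(p == o for p, o in zip(predicted, observed)) / n
    classes = set(predicted) | set(observed)
    p_expected = sum(
        (predicted.count(c) / n) * (observed.count(c) / n) for c in classes
    )
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical model scores and true resistance labels (1 = resistant).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 1, 0]

# ASSUMPTION: scores at or above 0.5 are classified as resistant.
predicted = [1 if s >= 0.5 else 0 for s in scores]

print("AUC:", auc(scores, labels))
print("Kappa:", cohen_kappa(predicted, labels))
```

               An AUC of 0.5 and a kappa of 0 both correspond to a classifier that does no better than chance, which is the baseline the tabulated values should be read against.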
Results – stem rust on wheat
               Classifier method      PPV                 LR+                  Estimated gain

               kNN (pre-study)        0.29 (0.13-0.53)    5.61 (2.21-14.28)    4.14 (1.86-7.57)
               SIMCA                  0.28 (0.14-0.48)    5.26 (2.51-11.01)    4.00 (2.00-6.86)
               Ensemble classifier    0.33 (0.12-0.65)    8.09 (2.23-29.42)    6.47 (2.05-11.06)
               Random                 0.06 (0.01-0.27)    0.95 (0.13-6.73)     0.97 (0.16-4.35)
               (pre-study, 550 + 275 accessions)

               Ensemble               0.26 (0.22-0.30)    2.78 (2.34-3.31)     2.32 (2.00-2.68)
               Random                 0.11 (0.09-0.15)    1.02 (0.77-1.36)     0.95 (0.77-1.32)
               (blind study, 825 + 3738 accessions)

               PPV = Positive Predictive Value; LR+ = Positive Diagnostic Likelihood Ratio

               Endresen, D.T.F., K. Street, M. Mackay, A. Bari, E. De Pauw, K. Nazari, and A. Yahyaoui
               (2012). Sources of Resistance to Stem Rust (Ug99) in Bread Wheat and Durum Wheat
               Identified Using Focused Identification of Germplasm Strategy (FIGS). Crop Science
               [online first]. doi: 10.2135/cropsci2011.08.0427; Published online 8 Dec 2011.
Results of stem rust (Ug99) on wheat

              4563 wheat landraces were screened for Ug99; 10.2 % of the accessions were resistant.

              The true trait scores were made available for 20 % of the accessions (825 samples).

              The 500 accessions predicted most likely to be resistant were selected from the
              remaining 3738 accessions, whose true scores were hidden.

              Of the selected accessions, 25.8 % were resistant, which is 2.3 times higher than
              expected by chance.
Conclusion ...

              Results
               –   Raw data vs Transformed data
               –   PLS vs PCA
               –   Non-linear vs linear
               –   FIGS vs random (selection)


              Issues
               – Extent of variables (trait/agro-climate)
               – Phenology (adaptation)
               – Fuzzy approach (trait variation capture)




Grain
Research &
Development                                                 69
Corporation
Grain
Research &
Development
Corporation

More Related Content

What's hot

Diversity Array technology
Diversity Array technologyDiversity Array technology
Diversity Array technologyManjesh Saakre
 
Breeding for improved drought tolerance in major crop (Maize, Sorghum, Red gram)
Breeding for improved drought tolerance in major crop (Maize, Sorghum, Red gram)Breeding for improved drought tolerance in major crop (Maize, Sorghum, Red gram)
Breeding for improved drought tolerance in major crop (Maize, Sorghum, Red gram)bidush
 
Marker Assisted Gene Pyramiding for Disease Resistance in Rice
Marker Assisted Gene Pyramiding for Disease Resistance in RiceMarker Assisted Gene Pyramiding for Disease Resistance in Rice
Marker Assisted Gene Pyramiding for Disease Resistance in RiceIndrapratap1
 
Smart breeding final
Smart breeding finalSmart breeding final
Smart breeding finalPavan R
 
Allele mining in orphan underutilized crops
Allele mining in orphan underutilized cropsAllele mining in orphan underutilized crops
Allele mining in orphan underutilized cropsCCS HAU, HISAR
 
MARKER-ASSISTED BREEDING FOR RICE IMPROVEMENT
MARKER-ASSISTED BREEDING FOR RICE IMPROVEMENTMARKER-ASSISTED BREEDING FOR RICE IMPROVEMENT
MARKER-ASSISTED BREEDING FOR RICE IMPROVEMENTFOODCROPS
 
Gene introgression from wild relatives to cultivated plants
Gene introgression from wild relatives to cultivated plantsGene introgression from wild relatives to cultivated plants
Gene introgression from wild relatives to cultivated plantsManjappa Ganiger
 
Allele mining, tilling and eco tilling
Allele mining, tilling and eco tillingAllele mining, tilling and eco tilling
Allele mining, tilling and eco tillingkundan Jadhao
 
Genomic Selection in Plants
Genomic Selection in PlantsGenomic Selection in Plants
Genomic Selection in PlantsPrakash Narayan
 
Genomic Selection & Precision Phenotyping
Genomic Selection & Precision PhenotypingGenomic Selection & Precision Phenotyping
Genomic Selection & Precision PhenotypingCIMMYT
 
Domestication syndrome in crop plants
Domestication syndrome in crop plantsDomestication syndrome in crop plants
Domestication syndrome in crop plantsAnilkumar C
 
Biometrical Techniques for Analysis of Genotype x Environment Interactions & ...
Biometrical Techniques for Analysis of Genotype x Environment Interactions & ...Biometrical Techniques for Analysis of Genotype x Environment Interactions & ...
Biometrical Techniques for Analysis of Genotype x Environment Interactions & ...Manoj Sharma
 
Reverse Breeding
Reverse Breeding Reverse Breeding
Reverse Breeding ICRISAT
 
Genepyramiding for biotic resistance
Genepyramiding for biotic resistanceGenepyramiding for biotic resistance
Genepyramiding for biotic resistanceSenthil Natesan
 
Tilling, Eco- Tilling and MAS for crop improvement
Tilling, Eco- Tilling and MAS for crop improvementTilling, Eco- Tilling and MAS for crop improvement
Tilling, Eco- Tilling and MAS for crop improvementDr. Shobha D. Surbhaiyya
 
Genomic and enabling technologies in maize breeding for enhanced genetic gain...
Genomic and enabling technologies in maize breeding for enhanced genetic gain...Genomic and enabling technologies in maize breeding for enhanced genetic gain...
Genomic and enabling technologies in maize breeding for enhanced genetic gain...CIMMYT
 
Genomic selection, prediction models, GEBV values, genomic selection in plant...
Genomic selection, prediction models, GEBV values, genomic selection in plant...Genomic selection, prediction models, GEBV values, genomic selection in plant...
Genomic selection, prediction models, GEBV values, genomic selection in plant...Mahesh Biradar
 

Data governance with Unity Catalog Presentation
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 

Searching for traits in PGR collections using Focused Identification of Germplasm Strategy

  • 1. Searching for traits in PGR collections using Focused Identification of Germplasm Strategy (FIGS) Abdallah Bari, Kenneth Street, Michael Mackay, Eddy De Pauw, Dag Endresen, Ahmed Amri, Kumarse Nazari and Ammor Yahiaoui CIAT Palmira, Colombia 14 March 2012 Grain Research & Development Corporation
  • 2. Content • Background – PGR - traits – FIGS - traits • Objective – Develop a priori information – Develop best-bet subset of accessions with traits • Datasets – Trait data – Environmental data • Methodologies – Data preparation – Modeling techniques • Results/Discussion – Sub-setting (accessions/variables) – “Hot spots” • Conclusion
  • 3. ICARDA’s worldwide presence. International Center for Agricultural Research in the Dry Areas (ICARDA)
  • 4. ICARDA. PGR centers of origin and diversity
  • 5. PGR contribution. Traits of importance to agriculture: – phenological adaptation (short growth duration), – efficient use of water, – resistance to biotic stresses (diseases and insects), – tolerance to abiotic stresses (such as drought and salinity), and – superior grain quality. (Image caption: plant pre-evaluation)
  • 6. PGR Challenges • 50 - 60 000 traits (loci) • 7 million accessions • 1400 genebanks (Image caption: seed samples)
  • 7. PGR Challenges. A needle in a haystack: PGR users want variation for specific traits and a hundred germplasm accessions to evaluate.
  • 8. PGR Challenges and Concerns • Size of collections – Addressed by Brown et al. 1999 • Cost in evaluating accessions lacking the desired trait – Addressed by Gollin et al. 2000
  • 10. Objective. FIGS searches genetic resources data (germplasm collections) to detect trait-environment patterns or relationships (as a priori information). This a priori information is then used to develop predictive models to find novel genetic variation for the traits of interest and where it is most likely to occur. Workflow: quantification of trait-environment relationship → a priori information → develop trait subsets → utilization of genetic resources.
  • 11. Origin of FIGS approach. Boron toxicity of wheat and barley – early FIGS examples. Wheat landraces from marine-origin soils in the Mediterranean region provided all the genetic variation needed to produce boron-tolerant varieties (M.C. Mackay, 1995). [Map: Mediterranean Sea]
  • 12. FIGS approach. “FIGS applies to plant genetic resources (stored collections) the same selection pressure exerted on plants by evolution.” [Diagram labels: PGR collection (biodiversity), sampling, core; PGR (biodiversity), sampling, trait, user]
  • 13. FIGS approach. FIGS has helped breeders identify long sought-after plant traits such as resistance to: – Net blotch (barley), – Powdery mildew, – Russian wheat aphid (RWA) and – Sunn pest. Braidotti, G. 2009. Keys to the gene bank, Biotechnology. Partners in Research for Development 16-17.
  • 14. Sunn pest trait of resistance. 8 landrace accessions from Afghanistan and 2 from Tajikistan were identified as resistant at the juvenile stage. Mapping populations are now being developed.
  • 15. FIGS approach to Pm. 16,000 wheat landraces → FIGS applied → 1,300 selected → phenotyping: 211 accessions between R and IR (40% yielded accessions that were resistant to the isolates used) → genotyping: 7 new alleles, of which at least 2 have a novel race specificity. 100 years of classical genetics = 7 alleles. Kaur K; Street K; Mackay M; Yahiaoui N; Keller B (2008). Allele mining and sequence diversity at the wheat powdery mildew resistance locus Pm3. 11th IWGS, 24-29 Aug., Brisbane.
  • 16. Locating new Pm3 alleles. The distribution of the seven new functional alleles of Pm3. Out of 96.2% of the total set screened. [Map labels: Turkey, Afghanistan, Iran, Pakistan and Armenia]
  • 17. The FIGS picture. Genotypes × Environments × Time¹ = Genetic variation. Can we use the same evolutionary principles in reverse to identify the environments that ‘engender’ trait-specific genetic variation? Environments × Traits × Time = Trait variation (E×T)? ¹ plus some selection.
  • 18. Examples of eco-geographic variation of traits linked to environmental influences (after M. C. Mackay)
      Environment influence | Trait | Species | Reference
      Low altitudes, high winter temp., low summer rain, spring cloudiness | Cyanogenesis | Trifolium repens | Pederson, Fairbrother et al. 1996
      Aridity | Seed dormancy, early flowering, high seed to pod ratio | Annual legumes | Ehrman and Cocks 1996
      Soil type | Tolerance to boron toxicity | Bread wheat | Mackay (1990)
      Altitude, winter temp., RWA distribution | Russian Wheat Aphid (RWA) resistance | Bread wheat | Bohssini et al., accepted for publication 2008
      Temperature, aridity | Drought resistance | Triticum dicoccoides | Peleg, Fahima et al. 2005
      Altitude | Glume colour and beak length | Durum wheat | Bechere, Belay et al. 1996
      Climate, soil and water availability | Heading date, culm length, biomass, grain yield and its components | Triticum dicoccoides | Beharav and Nevo 2004
      Precipitation, minimum January temperature, altitude | Glutenin diversity | Durum wheat | Vanhintum and Elings 1991
      Temperature, aridity | More efficient RUBISCO activity | Woody perennials | Galmes et al. 2005
      Water relations, temperature | Hordatine accumulation (disease defence) | Barley | Batchu, Zimmermann et al. 2006
  • 19. FIGS system. [Diagram: PGR collections and user-defined needs → database → filters (type of material, evaluation data, collection site, other information, size limit: 250, 500, 750, 1500) → interface → new subset] See www.figstraitmine.com. After M. C. Mackay 1995.
  • 20. Mining natural variation. By linking traits, environments (and associated selection pressures) with genebank accessions (e.g. landraces and crop relatives) we can ‘focus’ in on those accessions most likely to possess trait-specific genetic variation. [Map: latitude versus longitude; layers: environment, trait, FIGS subset]
  • 21. FIGS approach – summarized. Focused Identification of Germplasm Strategy. Environment (E): geo-referencing of collecting places. Trait (T): evaluation (phenotyping). Accession (G).
  • 23. Eco-climate data (X). ICARDA eco-climatic database, average: annual temperature (front), annual precipitation (middle), and winter precipitation (back) (De Pauw 2008). Climate data (X as independent variables):
      site_code1 | prec01 | prec02 | prec03 | prec04 | prec05 | … | ari01 | ari02 | ari03 | ari04 | ari05
      ETH-S893 | 25 | 36 | 72 | 154.22 | 148.88 | … | 0.167 | 0.246 | 0.439 | 1.098 | 1.169
      ETH-S1222 | 29 | 44 | 92 | 167.46 | 168 | … | 0.223 | 0.344 | 0.646 | 1.354 | 1.612
      NS_339 | 44 | 67 | 130.43 | 177.96 | 185.74 | … | 0.351 | 0.552 | 0.949 | 1.457 | 1.751
      ETH-S1153 | 36 | 48 | 86 | 140.92 | 131.94 | … | 0.28 | 0.39 | 0.609 | 1.108 | 1.078
      NS_415 | 32 | 46.61 | 95.42 | 150.3 | 157 | … | 0.271 | 0.419 | 0.732 | 1.289 | 1.437
      NS_424 | 31.94 | 45 | 90 | 143.62 | 150 | … | 0.257 | 0.38 | 0.641 | 1.146 | 1.272
      ETH64:55 | 28 | 38.26 | 57 | 97.57 | 81 | … | 0.247 | 0.344 | 0.45 | 0.834 | 0.662
      NS_525 | 28 | 39 | 57 | 97.13 | 80.78 | … | 0.248 | 0.352 | 0.452 | 0.836 | 0.669
      NS_526 | 27 | 39 | 57 | 97.01 | 80.77 | … | 0.241 | 0.354 | 0.455 | 0.842 | 0.68
      NS_559 | 23 | 40 | 61.89 | 129.04 | 102 | … | 0.226 | 0.397 | 0.511 | 1.206 | 0.998
      …
      Source: International Center for Agricultural Research in the Dry Areas (ICARDA)
  • 24. Eco-climate data (X). Layers used in the stem rust studies: • Precipitation (rainfall) • Maximum temperatures • Minimum temperatures. Plus derived GIS layers such as: • Potential evapotranspiration (water loss) • Agro-climatic zone (UNESCO classification) • Moisture/aridity index (mean values for month and year)
  • 25. Trait data set (Y). Trait data (Y as dependent variable):
      site_code1 | R_state0 | R_state1 | R_state2 | R_state3 | R_state4 | R_state5 | R_state6 | R_state7 | R_state8 | R_state9
      ETH-S893 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0
      ETH-S1222 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1
      NS_339 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1 | 0
      ETH-S1153 | 0 | 0 | 0 | 0 | 2 | 1 | 3 | 0 | 0 | 0
      NS_415 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
      NS_424 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0
      ETH64:55 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0
      NS_525 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0
      NS_526 | 0 | 1 | 2 | 1 | 2 | 0 | 3 | 0 | 0 | 0
      NS_559 | 2 | 5 | 1 | 0 | 0 | 2 | 0 | 0 | 0 | 0
      ETH64:53 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0
      …
      Source: (USDA) National Genetic Resources Program (NGRP) GRIN database. Image: http://www.news.cornell.edu/
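The climate table (X) and the trait table (Y) share the `site_code1` key, so a first data-preparation step is to join them on that key. The following is a minimal Python sketch of such a join, not the authors' code (the slides name R as their platform). The function name `join_on_site` is hypothetical, only two climate columns are kept for brevity, and reading the `R_state` columns as per-site counts is an assumption.

```python
# Rows copied from the two slide tables; only prec01 and ari01 are kept here.
climate = {  # site_code1 -> environmental predictors (X)
    "ETH-S893": {"prec01": 25.0, "ari01": 0.167},
    "ETH-S1222": {"prec01": 29.0, "ari01": 0.223},
    "NS_339": {"prec01": 44.0, "ari01": 0.351},
}
trait = {  # site_code1 -> values of R_state0..R_state9 (Y), assumed to be counts
    "ETH-S893": [0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
    "ETH-S1222": [0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
    "ETH64:53": [0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
}


def join_on_site(climate, trait):
    """Inner join: keep only the sites that appear in both tables."""
    rows = []
    for site in sorted(climate.keys() & trait.keys()):
        rows.append({"site_code1": site, **climate[site], "states": trait[site]})
    return rows


merged = join_on_site(climate, trait)
for row in merged:
    print(row["site_code1"], row["prec01"], row["ari01"], row["states"])
```

Sites missing from either table (here `NS_339` and `ETH64:53`) drop out of the merged set, which is the usual behaviour wanted before fitting Y ~ f(X).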
  • 26. Searching for stem rust trait of resistance - concerns. Stem rust spreading to wheat production areas. Image: http://www.news.cornell.edu/
  • 27. Stem rust on wheat landraces – trait data. Green dots indicate collecting sites for resistant wheat landraces and red dots collecting sites for susceptible landraces. Field experiments made in Minnesota by Don McVey. USDA GRIN, trait data online: http://www.ars-grin.gov/cgi-bin/npgs/html/desc.pl?65049
  • 29. Data preparation. Climate data (X as independent variables). Power relationship ~ 2(p) (spread).
      site_code | … | ari02 | …
      ETH-S893 | | 0.246 |
      ETH-S1222 | | 0.344 |
      NS_339 | | 0.552 |
      ETH-S1153 | | 0.390 |
      NS_415 | | 0.419 |
      NS_424 | | 0.380 |
      ETH64:55 | | 0.344 |
      NS_525 | | 0.352 |
      NS_526 | | 0.354 |
      NS_559 | | 0.397 |
      [Figures: two frequency histograms of the aridity or moisture index during February, with x-axis ranges 0 to 15 and -4 to 4]
  • 30. Platform. R language (development of algorithms), modeling purpose: > Data transformation ( ) > Model <- model(trait ~ climate) > Measuring accuracy metrics > …. Geographical Information System (GIS), ArcGIS, generation of environmental data: environmental data/layers (surfaces).
  • 31. Modeling framework. Trait data (Y), environmental data (X): Y ~ f(X). First, a linear approach irrespective of the underlying distributions describing the data. X is the set of explanatory variables or predictors (climate data), where X ∈ R^m, and Y is either a categorical (label) or a numerical response (trait descriptor states).
  • 32. Modeling framework • Principal component analysis (PCA) • Partial Least Square (PLS) • Random Forest (RF) • Support Vector Machines (SVM) • Neural Networks (NN). Bari A., Street K., Mackay M., Endresen D.T.F., De Pauw E. & Amri A. (2011) Focused identification of germplasm strategy (FIGS) detects wheat stem rust resistance linked to environmental variables. Genetic Resources and Crop Evolution http://www.springerlink.com/content/m7140x68v2065113/fulltext.pdf
  • 33. Principal Component Analysis (PCA). B is a matrix of coefficients. The prediction was initially carried out using the number of principal components (PCs) that account for 95% of the explained variance, followed by adding one component at a time until the error reached a minimum.
  • 34. Partial Least Square (PLS). PLS: a product of factors and their loadings (regression coefficients), where both the environmental dataset and the trait dataset are used simultaneously. The prediction was initially carried out using the number of components that account for 95% of the explained variance, followed by adding one component at a time until the error reached a minimum.
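The component-selection rule stated for PCA and PLS (start from the number of components explaining 95% of the variance, then add one at a time until the error stops decreasing) can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the authors' R code: the function names are hypothetical and the variance ratios and error values are made-up numbers, not results from the study.

```python
def components_for_variance(explained_ratio, target=0.95):
    """Smallest number of components whose cumulative explained variance reaches target."""
    total = 0.0
    for k, ratio in enumerate(explained_ratio, start=1):
        total += ratio
        if total >= target:
            return k
    return len(explained_ratio)


def tune_components(explained_ratio, test_error, target=0.95):
    """Return (starting count, tuned count).

    test_error[k - 1] is the test-set error of a model with k components.
    Starting from the variance-based count, one component is added at a
    time for as long as the test error keeps decreasing.
    """
    start = components_for_variance(explained_ratio, target)
    best = start
    while best < len(test_error) and test_error[best] < test_error[best - 1]:
        best += 1
    return start, best


# Made-up illustration values (assumptions, not data from the slides).
explained = [0.60, 0.25, 0.08, 0.04, 0.02, 0.01]
errors = [0.46, 0.45, 0.44, 0.43, 0.42, 0.425]
start, tuned = tune_components(explained, errors)
print(start, tuned)
```

Using the test-set error rather than the training error as the stopping criterion matters because the training error typically keeps falling as components are added.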
  • 35. Random Forest (RF). [Diagram: data → bootstrapping (with replacement) → training set and out-of-bag (OOB) set, repeated for tree 1, tree 2, ..., tree 1000]
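The bootstrapping step in the Random Forest diagram can be illustrated with a small Python sketch: each tree gets a training set drawn with replacement, and the samples never drawn form that tree's out-of-bag (OOB) set. This covers only the resampling, not tree fitting; the function name `bootstrap_split`, the sample size of 100 and the seed are assumptions chosen for illustration, while the count of 1000 trees follows the slide.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw n_samples indices with replacement; unseen indices are out-of-bag."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    chosen = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in chosen]
    return in_bag, out_of_bag


rng = random.Random(0)  # fixed seed so the sketch is repeatable
splits = [bootstrap_split(100, rng) for _ in range(1000)]  # one split per tree
mean_oob = sum(len(oob) for _, oob in splits) / len(splits)
print(f"average out-of-bag size: {mean_oob:.1f} of 100 samples")
```

On average a bit more than a third of the samples end up out-of-bag for any one tree, which is why the OOB set can serve as a built-in test set.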
  • 36. Support Vector Machines (SVM). SVM is a learning-based technique that maps input data to a high-dimensional space and optimally separates the mapped input into respective classes. The mapping goes from an l-dimensional space (input variable space) into a k-dimensional space, where k is much higher than l.
  • 37. Neural Networks (NN). Neural Networks (RBF). [Diagram: inputs x1, x2, ..., xp → F(x). Plot: error versus number of epochs for the training set and the test set]
  • 38. Optimization/tuning. Trend of output error versus the number of components (PCs/LVs) or epochs (NN). [Plot: error versus number of PCs, LVs or epochs for the training set and the test set]
  • 39. Accuracy metrics. Parameters that provide information on the specificity (“trait agro-climate”). Confusion matrix (2-by-2 contingency table):
      Predicted \ Observed | Resistant | Susceptible
      Resistant | a | b
      Susceptible | c | d
      Sensitivity = a/(a + c) and Specificity = d/(b + d) are indicators of the model's ability to correctly classify observations.
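As a worked illustration of the confusion-matrix formulas, the Python sketch below computes sensitivity and specificity from the cell counts a, b, c and d, and adds Cohen's kappa, which the results slides report alongside AUC. The function name and the example counts are assumptions for illustration only; the kappa formula is the standard one, not taken from the slides.

```python
def confusion_metrics(a, b, c, d):
    """Metrics for a 2-by-2 table.

    a: predicted resistant, observed resistant
    b: predicted resistant, observed susceptible
    c: predicted susceptible, observed resistant
    d: predicted susceptible, observed susceptible
    """
    n = a + b + c + d
    sensitivity = a / (a + c)
    specificity = d / (b + d)
    observed_agreement = (a + d) / n
    # Agreement expected by chance from the row and column totals.
    expected_agreement = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2
    kappa = (observed_agreement - expected_agreement) / (1 - expected_agreement)
    return {"sensitivity": sensitivity, "specificity": specificity, "kappa": kappa}


# Made-up counts, not data from the study.
metrics = confusion_metrics(a=40, b=20, c=10, d=30)
print(metrics)
```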
  • 40. Accuracy metrics. Parameters that provide information on the specificity (“trait agro-climate”). High AUC (area under the curve) values indicate a potential trait-environment relationship. [Figures: the ROC curve and the resulting pdfs of the trait distribution (trait states)]
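One way to see what the AUC measures: it equals the probability that a randomly chosen resistant accession receives a higher predicted score than a randomly chosen susceptible one, with ties counted as one half. The Python sketch below computes AUC directly from that pairwise definition; the function name and scores are made up for illustration, and this is not necessarily how the authors computed it.

```python
def auc(scores_resistant, scores_susceptible):
    """AUC as the fraction of (resistant, susceptible) pairs ranked correctly."""
    wins = 0.0
    for r in scores_resistant:
        for s in scores_susceptible:
            if r > s:
                wins += 1.0
            elif r == s:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(scores_resistant) * len(scores_susceptible))


# Made-up prediction scores, not data from the study.
resistant = [0.9, 0.7, 0.4]
susceptible = [0.6, 0.3, 0.2, 0.1]
print(round(auc(resistant, susceptible), 3))
```

When the two score distributions coincide the value is 0.5, which corresponds to the random case; well separated distributions push it toward 1.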
  • 41. Accuracy metrics. Randomness. [Figures: ROC curve and pdfs of the trait distribution]
  • 43. Data preparation - Raw data. PCs = 42; AUC = 0.67; Kappa = 0.40. [Figures: RMSE versus number of components; ROC curve (true positive rate versus false positive rate); density distribution by trait]
  • 44. Data preparation – Transformed data. PCs = 42; AUC = 0.71; Kappa = 0.45. [Figures: RMSE versus number of components; ROC curve (true positive rate versus false positive rate); density distribution by trait]
  • 45. Data preparation - Raw data (PLS). LVs = 30; AUC = 0.70; Kappa = 0.43. [Figures: RMSE versus number of components; ROC curve (true positive rate versus false positive rate); density distribution by trait]
  • 46. Data preparation – Transformed data. LVs = 22; AUC = 0.71; Kappa = 0.44. [Figures: RMSE versus number of components; ROC curve (true positive rate versus false positive rate); density distribution by trait]
  • 47. Optimization process. Root mean square error of prediction (RMSEP) for PCA (left) and PLS (right) models. Arrows indicate the minimum errors, where the number of components (PCs and LVs) was selected for prediction (red/discontinuous line = test data, continuous line = training set). [Plots: RMSEP versus number of components, R_CALC]
  • 48. PCA PC2. Few components ~ random. [Figures: ROC curve (true positive rate versus false positive rate); density distribution per R_CALC for resistant and susceptible]
  • 49. PCA PC5. [Figures: ROC curve (true positive rate versus false positive rate); density distribution per R_CALC for resistant and susceptible]
  • 50. PLS LV2. 2 latent variables of PLS are better than 2 PCs of PCA. [Figures: ROC curve (true positive rate versus false positive rate); density distribution per R_CALC for resistant and susceptible]
  • 51. PLS LV10. [Figures: ROC curve (true positive rate versus false positive rate); density distribution per R_CALC for resistant and susceptible]
  • 52. PCA (optimized). [Figures: ROC curve (true positive rate versus false positive rate); density of predictions]
  • 53. PLS (optimized). [Figures: ROC curve (true positive rate versus false positive rate); density of predictions]
  • 54. RF. [Figures: ROC curve (true positive rate versus false positive rate); density of predictions]
  • 55. SVM. [Figures: ROC curve (true positive rate versus false positive rate); density of predictions]
  • 56. NN. [Figures: ROC curve (true positive rate versus false positive rate); density of predictions]
  • 57. Random (PCA). Completely random distribution of the stem rust resistance trait: AUC ~ 0.5. Partially random distribution of the stem rust resistance trait. [Figures, for each case: ROC curve (true positive rate versus false positive rate); RMSEP versus number of components, R_CALC]
  • 58. Stem rust hot spots. [Map: latitude versus longitude]
  • 59. Stem rust hot spots: areas where resistance is likely to occur (longitude-wise). [Maps: latitude versus longitude]
  • 60. PLS (optimized). Areas where resistance is likely to occur (dark red). [Figures: contour map of predictions (latitude versus longitude); semivariogram (semivariance versus distance)]
  • 61. Random Forest (RF). Areas where resistance is likely to occur (dark red). [Figures: contour map of predictions (latitude versus longitude); semivariogram (semivariance versus distance)]
  • 62. SVM. Areas where resistance is likely to occur (dark red). [Figure: contour map of predictions (latitude versus longitude)]
  • 64. Results – stem rust on wheat
      Dataset (unit) | PPV | LR+ | Estimated gain
      Stem rust (accession) | 0.54 (0.50-0.59) | 3.07 (2.66-3.54) | 1.95 (1.79-2.09)
      Random (28% resistant samples) | 0.29 (0.26-0.33) | 1.04 (0.90-1.20) | 1.03 (0.91-1.16)
      Stem rust (site) | 0.50 (0.40-0.60) | 4.00 (2.85-5.66) | 2.51 (2.02-2.98)
      Random (20% resistant samples) | 0.19 (0.13-0.26) | 0.94 (0.63-1.39) | 0.95 (0.66-1.33)
      PPV = Positive Predictive Value; LR+ = Positive Diagnostic Likelihood Ratio. Endresen, D.T.F., K. Street, M. Mackay, A. Bari, E. De Pauw (2011). Predictive association between biotic stress traits and ecogeographic data for wheat and barley landraces. Crop Science 51: 2036-2055. DOI: 10.2135/cropsci2010.12.0717
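The three columns of this table are linked: given the proportion of resistant samples (prevalence), LR+ converts pre-test odds into post-test odds, from which PPV follows. The Python sketch below works through that arithmetic with the point estimates from the table. It assumes that "estimated gain" is the ratio of PPV to prevalence; that reading comes close to the reported values but is not stated on the slide, and the function names are hypothetical.

```python
def ppv_from_lr(prevalence, lr_plus):
    """Positive predictive value from prevalence and positive likelihood ratio."""
    pre_test_odds = prevalence / (1.0 - prevalence)
    post_test_odds = pre_test_odds * lr_plus
    return post_test_odds / (1.0 + post_test_odds)


def gain_over_random(ppv, prevalence):
    """Assumed definition: how many times more hits than random selection."""
    return ppv / prevalence


# Point estimates from the table: (prevalence, LR+) per dataset.
for label, prevalence, lr_plus in [("accession", 0.28, 3.07), ("site", 0.20, 4.00)]:
    ppv = ppv_from_lr(prevalence, lr_plus)
    print(label, round(ppv, 2), round(gain_over_random(ppv, prevalence), 2))
```

With these inputs the computed PPV rounds to the tabulated 0.54 and 0.50; the computed gains differ slightly from the tabulated 1.95 and 2.51, presumably because the published inputs are themselves rounded.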
  • 65. Results – stem rust on wheat. AUC = Area Under the ROC Curve (ROC, Receiver Operating Characteristic)
      Classifier method | AUC | Cohen’s Kappa
      Principal Component Regression (PCR) | 0.69 (0.68-0.70) | 0.40 (0.37-0.42)
      Partial Least Squares (PLS) | 0.69 (0.68-0.70) | 0.41 (0.39-0.43)
      Random Forest (RF) | 0.70 (0.69-0.71) | 0.42 (0.40-0.44)
      Support Vector Machines (SVM) | 0.71 (0.70-0.72) | 0.44 (0.42-0.45)
      Artificial Neural Networks (ANN) | 0.71 (0.70-0.72) | 0.44 (0.42-0.46)
      Bari, A., K. Street, M. Mackay, D.T.F. Endresen, E. De Pauw, and A. Amri (2011). Focused Identification of Germplasm Strategy (FIGS) detects wheat stem rust resistance linked to environment variables. Genetic Resources and Crop Evolution [online first]. doi:10.1007/s10722-011-9775-5; Published online 3 Dec 2011.
  • 66. Results – stem rust on wheat
      Classifier method | PPV | LR+ | Estimated gain
      kNN (pre-study) | 0.29 (0.13-0.53) | 5.61 (2.21-14.28) | 4.14 (1.86-7.57)
      SIMCA | 0.28 (0.14-0.48) | 5.26 (2.51-11.01) | 4.00 (2.00-6.86)
      Ensemble classifier | 0.33 (0.12-0.65) | 8.09 (2.23-29.42) | 6.47 (2.05-11.06)
      Random (pre-study, 550 + 275 accessions) | 0.06 (0.01-0.27) | 0.95 (0.13-6.73) | 0.97 (0.16-4.35)
      Ensemble | 0.26 (0.22-0.30) | 2.78 (2.34-3.31) | 2.32 (2.00-2.68)
      Random (blind study, 825 + 3738 accessions) | 0.11 (0.09-0.15) | 1.02 (0.77-1.36) | 0.95 (0.77-1.32)
      PPV = Positive Predictive Value; LR+ = Positive Diagnostic Likelihood Ratio. Endresen, D.T.F., K. Street, M. Mackay, A. Bari, E. De Pauw, K. Nazari, and A. Yahyaoui (2012). Sources of Resistance to Stem Rust (Ug99) in Bread Wheat and Durum Wheat Identified Using Focused Identification of Germplasm Strategy (FIGS). Crop Science [online first]. doi: 10.2135/cropsci2011.08.0427; Published online 8 Dec 2011.
  • 67. Results of stem rust (Ug99) on wheat. 4563 wheat landraces were screened for Ug99; 10.2% of the accessions were resistant. The true trait scores were made available for 20% of the accessions (825 samples). The 500 accessions predicted as most likely to be resistant were selected from the 3738 accessions with true scores hidden; 25.8% of these samples were resistant, 2.3 times higher than expected by chance.
  • 69. Conclusion. Results: – Raw data vs transformed data – PLS vs PCA – Non-linear vs linear – FIGS vs random (selection). Issues: – Extent of variables (trait/agro-climate) – Phenology (adaptation) – Fuzzy approach (trait variation capture)

Editor's Notes

  1. Landrace samples (genebank seed accessions). Trait observations (experimental design): high-cost data. Climate data (for the landrace location of origin): low-cost data. The accession identifier (accession number) provides the bridge to the crop trait observations. The longitude and latitude coordinates for the original collecting site of the accessions (landraces) provide the bridge to the environmental data.
  2. GRIN database (USDA-ARS, National Plant Germplasm System, Germplasm Resources Information Network, online http://www.ars-grin.gov/npgs) USDA GRIN, trait data online: http://www.ars-grin.gov/cgi-bin/npgs/html/desc.pl?65049
  3. Photo: USDA ARS Image k1192-1, http://www.ars.usda.gov/is/graphics/photos/mar09/k11192-1.htm
  4. USDA ARS Image Archive, http://www.ars.usda.gov/is/graphics/photos/
  5. Photo: Wheat infected by stem rust (Ug99) at the Kenya Agricultural Research Station in Njoro northwest of Nairobi.
  6. Endresen, D.T.F., K. Street, M. Mackay, A. Bari, E. De Pauw, K. Nazari, and A. Yahyaoui (2012). Sources of Resistance to Stem Rust (Ug99) in Bread Wheat and Durum Wheat Identified Using Focused Identification of Germplasm Strategy (FIGS). Crop Science [online first]. doi: 10.2135/cropsci2011.08.0427; Published online 8 Dec 2011.