Population Structure Analysis
                                                using STRUCTURE software



                                                       Chang Bum Hong



                        kt Bioinformatics TF, hongiiv@gmail.com, twitter @hongiiv, hongiiv.tistory.com

            Permissions: you are free to blog or live-blog about this presentation as long as you attribute the work to its authors
Friday, August 12, 11
Genetic test



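      In general, when alcohol is consumed it is converted to acetaldehyde (a toxic substance that
      causes facial flushing, a racing heartbeat, and vomiting), which ALDH then breaks down into
      acetic acid, which is harmless to the body. The ALDH2 gene is involved in breaking down
      acetaldehyde as soon as even a small amount is produced, and three types are observed
      depending on the genotype.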




23andMe




Northwestern Europe

Southeastern Europe



HGDP(Human Genome Diversity Project)



PASNP(Pan-Asian SNP Consortium)



East Asia - Public genotype data
    Dataset             SNPs         Individuals   Populations
    PASNP (a)           54,794       1,928         75
    HGDP                2,834~       1,056         52
    HapMap              1,481,135    1,397         11
    SGVP (b)            268,667      292           3
    Korean              58,625       159           10
    China(Yanbian)      58,625       16            1
    Japan(Kobe)         58,625       5             1
    Korea-Japan         58,625       6             1
    Vietnam             58,625       16            1
    Korean-Vietnam      58,625       8             1
    Cambodia            58,625       16            1
    Mongol              58,625       16            1
    a. Pan-Asian SNP Consortium (http://www4a.biotec.or.th/PASNP)
    b. Singapore Genome Variation Project (http://www.nus-cme.org.sg/SGVP)




Korean Data
    [Figure: map of the Korean sampling regions with sample sizes: YeonCheon 16, PyeongChang 16, JeCheon 16, Cheonan 16, GimJe 16, Goryeong 15, GyeongJu 16, UlSan 16, NaJu 16, Jeju 16; area labels MW, SW, SE]
    Samples: average >70 years old, long settlement, Affymetrix 50K Xba, 58,960 SNPs
    Other populations: China(Yanbian), Japan(Kobe), Korea-Japan, Vietnam, Korean-Vietnam, Cambodia, Mongol




Missing genotype individuals
    [Figure: two panels, "All Asian" and "Korean", each before QC with 58,960 SNPs; labeled points: GimJe, GoRyeong, GyeongJu]

Relatedness between the 153 Korean (10 regions) individuals
    [Figure: PCA plot with clusters labeled YeonCheon, PyeongChang, JeCheon, CheonAn, GyeongJu, UlSan, GimJe, GoRyeong, NaJu, JeJu]
    PCA analysis using 46,559 autosomal SNP markers (n=153, Korean)

PCA analysis of East Asian descent
    [Figure: PCA plot with groups labeled Mongol, Yanbian, Kobe, JPT-HapMap, Jeju, CHB-HapMap, Vietnam, Cambodia, Korea-Vietnam, Korea-Japan]
    Illustration of geographic correspondence of ethnic group locations

Relationship between eigenvector values and latitude
    [Figure: scatter plot with a fitted line; labeled latitude values 47.81, 39.98, 37.53, 14.72]
    R² = 0.8621
    y = 36.65 + 166.33x
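The fitted line on this slide can be evaluated directly. The sketch below assumes that y is latitude in degrees and x is the eigenvector value (the slide does not label the axes in the extracted text); the function names are illustrative only.

```python
# Coefficients as printed on the slide: y = 36.65 + 166.33x
INTERCEPT = 36.65
SLOPE = 166.33


def predicted_latitude(eigenvector_value):
    """Latitude predicted by the fitted line (assumes y is latitude)."""
    return INTERCEPT + SLOPE * eigenvector_value


def eigenvector_for_latitude(latitude):
    """Invert the fitted line to get the eigenvector value for a latitude."""
    return (latitude - INTERCEPT) / SLOPE
```

An eigenvector value of 0 maps to the intercept, 36.65, and each 0.01 step in the eigenvector value shifts the predicted latitude by about 1.66 degrees.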




STRUCTURE software
         •    A model-based clustering method (Pritchard et al. 2000)

               •        Free software
                        (http://pritch.bsd.uchicago.edu/structure.html)

               •        Bayesian approach (MCMC: Markov Chain Monte Carlo)

               •        Detects the underlying genetic populations among a set of individuals genotyped at multiple
                        markers

               •        Computes the proportion of the genome of an individual originating from each inferred
                        population (quantitative clustering method)
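To make "proportion of the genome originating from each inferred population" concrete, here is a minimal sketch of the likelihood term used in an admixture-style model: each allele copy is drawn from population k with probability q_k, then from that population's allele frequencies. It is a simplified illustration (unlinked loci, fixed q and frequencies), not STRUCTURE's own code, and all names are hypothetical.

```python
import math


def allele_probability(q, locus_freqs, allele):
    """P(allele) = sum over populations k of q[k] * freq_k(allele).

    q: list of K ancestry proportions for one individual (sums to 1).
    locus_freqs: list of K dicts mapping allele code -> frequency.
    """
    return sum(q_k * freqs_k[allele] for q_k, freqs_k in zip(q, locus_freqs))


def genotype_log_likelihood(q, freqs, genotype):
    """Log-likelihood of one individual's genotype given q and allele frequencies.

    freqs: one entry per locus, each a list of K allele-frequency dicts.
    genotype: one tuple of allele codes per locus, e.g. (1, 2) for a diploid.
    """
    total = 0.0
    for locus_freqs, alleles in zip(freqs, genotype):
        for allele in alleles:
            total += math.log(allele_probability(q, locus_freqs, allele))
    return total
```

In the full method, q and the allele frequencies are not fixed inputs; they are sampled by MCMC, which is what the burn-in and repetition settings later in the deck control.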




Input data
               • A matrix where the data for individuals are in rows and the loci
                 are in columns
                 • n consecutive rows have the data for each individual of an n-ploid
                   species
                 • Integers should be used for coding genotypes
                 • Missing data should be indicated by a number (e.g. -1) which
                   doesn't occur elsewhere in the data

                 • The data file should be a text file (.txt), not an Excel file (.xls), for
                   running STRUCTURE
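The layout rules above can be sketched as a small formatter. This is a hypothetical helper, not part of STRUCTURE: it emits n consecutive text rows per individual for an n-ploid species, uses integer allele codes, and writes -1 (the example missing-data value above) for missing genotypes.

```python
MISSING = -1  # example missing-data code; must not occur as a real allele code


def individual_rows(label, genotypes, ploidy=2):
    """Return `ploidy` consecutive text rows for one individual.

    genotypes: one entry per locus, either a tuple of `ploidy` integer
    allele codes or None when the genotype is missing.
    """
    rows = [[label] for _ in range(ploidy)]
    for genotype in genotypes:
        alleles = [MISSING] * ploidy if genotype is None else list(genotype)
        if len(alleles) != ploidy:
            raise ValueError("each genotype needs exactly one allele per row")
        for row, allele in zip(rows, alleles):
            row.append(str(allele))
    return [" ".join(row) for row in rows]
```

Joining the rows of all individuals with newlines and saving them as a plain .txt file gives a matrix with individuals in rows and loci in columns.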




Input format
    1 consecutive row for alleles
        MarkerName...
        Label PopID Flag Location Genotype...

    genotype (1,2,5)
        AA = 11
        AB = 12
        BB = 22
        missing = 55

      Information of user-defined populations
      Label: a unique ID for each individual; it may be numeric or alphanumeric (e.g., CEPH1334.10)
      PopID: a unique number for the population the individual belongs to, assigned by the user (e.g., 5 for Chinese (CHB), 1 for European (CEU))
      Flag: whether STRUCTURE should use this PopID information when it runs (1 = use, 0 = do not use)
      Location: location information for the individual, assigned by the user (e.g., 1 for East Asia (EAS), 2 for Europe (EURA))
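The genotype coding shown on this slide (alleles 1 and 2, with 5 for missing) can be sketched as follows. The helper names are hypothetical and the example values are made up for illustration; the column order follows the header line "Label PopID Flag Location Genotype...".

```python
ALLELE_CODES = {"A": "1", "B": "2"}  # allele coding from the slide
MISSING_GENOTYPE = "55"              # missing genotype code from the slide


def encode_genotype(call):
    """Map 'AA'/'AB'/'BB' to '11'/'12'/'22'; None means missing."""
    if call is None:
        return MISSING_GENOTYPE
    return "".join(sorted(ALLELE_CODES[allele] for allele in call))


def format_individual(label, pop_id, flag, location, calls):
    """Build one input row: Label PopID Flag Location Genotype..."""
    fields = [label, str(pop_id), str(flag), str(location)]
    fields.extend(encode_genotype(call) for call in calls)
    return " ".join(fields)
```

Sorting the two allele codes means a call written as "BA" is encoded the same way as "AB".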




Running STRUCTURE from a graphical
             interface, Front End




Importing input data into a project




Configuring a parameter set








        Length of Burnin Period: how long to run the simulation before collecting data, to minimize the
        effect of the starting configuration (the number of iterations until convergence to the target function)
        Number of MCMC Reps after Burnin: how long to run the simulation after burn-in to get
        accurate parameter estimates
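The effect of these two settings can be seen on a toy chain. The sketch below is a generic Metropolis sampler for a one-dimensional normal target, not STRUCTURE's sampler; the target, the starting value, and the step size are all assumptions chosen so that the starting configuration is far from the region of high probability.

```python
import math
import random


def log_target(x, mean=3.0):
    """Log density (up to a constant) of a normal target with unit variance."""
    return -0.5 * (x - mean) ** 2


def metropolis_mean(start, burnin, numreps, seed=0):
    """Run burnin + numreps Metropolis steps and average only the post-burn-in samples."""
    rng = random.Random(seed)
    x = start
    kept = []
    for step in range(burnin + numreps):
        proposal = x + rng.uniform(-1.0, 1.0)
        log_ratio = log_target(proposal) - log_target(x)
        # Accept uphill moves always, downhill moves with probability exp(log_ratio).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        if step >= burnin:
            kept.append(x)
    return sum(kept) / len(kept)


with_burnin = metropolis_mean(start=50.0, burnin=500, numreps=5000)
without_burnin = metropolis_mean(start=50.0, burnin=0, numreps=5000)
```

With no burn-in, the samples collected while the chain is still travelling from 50 toward 3 pull the estimate upward; discarding them leaves an estimate close to the target mean.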




Running STRUCTURE: a single run




Running STRUCTURE: a batch run




Ln P(D): estimated log probability of the data for each K




Analysis of genome-wide SNP data
               • For very large data sets, the runtime of structure using default
                 settings may become impractically slow

                  • use reduced data sets (e.g., pruned)
                  • get accurate results using much shorter runs than default
                    (e.g., small values of NUMREPS)

                  • download the source code and compile it on your machine
                    (using a 64-bit machine)

                  • use the command-line version of structure
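For the command-line route, a batch of runs over several values of K can be prepared as a list of argument vectors. This sketch only builds the commands and does not execute anything. The option letters (-m, -e, -K, -i, -o) follow the command-line options described in the STRUCTURE documentation as best recalled here, so treat them as an assumption and confirm them against your installed version; the file names and output prefix are hypothetical.

```python
def structure_commands(infile, k_values, replicates, out_prefix="run"):
    """Build one argument list per (K, replicate) pair without running anything."""
    commands = []
    for k in k_values:
        for rep in range(1, replicates + 1):
            commands.append([
                "structure",
                "-m", "mainparams",    # parameter file (BURNIN, NUMREPS, ...)
                "-e", "extraparams",   # additional parameter file
                "-K", str(k),          # number of populations for this run
                "-i", infile,          # input genotype file
                "-o", f"{out_prefix}_K{k}_rep{rep}",  # output file prefix
            ])
    return commands
```

Each argument list could then be passed to a job scheduler or to `subprocess.run`, which keeps several replicate runs per K available for the model-selection step.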



An example of MCMC convergence




Inference of true K
                                (number of populations)

               • The log likelihood for each K, Ln P(D) = L(K)
               • Two approaches to determine the best K
               • Use of L(K): when K is approaching a true value, L(K) plateaus
                 and has high variance between runs
               • Use of an ad hoc quantity (∆K): calculated based on the
                 second-order rate of change of the likelihood. The ∆K
                 shows a clear peak at the true value of K
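The ∆K statistic can be sketched from replicate runs as follows. This follows the definition usually attributed to Evanno et al. (2005): the mean absolute second-order difference of L(K) divided by the standard deviation of L(K) across runs. Pairing replicates by position is an assumption of this sketch; some tools instead take the absolute value of the difference of the means.

```python
import statistics


def delta_k(log_likelihoods):
    """Compute delta K for every K that has both neighbours K-1 and K+1.

    log_likelihoods: dict mapping K -> list of Ln P(D) values, one per
    replicate run, with the same number of replicates for every K.
    """
    result = {}
    for k in sorted(log_likelihoods):
        if k - 1 not in log_likelihoods or k + 1 not in log_likelihoods:
            continue  # the second difference needs both neighbours
        previous = log_likelihoods[k - 1]
        current = log_likelihoods[k]
        following = log_likelihoods[k + 1]
        second_diffs = [
            abs(n - 2 * c + p) for p, c, n in zip(previous, current, following)
        ]
        spread = statistics.stdev(current)
        if spread == 0:
            continue  # delta K is undefined when all runs agree exactly
        result[k] = statistics.mean(second_diffs) / spread
    return result
```

The K with the largest value, for example `max(result, key=result.get)`, is the candidate peak; ∆K cannot be computed for the smallest or largest K tested.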




Simulation Result


                                   Q-matrix
                                   the proportion of each individual's genome that belongs to each subpopulation
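A Q-matrix has one row per individual and one column per inferred subpopulation, and each row sums to 1. As a small sketch of how such a matrix is often summarized, the hypothetical helper below assigns an individual to a cluster only when its largest membership proportion reaches a threshold; the 0.8 cutoff is an arbitrary assumption, not a value from the slides.

```python
def assign_clusters(q_matrix, threshold=0.8):
    """Return the dominant cluster index per individual, or None if admixed.

    q_matrix: list of rows, each a list of K membership proportions.
    """
    assignments = []
    for row in q_matrix:
        if abs(sum(row) - 1.0) > 1e-6:
            raise ValueError("each Q-matrix row must sum to 1")
        best = max(range(len(row)), key=row.__getitem__)
        assignments.append(best if row[best] >= threshold else None)
    return assignments
```

Individuals returned as None are the ones whose ancestry is split across clusters, which is the information a bar plot of the Q-matrix conveys visually.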




Enjoy running STRUCTURE




We may not always be able to know the TRUE value
   of K, but we should aim for the smallest value of K
     that captures the major structure in the data

                              Pritchard et al. (2000)




Workshop 2011
