
joaks-evolution-2014


  1. An Improved Approximate-Bayesian Method for Estimating Shared Evolutionary History. Jamie R. Oaks (1, 2); (1) Department of Ecology and Evolutionary Biology, University of Kansas; (2) Department of Biology, University of Washington. June 21, 2014. (Running footer: Estimating shared history, J. Oaks, University of Washington.)
  4. Processes of diversification: Large-scale geological and climatic processes are important in biodiversification and community assembly. Accounting for such processes will better our understanding of biodiversity. We need methods for inferring evolutionary patterns predicted by historical events from contemporary populations.
  8. Community scale processes [Figure; axis: Time (kya), ticks 0, 100, 200, 300, 400, 500]
  9. Divergence model choice: $T = (T_1, T_2, T_3)$, model = 111, $\tau = \{\tau_1\}$. [Figure: divergence times $T_1$, $T_2$, $T_3$ and divergence event $\tau_1$ on a time axis in kya, ticks 0 to 500]
  10. Divergence model choice: $T = (260, 260, 260)$, model = 111, $\tau = \{260\}$.
  11. Divergence model choice: $T = (397, 260, 260)$, model = 211, $\tau = \{260, 397\}$.
  12. Divergence model choice: $T = (260, 397, 260)$, model = 121, $\tau = \{260, 397\}$.
  13. Divergence model choice: $T = (260, 260, 397)$, model = 112, $\tau = \{260, 397\}$.
  14. Divergence model choice: $T = (260, 95, 397)$, model = 123, $\tau = \{260, 95, 397\}$.
  17. Divergence model choice: $T = (T_1, \ldots, T_Y)$, model = $m_i$, $\tau = \{\tau_1, \ldots, \tau_{|\tau|}\}$. We want to infer $m$ and $T$ given DNA sequence alignments $X$.
  18. Divergence model choice, notation: $X$ = sequence alignments; $T$ = divergence times; $m$ = divergence model; $G$ = gene trees; $\phi$ = substitution parameters; $\Theta$ = demographic parameters.
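The mapping from a vector of divergence times to a divergence model, as in the numeric examples on the slides above, can be sketched in a few lines of Python. The function name `divergence_model` and the first-appearance numbering of events are assumptions made here for illustration. The slides number events in their own order (for instance 211 where this sketch would print 122), and only the grouping of taxon pairs defines the model.

```python
def divergence_model(times):
    """Return (label, tau) for a sequence of divergence times.

    label: one integer per taxon pair, numbered by first appearance
           (an assumed convention; any relabeling describes the same model).
    tau:   the distinct divergence-event times, in first-appearance order.
    """
    tau = []
    label = []
    for t in times:
        if t not in tau:
            tau.append(t)
        label.append(tau.index(t) + 1)
    return label, tau


if __name__ == "__main__":
    for T in [(260, 260, 260), (260, 397, 260), (260, 95, 397)]:
        label, tau = divergence_model(T)
        print(T, "->", "".join(map(str, label)), tau)
```

The number of distinct events, `len(tau)`, is the quantity $|\tau|$ that later slides plot on the x-axis.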
  22. Bayesian model choice. Full model:
      $$p(T, G, \phi, \Theta \mid X, m_i) = \frac{p(X \mid T, G, \phi, \Theta, m_i)\, p(T, G, \phi, \Theta \mid m_i)}{p(X \mid m_i)}$$
      $$p(X \mid m_i) = \int_{\theta_i} p(X \mid \theta_i, m_i)\, p(\theta_i \mid m_i)\, d\theta_i$$
      $$p(m_i \mid X) = \frac{p(X \mid m_i)\, p(m_i)}{\sum_i p(X \mid m_i)\, p(m_i)}$$
      msBayes: Approximate Bayesian computation (ABC). W. Huang et al. (2011). BMC Bioinformatics 12: 1. J. R. Oaks et al. (2013). Evolution 67: 991–1010.
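The last formula on the Bayesian model choice slide, the posterior probability of each model, is just a normalization of marginal likelihood times prior. A minimal Python sketch follows. The function name and the example numbers are hypothetical, priors are assumed strictly positive, and the log-space arithmetic is a numerical convenience rather than anything stated in the talk. msBayes itself approximates these quantities with ABC instead of computing marginal likelihoods directly.

```python
import math


def posterior_model_probs(log_marginal_likelihoods, priors):
    """Normalize p(X | m_i) p(m_i) over models, working in log space."""
    log_joint = [
        ll + math.log(p) for ll, p in zip(log_marginal_likelihoods, priors)
    ]
    # Log-sum-exp trick: subtract the maximum before exponentiating
    # so that very negative log likelihoods do not underflow to zero.
    shift = max(log_joint)
    weights = [math.exp(lj - shift) for lj in log_joint]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    # Hypothetical log marginal likelihoods for three models, equal priors.
    print(posterior_model_probs([-1050.0, -1052.0, -1060.0], [1 / 3] * 3))
```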
  24. The msBayes model: msBayes will often infer clustered divergences when divergences are random over millions of generations. Objective: Use principles of probability to extend the msBayes framework for improved estimation of shared evolutionary history. J. R. Oaks et al. (2013). Evolution 67: 991–1010. J. R. Oaks et al. (2014). arXiv:1402.6397 [q-bio.PE].
  25. An improved method. Potential improvements: (1) alternative priors on parameters that increase marginal likelihoods of rich models; (2) an alternative approach to modeling the temporal distribution of divergences. J. R. Oaks et al. (2013). Evolution 67: 991–1010. J. R. Oaks et al. (2014). arXiv:1402.6397 [q-bio.PE].
  29. $p(X) = \int_{\theta} p(X \mid \theta)\, p(\theta)\, d\theta$ [Figure: likelihood $p(X \mid \theta)$ and prior $p(\theta)$ plotted against $\theta$ from 0 to 1; y-axis: Density]
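The marginal likelihood on this slide is a prior-weighted average of the likelihood, so a prior that spreads its mass over values of $\theta$ with low likelihood drags $p(X)$ down. The Python sketch below illustrates this with a toy binomial likelihood and a beta prior, chosen here only because the integral has a closed form. It is not the msBayes model, and the data (10 successes in 100 trials) and prior settings are made up.

```python
import math


def log_beta(a, b):
    """Log of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def marginal_likelihood(k, n, a, b):
    """p(X) for k successes in n trials with a Beta(a, b) prior on theta.

    Closed form of the integral of Binomial(k | n, theta) * Beta(theta | a, b)
    over theta in [0, 1] (the beta-binomial probability).
    """
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


if __name__ == "__main__":
    k, n = 10, 100  # hypothetical data
    flat = marginal_likelihood(k, n, 1.0, 1.0)      # uniform prior on [0, 1]
    focused = marginal_likelihood(k, n, 2.0, 18.0)  # prior mean 0.1
    print("uniform prior:", flat)   # equals 1 / (n + 1)
    print("focused prior:", focused)
```

Under the uniform prior every outcome from 0 to n is equally probable a priori, so $p(X) = 1/(n+1)$. The prior concentrated near the observed frequency gives a larger marginal likelihood, which is the logic behind replacing broad uniform priors in parameter-rich models.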
  32. Prior on divergence models: msBayes uses a discrete uniform prior on the number of divergence events. [Figure, panel A: number of divergence models vs. number of divergence events, $|\tau|$; panel B: prior probability of each model, $p(M_{|\tau|,i})$, vs. $|\tau|$] Potential solution: Place a flexible prior directly on the sample space of divergence models.
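The two panels on this slide can be reproduced approximately with a short Python sketch. It assumes that divergence models are counted as unordered groupings of the taxon pairs, that is, integer partitions of the number of pairs into $|\tau|$ parts; that reading is an assumption here, and the function names are hypothetical. A uniform prior on $|\tau|$ then splits each $1/Y$ share evenly among the models that have $|\tau|$ events.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def partitions_into_parts(n, k):
    """Number of ways to write n as a sum of exactly k positive integers."""
    if k == 0:
        return 1 if n == 0 else 0
    if n < k:
        return 0
    # Either some part equals 1 (remove it), or every part exceeds 1
    # (subtract 1 from each of the k parts).
    return partitions_into_parts(n - 1, k - 1) + partitions_into_parts(n - k, k)


def prior_per_model(n_pairs):
    """Prior mass of a single model with k events, uniform prior on k."""
    return {
        k: 1.0 / (n_pairs * partitions_into_parts(n_pairs, k))
        for k in range(1, n_pairs + 1)
    }


if __name__ == "__main__":
    n_pairs = 22
    prior = prior_per_model(n_pairs)
    for k in range(1, n_pairs + 1):
        print(k, partitions_into_parts(n_pairs, k), round(prior[k], 5))
```

Because there is exactly one model with a single event and one with all events distinct, those two models receive the most prior mass, while models with intermediate numbers of events share their $1/Y$ among many alternatives. This is the U-shaped prior referred to later in the talk.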
  33. New method: dpp-msbayes. Replaced uniform priors on continuous parameters with gamma and beta distributions. Dirichlet process prior (DPP) over all possible divergence models.
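One common way to draw from a Dirichlet process prior over groupings is the Chinese restaurant process, sketched below in Python. This is a generic illustration of the prior, not the dpp-msbayes implementation; the function name, the concentration parameter name `alpha`, and the value used in the example are all assumptions.

```python
import random


def sample_divergence_model(n_pairs, alpha, rng=random):
    """Draw one divergence model from a Chinese restaurant process.

    Returns one event label per taxon pair, numbered by first appearance.
    alpha is the concentration parameter: larger values favor more events.
    """
    labels = []
    counts = []  # counts[j] = number of pairs already assigned to event j + 1
    for i in range(n_pairs):
        # Pair i joins existing event j with probability counts[j] / (i + alpha)
        # and starts a new event with probability alpha / (i + alpha).
        u = rng.random() * (i + alpha)
        cumulative = 0.0
        choice = len(counts)  # default: open a new event
        for j, c in enumerate(counts):
            cumulative += c
            if u < cumulative:
                choice = j
                break
        if choice == len(counts):
            counts.append(0)
        counts[choice] += 1
        labels.append(choice + 1)
    return labels


if __name__ == "__main__":
    rng = random.Random(2014)
    for _ in range(3):
        model = sample_divergence_model(22, alpha=1.5, rng=rng)
        print(model, "events:", max(model))
```

Unlike a uniform prior on the number of events, this prior assigns probability to each grouping of taxon pairs directly, and a single parameter tunes how strongly it favors few or many divergence events.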
  34. dpp-msbayes: Simulation-based assessment. Simulate 50,000 datasets under three models. $M_{msBayes}$: U-shaped prior on divergence models, uniform priors on continuous parameters. $M_{Ushaped}$: U-shaped prior on divergence models, gamma priors on continuous parameters. $M_{DPP}$: DPP prior on divergence models, gamma priors on continuous parameters. Analyze all datasets under each of the models.
  36. dpp-msbayes: Simulation results [Figure: grid of panels indexed by data model and analysis model (panel labels: $M_{msBayes}$, $M_{DPP}$, $M_{Uniform}$, $M_{Ushaped}$); x-axis: posterior probability of one divergence; y-axis: true probability of one divergence; both from 0 to 1] J. R. Oaks (2014). arXiv:1402.6303 [q-bio.PE].
  37. dpp-msbayes: Simulation-based power analyses. Simulate datasets in which all 22 divergence times are random: $\tau \sim U(0, 0.5\ \text{MGA})$, $\tau \sim U(0, 1.5\ \text{MGA})$, $\tau \sim U(0, 2.5\ \text{MGA})$, $\tau \sim U(0, 5.0\ \text{MGA})$ (MGA = millions of generations ago). Simulate 1000 datasets for each $\tau$ distribution. Analyze all 4000 datasets under models $M_{msBayes}$, $M_{Ushaped}$, and $M_{DPP}$.
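The simulation design on this slide can be laid out as a short Python sketch. It covers only the first step, drawing the 22 independent divergence times for each replicate; simulating sequence data from those times is not shown, and the function names and seed are hypothetical.

```python
import random

MAX_TIMES_MGA = (0.5, 1.5, 2.5, 5.0)  # upper bounds of the uniform distributions
N_PAIRS = 22
N_REPLICATES = 1000


def draw_divergence_times(max_time, n_pairs=N_PAIRS, rng=random):
    """Draw one independent divergence time per taxon pair from U(0, max_time)."""
    return [rng.uniform(0.0, max_time) for _ in range(n_pairs)]


def simulate_design(seed=0):
    """Return {max_time: list of N_REPLICATES divergence-time vectors}."""
    rng = random.Random(seed)
    return {
        max_time: [
            draw_divergence_times(max_time, rng=rng) for _ in range(N_REPLICATES)
        ]
        for max_time in MAX_TIMES_MGA
    }


if __name__ == "__main__":
    design = simulate_design()
    total = sum(len(replicates) for replicates in design.values())
    print("distributions:", len(design), "datasets:", total)
```

Because every pair gets its own continuous draw, the true model in each replicate has 22 separate divergence events, so any estimate of one or a few shared events is an error that the power analysis is designed to expose.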
  39. dpp-msbayes: Power results [Figure: rows for analysis models $M_{msBayes}$ and $M_{DPP}$; columns for $\tau \sim U(0, 0.5\ \text{MGA})$, $U(0, 1.5\ \text{MGA})$, $U(0, 2.5\ \text{MGA})$, $U(0, 5.0\ \text{MGA})$; x-axis: estimated number of divergence events (mode); y-axis: Density] J. R. Oaks (2014). arXiv:1402.6303 [q-bio.PE].
  41. dpp-msbayes: Power results [Figure: rows for analysis models $M_{msBayes}$, $M_{Ushaped}$, and $M_{DPP}$; columns for $\tau \sim U(0, 0.5\ \text{MGA})$, $U(0, 1.5\ \text{MGA})$, $U(0, 2.5\ \text{MGA})$, $U(0, 5.0\ \text{MGA})$; x-axis: posterior probability of one divergence; y-axis: Density] J. R. Oaks (2014). arXiv:1402.6303 [q-bio.PE].
  42. Empirical application: Did fragmentation of the Philippine Islands during interglacial rises in sea level promote diversification?
  43. Empirical results: Philippine diversification [Figure: panels for msBayes and dpp-msbayes; x-axis: number of divergence events; y-axis: posterior probability] J. R. Oaks (2014). arXiv:1402.6303 [q-bio.PE].
  45. Conclusions: New method for estimating shared evolutionary history shows improved (1) estimation of posterior uncertainty; (2) model-choice accuracy; (3) power to detect temporal variation across divergences; (4) robustness to model violations. Caveats: Estimating a very rich model (600+ parameters for 22 taxa) using limited information from the data. Likely sensitive to prior assumptions. Be skeptical of strongly supported results.
  46. Recommendations: For Bayesian model choice, choose priors carefully. ABC model choice estimates should be accompanied by (1) simulation-based power analyses; (2) assessment of prior sensitivity.
  47. Future directions: Full-likelihood Bayesian approach [1]. Full-phylogenetic framework. [1] J. Sukumaran (2012). PhD thesis. Lawrence, Kansas, USA: University of Kansas.
  48. Everything is on GitHub... Software: dpp-msbayes: https://github.com/joaks1/dpp-msbayes; PyMsBayes: https://github.com/joaks1/PyMsBayes; ABACUS (Approximate BAyesian C UtilitieS): https://github.com/joaks1/abacus. Open-Science Notebook: msbayes-experiments: https://github.com/joaks1/msbayes-experiments
  49. Acknowledgments. Ideas and feedback: Holder Lab; KU Herpetology; Melissa Callahan. Computation: KU ITTC; KU Computing Center; iPlant. Funding: NSF; KU Grad Studies, EEB & BI; SSB; Sigma Xi. Photo credits: Rafe Brown, Cam Siler, & Jake Esselstyn; FMNH Philippine Mammal Website: D.S. Balete, M.R.M. Duya, & J. Holden; PhyloPic!
  50. Questions? joaks1@gmail.com
  51. Causes of bias: Insufficient sampling. Models with more parameter space are less densely sampled. Could explain bias toward small models in extreme cases. Predicts large variance in posterior estimates. We explored empirical and simulation-based analyses with 2, 5, and 10 million prior samples, and estimates were very similar. [Figure: 95% HPD of $D_T$ vs. number of prior samples (up to 1e8); panel A: unadjusted; panel B: GLM-adjusted]
  52. dpp-msbayes: Simulation results [Figure: rows for analysis models $M_{msBayes}$, $M_{Ushaped}$, and $M_{DPP}$; columns for $\tau \sim U(0, 0.5\ \text{MGA})$, $U(0, 1.5\ \text{MGA})$, $U(0, 2.5\ \text{MGA})$, $U(0, 5.0\ \text{MGA})$; x-axis: estimated number of divergence events (mode); y-axis: Density]
  53. dpp-msbayes: Simulation results [Figure: rows for analysis models $M_{msBayes}$, $M_{Ushaped}$, and $M_{DPP}$; columns for $\tau \sim U(0, 0.5\ \text{MGA})$, $U(0, 1.5\ \text{MGA})$, $U(0, 2.5\ \text{MGA})$, $U(0, 5.0\ \text{MGA})$; x-axis: estimated variance in divergence times (median), $\hat{D}_T$; y-axis: Density]. Panel annotations $p(\hat{D}_T < 0.01)$, in column order: $M_{msBayes}$: 1.0, 0.999, 0.996, 0.637; $M_{Ushaped}$: 0.914, 0.626, 0.235, 0.004; $M_{DPP}$: 0.002, 0.0, 0.0, 0.0.
  54. Empirical results: Philippine diversification [Figure: prior and posterior probability vs. number of divergence events, for msBayes and dpp-msbayes] J. R. Oaks (2014). arXiv:1402.6303 [q-bio.PE].
