guenomu
Software and Model
Leonardo de O. Martins
University of Vigo
May 16th, 2013
Leo Martins (U Vigo) guenomu software 2013/5/16 1 / 15
Outline
1 The Model
2 The Sampling
3 The Code
Hierarchical Bayesian model
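$$
P(S, \Theta \mid D) \propto P(\theta_0)\,P(\lambda_0)\,P(\alpha_0)\,P(S) \times \prod_{i=1}^{N} P(D_i \mid G_i, \theta_i)\,P(\theta_i \mid \theta_0)\,P(G_i \mid \lambda_i, w_i, S)\,P(\lambda_i \mid \lambda_0)\,P(w_i \mid \alpha_i)\,P(\alpha_i \mid \alpha_0)
$$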
The mixture of distance distributions
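$$
P(G \mid \lambda, w, S) = \frac{w_1 e^{-\left(d_{DUPS}(G,S)/\lambda_{DUPS} + d_{LOSS}(G,S)/\lambda_{LOSS}\right)} + w_2 e^{-d_{ILS}(G,S)/\lambda_{ILS}} + w_3 e^{-d_{RF}(G,S)/\lambda_{RF}}}{Z(\lambda, w, S)}
$$
$w_i \sim \mathrm{Gamma}(\alpha_{gene}, 1)$
$\lambda_x \sim \mathrm{Exp}(\Lambda_x)$
each gene has its own set of $w_i$ and $\lambda_i$
the distances $d_x(G, S)$ are scaled to account for different gene family sizes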
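As a rough illustration of the mixture above, the following Python sketch evaluates only the numerator of P(G | λ, w, S), leaving the normalizing constant Z(λ, w, S) out. The function name, the dictionary keys, and all numeric values are hypothetical and are not taken from the guenomu code.

```python
import math

def unnormalized_mixture(d, lam, w):
    """Numerator of P(G | lambda, w, S); Z(lambda, w, S) is deliberately left out."""
    # Component 1: duplications and losses share one exponential term.
    dup_loss = math.exp(-(d["DUPS"] / lam["DUPS"] + d["LOSS"] / lam["LOSS"]))
    # Component 2: incomplete lineage sorting (ILS) distance.
    ils = math.exp(-d["ILS"] / lam["ILS"])
    # Component 3: Robinson-Foulds (RF) distance.
    rf = math.exp(-d["RF"] / lam["RF"])
    return w[0] * dup_loss + w[1] * ils + w[2] * rf

# Hypothetical scaled distances between one gene tree G and the species tree S.
d = {"DUPS": 0.2, "LOSS": 0.4, "ILS": 0.1, "RF": 0.3}
# Hypothetical per-gene penalties (lambda) and mixture weights (w).
lam = {"DUPS": 1.0, "LOSS": 1.0, "ILS": 0.5, "RF": 2.0}
w = (1.0, 2.0, 0.5)
value = unnormalized_mixture(d, lam, w)
```

When all distances are zero every exponential equals one, so the numerator reduces to the sum of the weights; any positive distance lowers it.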
Doubly-intractable distributions
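$$
\pi(y \mid \theta) = \frac{q_\theta(y)}{Z(\theta)} = \frac{e^{\theta^{t} s(y)}}{Z(\theta)}; \qquad Z(\theta) = \sum_{y} e^{\theta^{t} s(y)} \tag{1}
$$
augmented distribution: $\pi(\theta', y', \theta \mid y) \propto \pi(y \mid \theta)\,\pi(\theta)\,h(\theta' \mid \theta)\,\pi(y' \mid \theta')$
Gibbs update of the auxiliary variables $\theta'$, $y'$:
I. draw $\theta' \sim h(\cdot \mid \theta)$
II. draw $y' \sim \pi(\cdot \mid \theta')$
exchange ratio from $\theta$ to $\theta'$:
$$
\min\left(1, \frac{q_{\theta}(y')\,\pi(\theta')\,h(\theta \mid \theta')\,q_{\theta'}(y)}{q_{\theta}(y)\,\pi(\theta)\,h(\theta' \mid \theta)\,q_{\theta'}(y')}\right) \tag{2}
$$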
We draw y (the gene tree) through a secondary MCMC starting at its
current value
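The exchange update can be sketched in Python on a toy model, under several assumptions that are not from the slides: y takes values 0..10, s(y) = y, the prior on θ is standard normal, and the proposal h is a symmetric Gaussian random walk (so h cancels in the ratio). The auxiliary draw is exact here because the state space is tiny, whereas the slides use a secondary MCMC for that step. All function names are hypothetical.

```python
import math
import random

N_MAX = 10  # toy state space: y takes values 0..N_MAX

def log_q(theta, y):
    """Unnormalized log-likelihood: q_theta(y) = exp(theta * s(y)) with s(y) = y."""
    return theta * y

def log_prior(theta):
    """Standard normal prior on theta, up to an additive constant."""
    return -0.5 * theta * theta

def draw_auxiliary(theta, rng):
    """Draw an auxiliary y from pi(. | theta); exact only because the toy space is small."""
    weights = [math.exp(log_q(theta, y)) for y in range(N_MAX + 1)]
    return rng.choices(range(N_MAX + 1), weights=weights)[0]

def exchange_step(theta, y_obs, rng, step=0.5):
    theta_new = rng.gauss(theta, step)       # step I: propose theta' from h(. | theta)
    y_aux = draw_auxiliary(theta_new, rng)   # step II: auxiliary data under theta'
    # Exchange ratio on the log scale; Z(theta) and Z(theta') never appear.
    log_ratio = (log_q(theta, y_aux) + log_prior(theta_new) + log_q(theta_new, y_obs)
                 - log_q(theta, y_obs) - log_prior(theta) - log_q(theta_new, y_aux))
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return theta_new
    return theta

def run_chain(y_obs, n_iter=2000, seed=1):
    rng = random.Random(seed)
    theta, chain = 0.0, []
    for _ in range(n_iter):
        theta = exchange_step(theta, y_obs, rng)
        chain.append(theta)
    return chain

chain = run_chain(y_obs=8)
```

The point of the sketch is that only the unnormalized q terms enter the acceptance ratio, so the intractable normalizing constants are never needed for the accept/reject decision.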
Species tree proposal with the exchange algorithm
Generalized Multiple-Try Metropolis
MH: sample y, then accept it with probability min(1, r)
$$
r = \frac{\pi(y)}{\pi(x)}\,\frac{q(y, x)}{q(x, y)} = \frac{\pi(y)}{\pi(x)}\,\frac{p(x \mid y)}{p(y \mid x)}
$$
MTM: choose y among several samples, according to their relative weights
$$
r = \frac{w(y_1, x) + \cdots + w(y_k, x)}{w(x^*_1, y) + \cdots + w(x^*_k, y)}
$$
where $w(x, y) = \pi(x)\,q(x, y)\,\lambda(x, y) = \pi(x)\,p(y \mid x)\,\lambda(x, y)$
GMTM: weights w(.) do not need to represent probability distributions.
$$
r = \frac{\pi(y)\,p_k(x \mid y)}{\pi(x)\,p_k(y \mid x)}\,\frac{W_x}{W_y}
$$
where $W_y = \dfrac{w_i(y_i, x)}{\sum_{j=1}^{k} w_j(y_j, x)}$ for the chosen element i
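A minimal Python sketch of the plain MTM step (not GMTM) may help fix ideas. It assumes a one-dimensional standard normal target and a symmetric Gaussian proposal, and it picks λ(x, y) = 1/q(x, y), so that the weight w(y, x) reduces to π(y). These choices, and every name in the code, are illustrative assumptions and are not taken from guenomu.

```python
import math
import random

def log_target(x):
    """Unnormalized log-density of a standard normal (toy target)."""
    return -0.5 * x * x

def mtm_step(x, rng, k=5, step=1.0):
    # Draw k trial points from the symmetric proposal q(x, .).
    trials = [rng.gauss(x, step) for _ in range(k)]
    # With lambda(x, y) = 1 / q(x, y), the weight w(y_j, x) is simply pi(y_j).
    w_trials = [math.exp(log_target(y)) for y in trials]
    # Choose y among the trials according to their relative weights.
    y = rng.choices(trials, weights=w_trials)[0]
    # Reference set: k - 1 draws from q(y, .) plus the current state x.
    refs = [rng.gauss(y, step) for _ in range(k - 1)] + [x]
    w_refs = [math.exp(log_target(z)) for z in refs]
    r = sum(w_trials) / sum(w_refs)
    return y if rng.random() < min(1.0, r) else x

def run_mtm(n_iter=5000, seed=7):
    rng = random.Random(seed)
    x, chain = 0.0, []
    for _ in range(n_iter):
        x = mtm_step(x, rng)
        chain.append(x)
    return chain

samples = run_mtm()
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
```

The sample mean and variance should land near 0 and 1 for this target. Including the current state x in the reference set is what makes the ratio of summed weights a valid acceptance ratio.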
gene tree proposal with GMTM or MTM
RF distance, Assignment cost (Hdist)
A parallel pseudo-random number generator (PRNG)
Given a seed and an algorithm, we have a stream of PRNs.
[Diagram with labels: seed, PRNG1, PRNG2 (four instances), x1, x2, x3, x4, x11, x12]
Using a second algorithm, the first stream gives us a sequence of seeds. We use the 150 parameter sets for the Tausworthe (LFSR) generators (L'Ecuyer, Math. Comput. 1999, p. 261).
Therefore, given the seed, we can predict all states of all streams.
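The seeding scheme can be sketched as follows. This is only an illustration: it uses Python's built-in `random.Random` (a Mersenne Twister) for both roles rather than the Tausworthe parameter sets mentioned above, and the function name `make_streams` is hypothetical.

```python
import random

def make_streams(seed, n_streams):
    """Derive one generator per stream from a single master seed."""
    master = random.Random(seed)  # plays the role of PRNG1
    # The first stream supplies the seeds for all other streams.
    stream_seeds = [master.getrandbits(32) for _ in range(n_streams)]
    # One independent generator per derived seed (the role of PRNG2).
    return [random.Random(s) for s in stream_seeds]

# Two independent constructions from the same seed yield identical streams.
first = [g.random() for g in make_streams(2013, 4)]
second = [g.random() for g in make_streams(2013, 4)]
```

Because every derived seed is a deterministic function of the master seed, knowing that one number is enough to reproduce the state of every stream.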
In our gene/species model:
we split gene families among jobs
all jobs receive the seed (broadcast) and can therefore reproduce the same x1. That's cheaper than communicating the states.
each job uses its own x(i+1) for sampling new gene trees etc. and can work in parallel. All jobs use the common x1 for sampling e.g. a new species tree, which needs synchronization.
the only thing that must be shared is thus the proposal values (AllReduce) when updating "global" parameters, so that all jobs can make the same acceptance/rejection decision.
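The synchronization idea can be simulated without MPI. The sketch below assumes that the quantity shared across jobs is each job's contribution to the log acceptance ratio, and it replaces the AllReduce with a plain sum; both are assumptions made for illustration, as are the function name and the numeric values.

```python
import math
import random

def simulate_global_update(seed, local_log_ratios):
    """Return the accept/reject decision reached by each simulated job."""
    # Stand-in for an MPI AllReduce (sum): every job ends up with the same total.
    total = sum(local_log_ratios)
    decisions = []
    for _job in range(len(local_log_ratios)):
        # Each job rebuilds the common stream from the broadcast seed,
        # so every job draws the same uniform number.
        common = random.Random(seed)
        u = common.random()
        decisions.append(u < math.exp(min(0.0, total)))
    return decisions

# Hypothetical per-job contributions for three jobs.
decisions = simulate_global_update(seed=42, local_log_ratios=[-0.3, 0.1, -0.2])
```

Since both the uniform draw and the summed ratio are identical on every job, all jobs reach the same decision without exchanging any generator state.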
Each job looks like an independent analysis
https://bitbucket.org/leomrtns/guenomu