guenomu software -- model and algorithm in 2013

guenomu
Software and Model
Leonardo de O. Martins
University of Vigo
May 16th, 2013

Outline
1 The Model
2 The Sampling
3 The Code

The mixture of distance distributions
\[
P(G \mid \lambda, w, S) = \frac{w_1 e^{-\left(d_{DUPS}(G,S)/\lambda_{DUPS} + d_{LOSS}(G,S)/\lambda_{LOSS}\right)} + w_2 e^{-d_{ILS}(G,S)/\lambda_{ILS}} + w_3 e^{-d_{RF}(G,S)/\lambda_{RF}}}{Z(\lambda, w, S)}
\]
$w_i \sim \mathrm{Gamma}(\alpha_{gene}, 1)$
$\lambda_x \sim \mathrm{Exp}(\Lambda_x)$
each gene has its own set of $w_i$ and $\lambda_x$
the distances $d_x(G, S)$ are scaled to account for different gene family sizes
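
As a minimal sketch of the numerator of the mixture above, the following Python function evaluates the unnormalized weight of one gene tree from precomputed distances. The function name, the dictionary keys and the example numbers are illustrative assumptions, not taken from guenomu. The normalizing constant Z(λ, w, S) is deliberately left out.

```python
import math

def unnormalized_mixture(d, lam, w):
    """Numerator of P(G | lambda, w, S) for one gene family.

    d   -- dict of (already scaled) distances between gene tree G and species tree S
    lam -- dict of scale parameters lambda_x, with the same keys as d
    w   -- sequence of the three mixture weights (w1, w2, w3)
    """
    # Duplications and losses share one mixture component.
    duploss = math.exp(-(d["DUPS"] / lam["DUPS"] + d["LOSS"] / lam["LOSS"]))
    ils = math.exp(-(d["ILS"] / lam["ILS"]))
    rf = math.exp(-(d["RF"] / lam["RF"]))
    return w[0] * duploss + w[1] * ils + w[2] * rf

# Hypothetical values, for illustration only.
distances = {"DUPS": 2.0, "LOSS": 3.0, "ILS": 1.0, "RF": 4.0}
scales = {"DUPS": 1.0, "LOSS": 1.0, "ILS": 1.0, "RF": 1.0}
weights = (1.0, 1.0, 1.0)
value = unnormalized_mixture(distances, scales, weights)
```

Only ratios of such values are needed by a sampler that avoids Z, which is why the sketch stops at the numerator.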

2 The Sampling

Doubly-intractable distributions
\[
\pi(y \mid \theta) = \frac{q_\theta(y)}{Z(\theta)} = \frac{e^{\theta^t s(y)}}{Z(\theta)}; \qquad Z(\theta) = \sum_y e^{\theta^t s(y)} \tag{1}
\]
augmented distribution: $\pi(\theta', y', \theta \mid y) \propto \pi(y \mid \theta)\,\pi(\theta)\,h(\theta' \mid \theta)\,\pi(y' \mid \theta')$
Gibbs update of the auxiliary variables $\theta'$, $y'$:
I. draw $\theta' \sim h(\cdot \mid \theta)$
II. draw $y' \sim \pi(\cdot \mid \theta')$
exchange ratio from $\theta$ to $\theta'$:
\[
\min\left(1, \frac{q_\theta(y')\,\pi(\theta')\,h(\theta \mid \theta')\,q_{\theta'}(y)}{q_\theta(y)\,\pi(\theta)\,h(\theta' \mid \theta)\,q_{\theta'}(y')}\right) \tag{2}
\]
We draw $y'$ (the gene tree) through a secondary MCMC starting at its current value.
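
The exchange update can be sketched on a toy problem as follows. This is an illustration under stated assumptions, not guenomu's implementation: y is an integer in {0, ..., K} with s(y) = y, the prior on θ is a standard normal, the proposal h is a symmetric Gaussian random walk, and the auxiliary y' is drawn exactly by enumeration instead of by a secondary MCMC. All function names are hypothetical.

```python
import math
import random

K = 10  # y takes values in {0, ..., K}; s(y) = y is the toy sufficient statistic

def log_q(theta, y):
    """Unnormalized log-likelihood: log q_theta(y) = theta * s(y)."""
    return theta * y

def log_prior(theta):
    """Standard normal prior on theta, up to an additive constant."""
    return -0.5 * theta * theta

def draw_y(theta, rng):
    """Draw y' ~ pi(. | theta) by enumerating the small toy space."""
    weights = [math.exp(log_q(theta, y)) for y in range(K + 1)]
    return rng.choices(range(K + 1), weights=weights)[0]

def exchange_step(theta, y, rng, step=0.5):
    """One exchange update of theta given data y; Z(theta) never appears."""
    theta_new = theta + rng.gauss(0.0, step)  # I.  theta' ~ h(. | theta)
    y_aux = draw_y(theta_new, rng)            # II. y' ~ pi(. | theta')
    # Log of the exchange ratio (2); h is symmetric, so its two terms cancel.
    log_ratio = (log_q(theta, y_aux) + log_prior(theta_new) + log_q(theta_new, y)
                 - log_q(theta, y) - log_prior(theta) - log_q(theta_new, y_aux))
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return theta_new
    return theta

rng = random.Random(1)
theta, samples = 0.0, []
for _ in range(2000):
    theta = exchange_step(theta, y=8, rng=rng)
    samples.append(theta)
```

The normalizing constants Z(θ) and Z(θ') cancel between the data term and the auxiliary term, so only the unnormalized q is ever evaluated.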

Species tree proposal with the exchange algorithm

Generalized Multiple-Try Metropolis
MH: sample y, then accept it with probability min(1, r)
\[
r = \frac{\pi(y)}{\pi(x)} \frac{q(y, x)}{q(x, y)} = \frac{\pi(y)}{\pi(x)} \frac{p(x \mid y)}{p(y \mid x)}
\]
MTM: choose y among several samples, according to their relative weights
\[
r = \frac{w(y_1, x) + \cdots + w(y_k, x)}{w(x^*_1, y) + \cdots + w(x^*_k, y)}
\]
where $w(x, y) = \pi(x)\,q(x, y)\,\lambda(x, y) = \pi(x)\,p(y \mid x)\,\lambda(x, y)$
GMTM: weights w(.) do not need to represent probability distributions.
\[
r = \frac{\pi(y)\,p_k(x \mid y)}{\pi(x)\,p_k(y \mid x)} \frac{W_x}{W_y}
\]
where $W_y = \frac{w_i(y_i, x)}{\sum_{j=1}^{k} w_j(y_j, x)}$ for the chosen element i
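
A minimal sketch of the plain MTM step on a one-dimensional toy target may help fix the notation. It assumes a symmetric Gaussian proposal q and the choice λ(x, y) = 1/q(x, y), under which the weights reduce to w(y, x) = π(y). The target, the function names and the tuning values are illustrative assumptions, and the GMTM variant is not shown.

```python
import math
import random

def log_target(x):
    """Unnormalized log-density of the toy target: a standard normal."""
    return -0.5 * x * x

def mtm_step(x, rng, k=5, step=2.0):
    """One Multiple-Try Metropolis update with a symmetric Gaussian proposal."""
    # Draw k trial values y_1..y_k from q(x, .) and weight them by pi(y_j).
    trials = [x + rng.gauss(0.0, step) for _ in range(k)]
    w_trials = [math.exp(log_target(t)) for t in trials]
    # Choose y among the trials according to their relative weights.
    y = rng.choices(trials, weights=w_trials)[0]
    # Reference set: k-1 draws x*_j from q(y, .), plus the current state x.
    refs = [y + rng.gauss(0.0, step) for _ in range(k - 1)] + [x]
    w_refs = [math.exp(log_target(r)) for r in refs]
    ratio = sum(w_trials) / sum(w_refs)
    return y if rng.random() < min(1.0, ratio) else x

rng = random.Random(7)
x, chain = 0.0, []
for _ in range(5000):
    x = mtm_step(x, rng)
    chain.append(x)
mean = sum(chain) / len(chain)
var = sum((v - mean) ** 2 for v in chain) / len(chain)
```

With several trials per step the proposal scale can be larger than in a single-try MH update, at the cost of 2k - 1 extra target evaluations.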

Gene tree proposal with GMTM or MTM

3 The Code

RF distance, Assignment cost (Hdist)

A parallel pseudo-random number generator (PRNG)
Given a seed and an algorithm, we have a stream of PRNs.
[Figure: diagram with the labels seed, PRNG1, PRNG2 (four times), x1, x2, x3, x4, x11 and x12]
Using a second algorithm, the first stream will give us a sequence of seeds. We use the 150 parameter sets for the Tausworthe (LFSR) generators (L'Ecuyer, Maths Comput 1999, pp. 261).
Therefore, given the seed, we can predict all states of all streams.
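
The two-level scheme can be sketched in a few lines of Python. This is only an illustration: Python's built-in Mersenne Twister stands in for both algorithms (the slides use Tausworthe generators with distinct parameter sets), and `make_streams` is a hypothetical helper name.

```python
import random

def make_streams(seed, n_streams):
    """Return (master, streams): a master generator and n_streams child generators.

    The master plays the role of PRNG1: its outputs are used as seeds.
    Each child plays the role of PRNG2. Python's Mersenne Twister is used
    for both here, which is an assumption made for this sketch only.
    """
    master = random.Random(seed)
    child_seeds = [master.getrandbits(64) for _ in range(n_streams)]
    streams = [random.Random(s) for s in child_seeds]
    return master, streams

# The same seed always reproduces every stream.
_, a = make_streams(2013, 4)
_, b = make_streams(2013, 4)
first_draws_a = [s.random() for s in a]
first_draws_b = [s.random() for s in b]
```

Because the child seeds are themselves drawn from the master stream, a single integer is enough to reconstruct the state of every generator.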

In our gene/species model:
we split gene families among jobs
all jobs receive the seed (broadcast) and therefore can reproduce the same x1. That's cheaper than communicating the states.
each job uses its own x(i+1) for sampling new gene trees etc. and can work in parallel. They use the common x1 for sampling e.g. a new species tree, which needs synchronization.
the only thing that must be shared is thus the proposal values (AllReduce) when updating "global" parameters, so that all jobs can make the same acceptance/rejection decision.
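
The synchronization argument can be illustrated without MPI by simulating the jobs in one process. In this sketch, which is an assumption-laden toy and not guenomu's code, a plain sum stands in for AllReduce, `run_job` is a hypothetical name, and the per-job log-ratio contributions are made-up numbers.

```python
import math
import random

def run_job(rank, n_jobs, seed, partial_log_ratios):
    """Simulate one job's view of a 'global' parameter update.

    Every job rebuilds the common stream from the broadcast seed, so the
    uniform draw used in the accept/reject step is identical on all jobs.
    """
    common = random.Random(seed)  # common stream, same on every job
    own_seed = [common.getrandbits(64) for _ in range(n_jobs)][rank]
    own = random.Random(own_seed)  # private stream, e.g. for gene trees
    private_draw = own.random()
    # Stand-in for AllReduce: every job ends up with the same total.
    total_log_ratio = sum(partial_log_ratios)
    u = common.random()  # same position in the common stream on all jobs
    accept = u < math.exp(min(0.0, total_log_ratio))
    return accept, private_draw

partials = [-0.3, 0.1, -0.2, 0.05]  # hypothetical per-job contributions
results = [run_job(r, 4, 2013, partials) for r in range(4)]
decisions = [accept for accept, _ in results]
```

All jobs reach the same decision without exchanging generator states, provided each of them consumes the common stream in exactly the same order.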

Each job looks like an independent analysis

https://bitbucket.org/leomrtns/guenomu
