SlideShare a Scribd company logo
1 of 29
Download to read offline
DIC, a few years later
Christian P. Robert
Universit´e Paris-Dauphine, University of Warwick, & CREST, Paris
Outline
DIC as in Dayesian
mixed-up definitions
Addendum
Bayesian model comparison
use posterior probabilities/Bayes factors
B12 = Θ1
f1(y|θ1) dπ1(θ1)
Θ2
f2(y|θ2) dπ2(θ2)
[Jeffreys, 1939]
posterior predictive checks
Pr(mi (y) ≤ mi (y)|y)
[Gelman et al., 2013]
comparisons of models based on prediction error and other
loss-based measures
DIC? BIC? integrated likelihood?
Bayesian model comparison
use posterior probabilities/Bayes factors
B12 = Θ1
f1(y|θ1) dπ1(θ1)
Θ2
f2(y|θ2) dπ2(θ2)
[Jeffreys, 1939]
posterior predictive checks
Pr(mi (y) ≤ mi (y)|y)
[Gelman et al., 2013]
comparisons of models based on prediction error and other
loss-based measures
DIC? BIC? integrated likelihood?
thou shalt not use the data twice
The data is used twice in the DIC method:
1. y used once to produce the posterior π(θ|y), and the
associated estimate, ˜θ(y)
2. y used a second time to compute the posterior expectation
of the observed likelihood p(y|θ),
log p(y|θ)π(dθ|y) ∝ log p(y|θ)p(y|θ)π(dθ) ,
Aitkin’s expected likelihood
Strong similarity with Aitkin’s (1991) posterior Bayes factor
f (y|θ)dπ(θ|y) ,
and Aitkin’s (2010) integrated likelihood in that deviance
D(θ) = −2 log(f (x|θ))
is a renaming of the likelihood and is considered “a posteriori”
both in
¯D = E[D(θ)|x]
and in pD = ¯D − D(ˆθ)
[Gelman, X & Rousseau, 2013]
Aitkin’s arguments
Central tool for Aitkin’s (2010) model fit is “posterior cdf” of the
likelihood,
F(z) = Prπ
(L(θ, y) > ε|y) .
The approach is...
...general and resolves difficulties with Bayesian processing of
point null hypotheses
...open to using of generic noninformative and improper priors,
being related to a single model;
...more naturally handling the “vexed question of model fit,”
Aitkin’s arguments
Central tool for Aitkin’s (2010) model fit is “posterior cdf” of the
likelihood,
F(z) = Prπ
(L(θ, y) > ε|y) .
The approach is...
...general and resolves difficulties with Bayesian processing of
point null hypotheses
...open to using of generic noninformative and improper priors,
being related to a single model;
...more naturally handling the “vexed question of model fit,”
counter-arguments by Gelman et al., 2013
using priors and posteriors is no guarantee that inference is
Bayesian
improper priors give meaningless (or at least not universally
accepted) values for marginal likelihoods
likelihood function does not exist a priori
requires a joint distribution to be properly defined for model
comparison
[X & Marin, 2008]
DIC for missing data models
Framework of missing data models
f (y|θ) = f (y, z|θ)dz ,
with observed data y = (y1, . . . , yn) and corresponding missing
data by z = (z1, . . . , zn)
How do we define DIC in such settings?
DIC for missing data models
Framework of missing data models
f (y|θ) = f (y, z|θ)dz ,
with observed data y = (y1, . . . , yn) and corresponding missing
data by z = (z1, . . . , zn)
How do we define DIC in such settings?
how many DICs can you fit in a mixture?
1. observed DICs
DIC1 = −4Eθ [log f (y|θ)|y] + 2 log f (y|Eθ [θ|y])
often a poor choice in case of unidentifiability
2. complete DICs based on f (y, z|θ)
3. conditional DICs based on f (y|z, θ)
[Celeux et al., BA, 2006]
how many DICs can you fit in a mixture?
1. observed DICs
DIC2 = −4Eθ [log f (y|θ)|y] + 2 log f (y|θ(y)) .
which uses posterior mode instead
2. complete DICs based on f (y, z|θ)
3. conditional DICs based on f (y|z, θ)
[Celeux et al., BA, 2006]
how many DICs can you fit in a mixture?
1. observed DICs
DIC3 = −4Eθ [log f (y|θ)|y] + 2 log f (y) ,
which instead relies on the MCMC density estimate
2. complete DICs based on f (y, z|θ)
3. conditional DICs based on f (y|z, θ)
[Celeux et al., BA, 2006]
how many DICs can you fit in a mixture?
1. observed DICs
2. complete DICs based on f (y, z|θ)
DIC4 = EZ [DIC(y, Z)|y]
= −4Eθ,Z [log f (y, Z|θ)|y] + 2EZ [log f (y, Z|Eθ[θ|y, Z])|y] .
3. conditional DICs based on f (y|z, θ)
[Celeux et al., BA, 2006]
how many DICs can you fit in a mixture?
1. observed DICs
2. complete DICs based on f (y, z|θ)
DIC5 = −4Eθ,Z [log f (y, Z|θ)|y] + 2 log f (y, z(y)|θ(y)) ,
using Z as an additional parameter
3. conditional DICs based on f (y|z, θ)
[Celeux et al., BA, 2006]
how many DICs can you fit in a mixture?
1. observed DICs
2. complete DICs based on f (y, z|θ)
DIC6 = −4Eθ,Z [log f (y, Z|θ)|y]+2EZ[log f (y, Z|θ(y))|y, θ(y)] .
in analogy with EM, θ being an EM fixed point
3. conditional DICs based on f (y|z, θ)
[Celeux et al., BA, 2006]
how many DICs can you fit in a mixture?
1. observed DICs
2. complete DICs based on f (y, z|θ)
3. conditional DICs based on f (y|z, θ)
DIC7 = −4Eθ,Z [log f (y|Z, θ)|y] + 2 log f (y|z(y), θ(y)) ,
using MAP estimates
[Celeux et al., BA, 2006]
how many DICs can you fit in a mixture?
1. observed DICs
2. complete DICs based on f (y, z|θ)
3. conditional DICs based on f (y|z, θ)
DIC8 = −4Eθ,Z [log f (y|Z, θ)|y] + 2EZ log f (y|Z, θ(y, Z))|y ,
conditioning first on Z and then integrating over Z conditional
on y
[Celeux et al., BA, 2006]
Galactic DICs
Example of the galaxy mixture dataset
DIC2 DIC3 DIC4 DIC5 DIC6 DIC7 DIC8
K (pD2) (pD3) (pD4) (pD5) (pD6) (pD7) (pD8)
2 453 451 502 705 501 417 410
(5.56) (3.66) (5.50) (207.88) (4.48) (11.07) (4.09)
3 440 436 461 622 471 378 372
(9.23) (4.94) (6.40) (167.28) (15.80) (13.59) (7.43)
4 446 439 473 649 482 388 382
(11.58) (5.41) (7.52) (183.48) (16.51) (17.47) (11.37)
5 447 442 485 658 511 395 390
(10.80) (5.48) (7.58) (180.73) (33.29) (20.00) (15.15)
6 449 444 494 676 532 407 398
(11.26) (5.49) (8.49) (191.10) (46.83) (28.23) (19.34)
7 460 446 508 700 571 425 409
(19.26) (5.83) (8.93) (200.35) (71.26) (40.51) (24.57)
Further questions
what is the behaviour of DIC under model mispecification?
is there an absolute scale to the DIC values, i.e. when is a
difference in DICs significant?
how can DIC handle small n’s versus p’s?
In an era of complex models, is DIC applicable?
Further questions
what is the behaviour of DIC under model mispecification?
is there an absolute scale to the DIC values, i.e. when is a
difference in DICs significant?
how can DIC handle small n’s versus p’s?
In an era of complex models, is DIC applicable?
Likelihood-free settings
Cases when the likelihood function f (y|θ) is unavailable (in
analytic and numerical senses) and when the completion step
f (y|θ) =
Z
f (y, z|θ) dz
is impossible or too costly because of the dimension of z
c MCMC cannot be implemented!
The ABC method
Bayesian setting: target is π(θ)f (x|θ)
When likelihood f (x|θ) not in closed form, likelihood-free rejection
technique:
ABC algorithm
For an observation y ∼ f (y|θ), under the prior π(θ), keep jointly
simulating
θ ∼ π(θ) , z ∼ f (z|θ ) ,
until the auxiliary variable z is equal to the observed value, z = y.
[Tavar´e et al., 1997]
The ABC method
Bayesian setting: target is π(θ)f (x|θ)
When likelihood f (x|θ) not in closed form, likelihood-free rejection
technique:
ABC algorithm
For an observation y ∼ f (y|θ), under the prior π(θ), keep jointly
simulating
θ ∼ π(θ) , z ∼ f (z|θ ) ,
until the auxiliary variable z is equal to the observed value, z = y.
[Tavar´e et al., 1997]
The ABC method
Bayesian setting: target is π(θ)f (x|θ)
When likelihood f (x|θ) not in closed form, likelihood-free rejection
technique:
ABC algorithm
For an observation y ∼ f (y|θ), under the prior π(θ), keep jointly
simulating
θ ∼ π(θ) , z ∼ f (z|θ ) ,
until the auxiliary variable z is equal to the observed value, z = y.
[Tavar´e et al., 1997]
Generic ABC for model choice
Algorithm 1 Likelihood-free model choice sampler (ABC-MC)
for t = 1 to T do
repeat
Generate m from the prior π(M = m)
Generate θm from the prior πm(θm)
Generate z from the model fm(z|θm)
until ρ{η(z), η(y)} <
Set m(t) = m and θ(t)
= θm
end for
[Grelaud et al., 2009]
Concistency
Model selection feasible with ABC:
Choice of summary statistics is paramount
At best, ABC output → π(. | η(y)) which concentrates around
µ0
For estimation : {θ; µ(θ) = µ0} = θ0
For testing {µ1(θ1), θ1 ∈ Θ1} ∩ {µ2(θ2), θ2 ∈ Θ2} = ∅
[Marin et al., 2013]
Concistency
Model selection feasible with ABC:
Choice of summary statistics is paramount
At best, ABC output → π(. | η(y)) which concentrates around
µ0
For estimation : {θ; µ(θ) = µ0} = θ0
For testing {µ1(θ1), θ1 ∈ Θ1} ∩ {µ2(θ2), θ2 ∈ Θ2} = ∅
[Marin et al., 2013]

More Related Content

What's hot

CISEA 2019: ABC consistency and convergence
CISEA 2019: ABC consistency and convergenceCISEA 2019: ABC consistency and convergence
CISEA 2019: ABC consistency and convergenceChristian Robert
 
Tailored Bregman Ball Trees for Effective Nearest Neighbors
Tailored Bregman Ball Trees for Effective Nearest NeighborsTailored Bregman Ball Trees for Effective Nearest Neighbors
Tailored Bregman Ball Trees for Effective Nearest NeighborsFrank Nielsen
 
accurate ABC Oliver Ratmann
accurate ABC Oliver Ratmannaccurate ABC Oliver Ratmann
accurate ABC Oliver Ratmannolli0601
 
ABC based on Wasserstein distances
ABC based on Wasserstein distancesABC based on Wasserstein distances
ABC based on Wasserstein distancesChristian Robert
 
ABC short course: final chapters
ABC short course: final chaptersABC short course: final chapters
ABC short course: final chaptersChristian Robert
 
Predictive mean-matching2
Predictive mean-matching2Predictive mean-matching2
Predictive mean-matching2Jae-kwang Kim
 
ABC short course: introduction chapters
ABC short course: introduction chaptersABC short course: introduction chapters
ABC short course: introduction chaptersChristian Robert
 
ABC short course: model choice chapter
ABC short course: model choice chapterABC short course: model choice chapter
ABC short course: model choice chapterChristian Robert
 
Bayesian inference on mixtures
Bayesian inference on mixturesBayesian inference on mixtures
Bayesian inference on mixturesChristian Robert
 
ABC short course: survey chapter
ABC short course: survey chapterABC short course: survey chapter
ABC short course: survey chapterChristian Robert
 
Optimization of probabilistic argumentation with Markov processes
Optimization of probabilistic argumentation with Markov processesOptimization of probabilistic argumentation with Markov processes
Optimization of probabilistic argumentation with Markov processesEmmanuel Hadoux
 
from model uncertainty to ABC
from model uncertainty to ABCfrom model uncertainty to ABC
from model uncertainty to ABCChristian Robert
 
Monte Carlo in Montréal 2017
Monte Carlo in Montréal 2017Monte Carlo in Montréal 2017
Monte Carlo in Montréal 2017Christian Robert
 
Some sampling techniques for big data analysis
Some sampling techniques for big data analysisSome sampling techniques for big data analysis
Some sampling techniques for big data analysisJae-kwang Kim
 

What's hot (20)

CISEA 2019: ABC consistency and convergence
CISEA 2019: ABC consistency and convergenceCISEA 2019: ABC consistency and convergence
CISEA 2019: ABC consistency and convergence
 
Vancouver18
Vancouver18Vancouver18
Vancouver18
 
Tailored Bregman Ball Trees for Effective Nearest Neighbors
Tailored Bregman Ball Trees for Effective Nearest NeighborsTailored Bregman Ball Trees for Effective Nearest Neighbors
Tailored Bregman Ball Trees for Effective Nearest Neighbors
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
accurate ABC Oliver Ratmann
accurate ABC Oliver Ratmannaccurate ABC Oliver Ratmann
accurate ABC Oliver Ratmann
 
Boston talk
Boston talkBoston talk
Boston talk
 
ABC based on Wasserstein distances
ABC based on Wasserstein distancesABC based on Wasserstein distances
ABC based on Wasserstein distances
 
Invers fungsi
Invers fungsiInvers fungsi
Invers fungsi
 
ABC short course: final chapters
ABC short course: final chaptersABC short course: final chapters
ABC short course: final chapters
 
Predictive mean-matching2
Predictive mean-matching2Predictive mean-matching2
Predictive mean-matching2
 
ABC short course: introduction chapters
ABC short course: introduction chaptersABC short course: introduction chapters
ABC short course: introduction chapters
 
ABC short course: model choice chapter
ABC short course: model choice chapterABC short course: model choice chapter
ABC short course: model choice chapter
 
Bayesian inference on mixtures
Bayesian inference on mixturesBayesian inference on mixtures
Bayesian inference on mixtures
 
ABC short course: survey chapter
ABC short course: survey chapterABC short course: survey chapter
ABC short course: survey chapter
 
Optimization of probabilistic argumentation with Markov processes
Optimization of probabilistic argumentation with Markov processesOptimization of probabilistic argumentation with Markov processes
Optimization of probabilistic argumentation with Markov processes
 
Fi review5
Fi review5Fi review5
Fi review5
 
from model uncertainty to ABC
from model uncertainty to ABCfrom model uncertainty to ABC
from model uncertainty to ABC
 
MNAR
MNARMNAR
MNAR
 
Monte Carlo in Montréal 2017
Monte Carlo in Montréal 2017Monte Carlo in Montréal 2017
Monte Carlo in Montréal 2017
 
Some sampling techniques for big data analysis
Some sampling techniques for big data analysisSome sampling techniques for big data analysis
Some sampling techniques for big data analysis
 

Similar to RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013

Workshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinWorkshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinChristian Robert
 
random forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimationrandom forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimationChristian Robert
 
NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015Christian Robert
 
Asymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceAsymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceChristian Robert
 
3rd NIPS Workshop on PROBABILISTIC PROGRAMMING
3rd NIPS Workshop on PROBABILISTIC PROGRAMMING3rd NIPS Workshop on PROBABILISTIC PROGRAMMING
3rd NIPS Workshop on PROBABILISTIC PROGRAMMINGChristian Robert
 
Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Christian Robert
 
How many components in a mixture?
How many components in a mixture?How many components in a mixture?
How many components in a mixture?Christian Robert
 
The Intersection Axiom of Conditional Probability
The Intersection Axiom of Conditional ProbabilityThe Intersection Axiom of Conditional Probability
The Intersection Axiom of Conditional ProbabilityRichard Gill
 
Considerate Approaches to ABC Model Selection
Considerate Approaches to ABC Model SelectionConsiderate Approaches to ABC Model Selection
Considerate Approaches to ABC Model SelectionMichael Stumpf
 
Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18
Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18
Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18Olga Zinkevych
 
ABC convergence under well- and mis-specified models
ABC convergence under well- and mis-specified modelsABC convergence under well- and mis-specified models
ABC convergence under well- and mis-specified modelsChristian Robert
 
Comparing estimation algorithms for block clustering models
Comparing estimation algorithms for block clustering modelsComparing estimation algorithms for block clustering models
Comparing estimation algorithms for block clustering modelsBigMC
 
Multiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximationsMultiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximationsChristian Robert
 

Similar to RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013 (20)

Workshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael MartinWorkshop in honour of Don Poskitt and Gael Martin
Workshop in honour of Don Poskitt and Gael Martin
 
random forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimationrandom forests for ABC model choice and parameter estimation
random forests for ABC model choice and parameter estimation
 
the ABC of ABC
the ABC of ABCthe ABC of ABC
the ABC of ABC
 
NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015NBBC15, Reyjavik, June 08, 2015
NBBC15, Reyjavik, June 08, 2015
 
Asymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de FranceAsymptotics of ABC, lecture, Collège de France
Asymptotics of ABC, lecture, Collège de France
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
3rd NIPS Workshop on PROBABILISTIC PROGRAMMING
3rd NIPS Workshop on PROBABILISTIC PROGRAMMING3rd NIPS Workshop on PROBABILISTIC PROGRAMMING
3rd NIPS Workshop on PROBABILISTIC PROGRAMMING
 
Testing for mixtures at BNP 13
Testing for mixtures at BNP 13Testing for mixtures at BNP 13
Testing for mixtures at BNP 13
 
How many components in a mixture?
How many components in a mixture?How many components in a mixture?
How many components in a mixture?
 
The Intersection Axiom of Conditional Probability
The Intersection Axiom of Conditional ProbabilityThe Intersection Axiom of Conditional Probability
The Intersection Axiom of Conditional Probability
 
Ab cancun
Ab cancunAb cancun
Ab cancun
 
Considerate Approaches to ABC Model Selection
Considerate Approaches to ABC Model SelectionConsiderate Approaches to ABC Model Selection
Considerate Approaches to ABC Model Selection
 
Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18
Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18
Variational autoencoders for speech processing d.bielievtsov dataconf 21 04 18
 
CDT 22 slides.pdf
CDT 22 slides.pdfCDT 22 slides.pdf
CDT 22 slides.pdf
 
ABC convergence under well- and mis-specified models
ABC convergence under well- and mis-specified modelsABC convergence under well- and mis-specified models
ABC convergence under well- and mis-specified models
 
QMC: Transition Workshop - Small Sample Statistical Analysis and Algorithms f...
QMC: Transition Workshop - Small Sample Statistical Analysis and Algorithms f...QMC: Transition Workshop - Small Sample Statistical Analysis and Algorithms f...
QMC: Transition Workshop - Small Sample Statistical Analysis and Algorithms f...
 
Comparing estimation algorithms for block clustering models
Comparing estimation algorithms for block clustering modelsComparing estimation algorithms for block clustering models
Comparing estimation algorithms for block clustering models
 
Intractable likelihoods
Intractable likelihoodsIntractable likelihoods
Intractable likelihoods
 
Multiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximationsMultiple estimators for Monte Carlo approximations
Multiple estimators for Monte Carlo approximations
 
PMED Transition Workshop - A Bayesian Model for Joint Longitudinal and Surviv...
PMED Transition Workshop - A Bayesian Model for Joint Longitudinal and Surviv...PMED Transition Workshop - A Bayesian Model for Joint Longitudinal and Surviv...
PMED Transition Workshop - A Bayesian Model for Joint Longitudinal and Surviv...
 

More from Christian Robert

Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Christian Robert
 
Testing for mixtures by seeking components
Testing for mixtures by seeking componentsTesting for mixtures by seeking components
Testing for mixtures by seeking componentsChristian Robert
 
discussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihooddiscussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihoodChristian Robert
 
NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)Christian Robert
 
Coordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerCoordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerChristian Robert
 
a discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment models
a discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment modelsa discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment models
a discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment modelsChristian Robert
 
Poster for Bayesian Statistics in the Big Data Era conference
Poster for Bayesian Statistics in the Big Data Era conferencePoster for Bayesian Statistics in the Big Data Era conference
Poster for Bayesian Statistics in the Big Data Era conferenceChristian Robert
 
short course at CIRM, Bayesian Masterclass, October 2018
short course at CIRM, Bayesian Masterclass, October 2018short course at CIRM, Bayesian Masterclass, October 2018
short course at CIRM, Bayesian Masterclass, October 2018Christian Robert
 
ABC with Wasserstein distances
ABC with Wasserstein distancesABC with Wasserstein distances
ABC with Wasserstein distancesChristian Robert
 
Coordinate sampler: A non-reversible Gibbs-like sampler
Coordinate sampler: A non-reversible Gibbs-like samplerCoordinate sampler: A non-reversible Gibbs-like sampler
Coordinate sampler: A non-reversible Gibbs-like samplerChristian Robert
 
better together? statistical learning in models made of modules
better together? statistical learning in models made of modulesbetter together? statistical learning in models made of modules
better together? statistical learning in models made of modulesChristian Robert
 

More from Christian Robert (15)

discussion of ICML23.pdf
discussion of ICML23.pdfdiscussion of ICML23.pdf
discussion of ICML23.pdf
 
restore.pdf
restore.pdfrestore.pdf
restore.pdf
 
Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?Inferring the number of components: dream or reality?
Inferring the number of components: dream or reality?
 
Testing for mixtures by seeking components
Testing for mixtures by seeking componentsTesting for mixtures by seeking components
Testing for mixtures by seeking components
 
discussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihooddiscussion on Bayesian restricted likelihood
discussion on Bayesian restricted likelihood
 
NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)NCE, GANs & VAEs (and maybe BAC)
NCE, GANs & VAEs (and maybe BAC)
 
Coordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like samplerCoordinate sampler : A non-reversible Gibbs-like sampler
Coordinate sampler : A non-reversible Gibbs-like sampler
 
eugenics and statistics
eugenics and statisticseugenics and statistics
eugenics and statistics
 
ABC-Gibbs
ABC-GibbsABC-Gibbs
ABC-Gibbs
 
a discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment models
a discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment modelsa discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment models
a discussion of Chib, Shin, and Simoni (2017-8) Bayesian moment models
 
Poster for Bayesian Statistics in the Big Data Era conference
Poster for Bayesian Statistics in the Big Data Era conferencePoster for Bayesian Statistics in the Big Data Era conference
Poster for Bayesian Statistics in the Big Data Era conference
 
short course at CIRM, Bayesian Masterclass, October 2018
short course at CIRM, Bayesian Masterclass, October 2018short course at CIRM, Bayesian Masterclass, October 2018
short course at CIRM, Bayesian Masterclass, October 2018
 
ABC with Wasserstein distances
ABC with Wasserstein distancesABC with Wasserstein distances
ABC with Wasserstein distances
 
Coordinate sampler: A non-reversible Gibbs-like sampler
Coordinate sampler: A non-reversible Gibbs-like samplerCoordinate sampler: A non-reversible Gibbs-like sampler
Coordinate sampler: A non-reversible Gibbs-like sampler
 
better together? statistical learning in models made of modules
better together? statistical learning in models made of modulesbetter together? statistical learning in models made of modules
better together? statistical learning in models made of modules
 

Recently uploaded

TEACHER REFLECTION FORM (NEW SET........).docx
TEACHER REFLECTION FORM (NEW SET........).docxTEACHER REFLECTION FORM (NEW SET........).docx
TEACHER REFLECTION FORM (NEW SET........).docxruthvilladarez
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptxiammrhaywood
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parentsnavabharathschool99
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfVanessa Camilleri
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfErwinPantujan2
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPCeline George
 
Integumentary System SMP B. Pharm Sem I.ppt
Integumentary System SMP B. Pharm Sem I.pptIntegumentary System SMP B. Pharm Sem I.ppt
Integumentary System SMP B. Pharm Sem I.pptshraddhaparab530
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Celine George
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxVanesaIglesias10
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...Postal Advocate Inc.
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfJemuel Francisco
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designMIPLM
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxHumphrey A Beña
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Seán Kennedy
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptxmary850239
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17Celine George
 
Oppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmOppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmStan Meyer
 

Recently uploaded (20)

TEACHER REFLECTION FORM (NEW SET........).docx
TEACHER REFLECTION FORM (NEW SET........).docxTEACHER REFLECTION FORM (NEW SET........).docx
TEACHER REFLECTION FORM (NEW SET........).docx
 
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptxAUDIENCE THEORY -CULTIVATION THEORY -  GERBNER.pptx
AUDIENCE THEORY -CULTIVATION THEORY - GERBNER.pptx
 
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptxLEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
LEFT_ON_C'N_ PRELIMS_EL_DORADO_2024.pptx
 
Choosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for ParentsChoosing the Right CBSE School A Comprehensive Guide for Parents
Choosing the Right CBSE School A Comprehensive Guide for Parents
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdf
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
 
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdfVirtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
Virtual-Orientation-on-the-Administration-of-NATG12-NATG6-and-ELLNA.pdf
 
How to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERPHow to do quick user assign in kanban in Odoo 17 ERP
How to do quick user assign in kanban in Odoo 17 ERP
 
Integumentary System SMP B. Pharm Sem I.ppt
Integumentary System SMP B. Pharm Sem I.pptIntegumentary System SMP B. Pharm Sem I.ppt
Integumentary System SMP B. Pharm Sem I.ppt
 
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
Incoming and Outgoing Shipments in 3 STEPS Using Odoo 17
 
ROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptxROLES IN A STAGE PRODUCTION in arts.pptx
ROLES IN A STAGE PRODUCTION in arts.pptx
 
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
USPS® Forced Meter Migration - How to Know if Your Postage Meter Will Soon be...
 
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdfGrade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
Grade 9 Quarter 4 Dll Grade 9 Quarter 4 DLL.pdf
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)
 
Keynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-designKeynote by Prof. Wurzer at Nordex about IP-design
Keynote by Prof. Wurzer at Nordex about IP-design
 
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptxINTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
INTRODUCTION TO CATHOLIC CHRISTOLOGY.pptx
 
Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...Student Profile Sample - We help schools to connect the data they have, with ...
Student Profile Sample - We help schools to connect the data they have, with ...
 
4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx4.16.24 Poverty and Precarity--Desmond.pptx
4.16.24 Poverty and Precarity--Desmond.pptx
 
How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17How to Add Barcode on PDF Report in Odoo 17
How to Add Barcode on PDF Report in Odoo 17
 
Oppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and FilmOppenheimer Film Discussion for Philosophy and Film
Oppenheimer Film Discussion for Philosophy and Film
 

RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013

  • 1. DIC, a few years later Christian P. Robert Universit´e Paris-Dauphine, University of Warwick, & CREST, Paris
  • 2. Outline DIC as in Dayesian mixed-up definitions Addendum
  • 3. Bayesian model comparison use posterior probabilities/Bayes factors B12 = Θ1 f1(y|θ1) dπ1(θ1) Θ2 f2(y|θ2) dπ2(θ2) [Jeffreys, 1939] posterior predictive checks Pr(mi (y) ≤ mi (y)|y) [Gelman et al., 2013] comparisons of models based on prediction error and other loss-based measures DIC? BIC? integrated likelihood?
  • 4. Bayesian model comparison use posterior probabilities/Bayes factors B12 = Θ1 f1(y|θ1) dπ1(θ1) Θ2 f2(y|θ2) dπ2(θ2) [Jeffreys, 1939] posterior predictive checks Pr(mi (y) ≤ mi (y)|y) [Gelman et al., 2013] comparisons of models based on prediction error and other loss-based measures DIC? BIC? integrated likelihood?
  • 5. thou shalt not use the data twice The data is used twice in the DIC method: 1. y used once to produce the posterior π(θ|y), and the associated estimate, ˜θ(y) 2. y used a second time to compute the posterior expectation of the observed likelihood p(y|θ), log p(y|θ)π(dθ|y) ∝ log p(y|θ)p(y|θ)π(dθ) ,
  • 6. Aitkin’s expected likelihood Strong similarity with Aitkin’s (1991) posterior Bayes factor f (y|θ)dπ(θ|y) , and Aitkin’s (2010) integrated likelihood in that deviance D(θ) = −2 log(f (x|θ)) is a renaming of the likelihood and is considered “a posteriori” both in ¯D = E[D(θ)|x] and in pD = ¯D − D(ˆθ) [Gelman, X & Rousseau, 2013]
  • 7. Aitkin’s arguments Central tool for Aitkin’s (2010) model fit is “posterior cdf” of the likelihood, F(z) = Prπ (L(θ, y) > ε|y) . The approach is... ...general and resolves difficulties with Bayesian processing of point null hypotheses ...open to using of generic noninformative and improper priors, being related to a single model; ...more naturally handling the “vexed question of model fit,”
  • 8. Aitkin’s arguments Central tool for Aitkin’s (2010) model fit is “posterior cdf” of the likelihood, F(z) = Prπ (L(θ, y) > ε|y) . The approach is... ...general and resolves difficulties with Bayesian processing of point null hypotheses ...open to using of generic noninformative and improper priors, being related to a single model; ...more naturally handling the “vexed question of model fit,”
  • 9. counter-arguments by Gelman et al., 2013 using priors and posteriors is no guarantee that inference is Bayesian improper priors give meaningless (or at least not universally accepted) values for marginal likelihoods likelihood function does not exist a priori requires a joint distribution to be properly defined for model comparison [X & Marin, 2008]
  • 10. DIC for missing data models Framework of missing data models f (y|θ) = f (y, z|θ)dz , with observed data y = (y1, . . . , yn) and corresponding missing data by z = (z1, . . . , zn) How do we define DIC in such settings?
  • 11. DIC for missing data models Framework of missing data models f (y|θ) = f (y, z|θ)dz , with observed data y = (y1, . . . , yn) and corresponding missing data by z = (z1, . . . , zn) How do we define DIC in such settings?
  • 12. how many DICs can you fit in a mixture? 1. observed DICs DIC1 = −4Eθ [log f (y|θ)|y] + 2 log f (y|Eθ [θ|y]) often a poor choice in case of unidentifiability 2. complete DICs based on f (y, z|θ) 3. conditional DICs based on f (y|z, θ) [Celeux et al., BA, 2006]
  • 13. how many DICs can you fit in a mixture? 1. observed DICs DIC2 = −4Eθ [log f (y|θ)|y] + 2 log f (y|θ(y)) . which uses posterior mode instead 2. complete DICs based on f (y, z|θ) 3. conditional DICs based on f (y|z, θ) [Celeux et al., BA, 2006]
  • 14. how many DICs can you fit in a mixture? 1. observed DICs DIC3 = −4Eθ [log f (y|θ)|y] + 2 log f (y) , which instead relies on the MCMC density estimate 2. complete DICs based on f (y, z|θ) 3. conditional DICs based on f (y|z, θ) [Celeux et al., BA, 2006]
  • 15. how many DICs can you fit in a mixture? 1. observed DICs 2. complete DICs based on f (y, z|θ) DIC4 = EZ [DIC(y, Z)|y] = −4Eθ,Z [log f (y, Z|θ)|y] + 2EZ [log f (y, Z|Eθ[θ|y, Z])|y] . 3. conditional DICs based on f (y|z, θ) [Celeux et al., BA, 2006]
  • 16. how many DICs can you fit in a mixture? 1. observed DICs 2. complete DICs based on f (y, z|θ) DIC5 = −4Eθ,Z [log f (y, Z|θ)|y] + 2 log f (y, z(y)|θ(y)) , using Z as an additional parameter 3. conditional DICs based on f (y|z, θ) [Celeux et al., BA, 2006]
  • 17. how many DICs can you fit in a mixture? 1. observed DICs 2. complete DICs based on f (y, z|θ) DIC6 = −4Eθ,Z [log f (y, Z|θ)|y]+2EZ[log f (y, Z|θ(y))|y, θ(y)] . in analogy with EM, θ being an EM fixed point 3. conditional DICs based on f (y|z, θ) [Celeux et al., BA, 2006]
  • 18. how many DICs can you fit in a mixture? 1. observed DICs 2. complete DICs based on f (y, z|θ) 3. conditional DICs based on f (y|z, θ) DIC7 = −4Eθ,Z [log f (y|Z, θ)|y] + 2 log f (y|z(y), θ(y)) , using MAP estimates [Celeux et al., BA, 2006]
  • 19. how many DICs can you fit in a mixture? 1. observed DICs 2. complete DICs based on f (y, z|θ) 3. conditional DICs based on f (y|z, θ) DIC8 = −4Eθ,Z [log f (y|Z, θ)|y] + 2EZ log f (y|Z, θ(y, Z))|y , conditioning first on Z and then integrating over Z conditional on y [Celeux et al., BA, 2006]
  • 20. Galactic DICs Example of the galaxy mixture dataset DIC2 DIC3 DIC4 DIC5 DIC6 DIC7 DIC8 K (pD2) (pD3) (pD4) (pD5) (pD6) (pD7) (pD8) 2 453 451 502 705 501 417 410 (5.56) (3.66) (5.50) (207.88) (4.48) (11.07) (4.09) 3 440 436 461 622 471 378 372 (9.23) (4.94) (6.40) (167.28) (15.80) (13.59) (7.43) 4 446 439 473 649 482 388 382 (11.58) (5.41) (7.52) (183.48) (16.51) (17.47) (11.37) 5 447 442 485 658 511 395 390 (10.80) (5.48) (7.58) (180.73) (33.29) (20.00) (15.15) 6 449 444 494 676 532 407 398 (11.26) (5.49) (8.49) (191.10) (46.83) (28.23) (19.34) 7 460 446 508 700 571 425 409 (19.26) (5.83) (8.93) (200.35) (71.26) (40.51) (24.57)
  • 21. Further questions what is the behaviour of DIC under model mispecification? is there an absolute scale to the DIC values, i.e. when is a difference in DICs significant? how can DIC handle small n’s versus p’s? In an era of complex models, is DIC applicable?
  • 22. Further questions what is the behaviour of DIC under model mispecification? is there an absolute scale to the DIC values, i.e. when is a difference in DICs significant? how can DIC handle small n’s versus p’s? In an era of complex models, is DIC applicable?
  • 23. Likelihood-free settings Cases when the likelihood function f (y|θ) is unavailable (in analytic and numerical senses) and when the completion step f (y|θ) = Z f (y, z|θ) dz is impossible or too costly because of the dimension of z c MCMC cannot be implemented!
  • 24. The ABC method Bayesian setting: target is π(θ)f (x|θ) When likelihood f (x|θ) not in closed form, likelihood-free rejection technique: ABC algorithm For an observation y ∼ f (y|θ), under the prior π(θ), keep jointly simulating θ ∼ π(θ) , z ∼ f (z|θ ) , until the auxiliary variable z is equal to the observed value, z = y. [Tavar´e et al., 1997]
  • 25. The ABC method Bayesian setting: target is π(θ)f (x|θ) When likelihood f (x|θ) not in closed form, likelihood-free rejection technique: ABC algorithm For an observation y ∼ f (y|θ), under the prior π(θ), keep jointly simulating θ ∼ π(θ) , z ∼ f (z|θ ) , until the auxiliary variable z is equal to the observed value, z = y. [Tavar´e et al., 1997]
  • 26. The ABC method Bayesian setting: target is π(θ)f (x|θ) When likelihood f (x|θ) not in closed form, likelihood-free rejection technique: ABC algorithm For an observation y ∼ f (y|θ), under the prior π(θ), keep jointly simulating θ ∼ π(θ) , z ∼ f (z|θ ) , until the auxiliary variable z is equal to the observed value, z = y. [Tavar´e et al., 1997]
  • 27. Generic ABC for model choice Algorithm 1 Likelihood-free model choice sampler (ABC-MC) for t = 1 to T do repeat Generate m from the prior π(M = m) Generate θm from the prior πm(θm) Generate z from the model fm(z|θm) until ρ{η(z), η(y)} < Set m(t) = m and θ(t) = θm end for [Grelaud et al., 2009]
  • 28. Concistency Model selection feasible with ABC: Choice of summary statistics is paramount At best, ABC output → π(. | η(y)) which concentrates around µ0 For estimation : {θ; µ(θ) = µ0} = θ0 For testing {µ1(θ1), θ1 ∈ Θ1} ∩ {µ2(θ2), θ2 ∈ Θ2} = ∅ [Marin et al., 2013]
  • 29. Concistency Model selection feasible with ABC: Choice of summary statistics is paramount At best, ABC output → π(. | η(y)) which concentrates around µ0 For estimation : {θ; µ(θ) = µ0} = θ0 For testing {µ1(θ1), θ1 ∈ Θ1} ∩ {µ2(θ2), θ2 ∈ Θ2} = ∅ [Marin et al., 2013]