Probabilistic Reasoning
 in Bayesian Networks


     KAIST AIPR Lab.
      Jung-Yeol Lee
      17th June 2010





Contents

•   Backgrounds
•   Bayesian Network
•   Semantics of Bayesian Network
•   D-Separation
•   Conditional Independence Relations
•   Probabilistic Inference in Bayesian Networks
•   Summary







Backgrounds

• Bayes’ rule
    From the product rule, P(X ∧ Y) = P(X | Y) P(Y) = P(Y | X) P(X)
    P(Y | X) = P(X | Y) P(Y) / P(X) = α P(X | Y) P(Y), where α is the normalization constant

    Combining evidence e:
      P(Y | X, e) = P(X | Y, e) P(Y | e) / P(X | e)

• Conditional independence
    P(X, Y | Z) = P(X | Z) P(Y | Z) when X ⊥ Y | Z
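As a quick sanity check, Bayes' rule with the normalization constant α can be computed directly; a minimal sketch, with illustrative numbers that are not from the slides:

```python
# Bayes' rule for a binary Y: P(Y|X) = alpha * P(X|Y) P(Y).
p_y = 0.01             # prior P(Y=true)            (hypothetical value)
p_x_given_y = 0.9      # likelihood P(X=true|Y=true) (hypothetical value)
p_x_given_not_y = 0.2  # likelihood P(X=true|Y=false)

# Unnormalized posterior for both values of Y, then normalize.
unnorm = [p_x_given_y * p_y, p_x_given_not_y * (1.0 - p_y)]
alpha = 1.0 / sum(unnorm)            # alpha = 1 / P(X=true)
posterior = [alpha * u for u in unnorm]
print(posterior[0])                  # P(Y=true | X=true)
```

Normalizing with α avoids ever computing P(X) explicitly, which is the form used throughout the inference slides below.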






Bayesian Network

• Causal relationships among random variables
• Directed acyclic graph
    Nodes Xi: random variables
    Directed links: probabilistic relationships between variables
    Acyclic: no path of links leads from a node back to itself
• A link from node X to node Y means X ∈ Parents(Y)
• Conditional probability distribution of Xi
    P(Xi | Parents(Xi))
    Quantifies the effect of the parents on node Xi






Example of Bayesian Network

• Burglary network
    Structure: Burglary → Alarm ← Earthquake; Alarm → JohnCalls; Alarm → MaryCalls

    Conditional probability tables:

      P(B) = 0.001        P(E) = 0.002

      B  E | P(A|B,E)       A | P(J|A)       A | P(M|A)
      T  T |  0.95          T |  0.90        T |  0.70
      T  F |  0.94          F |  0.05        F |  0.01
      F  T |  0.29
      F  F |  0.001

    JohnCalls is directly influenced only by Alarm:
      P(J | M, A, E, B) = P(J | A)
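The network above translates directly into conditional probability tables; a minimal sketch in Python (the dict-based representation and variable names are my own, not the slides'):

```python
# The burglary network's CPTs as plain Python dicts.
# Keys of P_A are (b, e) parent-value pairs; values are P(A=true | b, e).
P_B = 0.001
P_E = 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}   # P(J=true | A=a)
P_M = {True: 0.70, False: 0.01}   # P(M=true | A=a)

# "JohnCalls is directly influenced only by Alarm", so
# P(J | M, A, E, B) reduces to P(J | A): a single table lookup.
print(P_J[True])
```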





Semantics of Bayesian Network

• Full joint probability distribution
    Notation: P(x1, …, xn) abbreviated from P(X1 = x1 ∧ … ∧ Xn = xn)
    P(x1, …, xn) = ∏_{i=1}^{n} P(xi | parents(Xi)),
      where parents(Xi) denotes the specific values of the variables in Parents(Xi)
• Constructing Bayesian networks
    P(x1, …, xn) = ∏_{i=1}^{n} P(xi | xi−1, …, x1) by the chain rule
    For every variable Xi in the network,
      - P(Xi | Xi−1, …, X1) = P(Xi | Parents(Xi)), provided that Parents(Xi) ⊆ {Xi−1, …, X1}
    Correctness
      - Choose the parents of each node so that this property holds
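With the burglary network's numbers, one entry of the full joint follows directly from the product formula, e.g. the probability that both neighbors call, the alarm sounds, but there is no burglary and no earthquake:

```python
# P(j, m, a, ~b, ~e) = P(j|a) P(m|a) P(a|~b,~e) P(~b) P(~e)
p = 0.90 * 0.70 * 0.001 * (1 - 0.001) * (1 - 0.002)
print(round(p, 6))   # 0.000628
```

Five CPT lookups replace a 2^5-entry joint table, which is the compactness argument made on the next slide.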






Semantics of Bayesian Network (cont’d)

• Compactness
    Bayesian networks exploit locally structured systems
      - Each component interacts directly with only a bounded number of others
    A complete network is specified by at most n·2^k conditional probabilities,
      where n is the number of Boolean variables and each node has at most k parents
      (the full joint needs 2^n − 1 numbers)
• Node ordering
    Add “root causes” first
    Then add the variables they influence, and so on
    Until reaching the “leaves”
      - “Leaves”: variables with no direct causal influence on the others




Three Examples of 3-Node Graphs

Tail-to-Tail Connection
• Node c is said to be tail-to-tail
    c unobserved (a ← c → b):
      P(a, b) = Σ_c P(a | c) P(b | c) P(c) ≠ P(a) P(b) in general,
      so a ⊥ b | ∅ does not hold

    c observed:
      P(a, b | c) = P(a, b, c) / P(c) = P(a | c) P(b | c),
      so a ⊥ b | c holds

• When node c is observed,
      Node c blocks the path from a to b
      Variables a and b are independent




Three Examples of 3-Node Graphs

Head-to-Tail Connection
• Node c is said to be head-to-tail
    c unobserved (a → c → b):
      P(a, b) = P(a) Σ_c P(c | a) P(b | c) = P(a) P(b | a),
      so a ⊥ b | ∅ does not hold

    c observed:
      P(a, b | c) = P(a, b, c) / P(c) = P(a) P(c | a) P(b | c) / P(c) = P(a | c) P(b | c),
      since P(a) P(c | a) = P(a | c) P(c); so a ⊥ b | c holds


• When node c is observed,
      Node c blocks the path from a to b
      Variables a and b are independent




Three Examples of 3-Node Graphs

Head-to-Head Connection
• Node c is said to be head-to-head
    c unobserved (a → c ← b):
      P(a, b, c) = P(a) P(b) P(c | a, b)
      Σ_c P(a, b, c) = P(a, b), and Σ_c P(a) P(b) P(c | a, b) = P(a) P(b),
      so a ⊥ b | ∅ holds

    c observed:
      P(a, b | c) = P(a, b, c) / P(c) = P(a) P(b) P(c | a, b) / P(c)
        ≠ P(a | c) P(b | c) in general, so a ⊥ b | c does not hold

• When node c is unobserved,
      Node c blocks the path from a to b
      Variables a and b are independent
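The head-to-head case can be checked numerically on the burglary fragment Burglary → Alarm ← Earthquake: marginally B and E are independent, but once the alarm is observed they compete to explain it ("explaining away"). A sketch, with helper names of my own:

```python
from itertools import product

# CPTs of the head-to-head fragment Burglary -> Alarm <- Earthquake.
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}

def joint(b, e, a):
    pb = P_B if b else 1 - P_B
    pe = P_E if e else 1 - P_E
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    return pb * pe * pa

# Alarm unobserved: summing it out factorizes P(b, e) as P(b) P(e).
p_be = sum(joint(True, True, a) for a in (True, False))
assert abs(p_be - P_B * P_E) < 1e-12

# Alarm observed (a = true): compare P(b | a) with P(b | a, e).
p_a = sum(joint(b, e, True) for b, e in product((True, False), repeat=2))
p_b_given_a = sum(joint(True, e, True) for e in (True, False)) / p_a
p_b_given_a_e = joint(True, True, True) / sum(joint(b, True, True)
                                              for b in (True, False))
print(p_b_given_a > p_b_given_a_e)   # True: the earthquake explains away the alarm
```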






D-separation

• Let A, B, and C be arbitrary nonintersecting sets of nodes
• A path from A to B is blocked if it includes either:
    a head-to-tail or tail-to-tail node that is in C, or
    a head-to-head node such that neither the node nor any of its descendants is in C
• A is d-separated from B by C if every path from A to B is blocked

    Example (a → e ← f, e → c, f → b):
      conditioning on c leaves the path open:  a ⊥ b | c does not hold
      conditioning on f blocks the path:       a ⊥ b | f holds
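The blocking rules can be turned into a small reachability check. This is a sketch following the standard reachable-nodes algorithm (Koller & Friedman, Alg. 3.1), and the example edges a → e ← f, e → c, f → b are my reading of the slide's figure:

```python
# graph: dict mapping each node to the list of its parents.
def d_separated(graph, x, y, z):
    z = set(z)
    children = {n: [c for c in graph if n in graph[c]] for n in graph}
    # Ancestors of the observed set (needed for the head-to-head rule).
    anc, stack = set(z), list(z)
    while stack:
        for p in graph[stack.pop()]:
            if p not in anc:
                anc.add(p)
                stack.append(p)
    # Search over (node, direction): 'up' = trail arrived from a child,
    # 'down' = trail arrived from a parent.
    visited, reachable = set(), set()
    frontier = [(x, 'up')]
    while frontier:
        node, d = frontier.pop()
        if (node, d) in visited:
            continue
        visited.add((node, d))
        if node not in z:
            reachable.add(node)
        if d == 'up' and node not in z:
            frontier += [(p, 'up') for p in graph[node]]
            frontier += [(c, 'down') for c in children[node]]
        elif d == 'down':
            if node not in z:              # tail-to-tail / head-to-tail: pass through
                frontier += [(c, 'down') for c in children[node]]
            if node in anc:                # head-to-head node made active by C
                frontier += [(p, 'up') for p in graph[node]]
    return y not in reachable

# The example graph from the slide: a -> e <- f, e -> c, f -> b.
g = {'a': [], 'f': [], 'e': ['a', 'f'], 'c': ['e'], 'b': ['f']}
print(d_separated(g, 'a', 'b', {'c'}))   # False: path unblocked by observing c
print(d_separated(g, 'a', 'b', {'f'}))   # True: f blocks the path
```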



Conditional Independence Relations

• A node is conditionally independent of its non-descendants, given its parents

• A node is conditionally independent of all other nodes, given its
  Markov blanket*

• In general, d-separation is used for deciding independence

  (Figure: node X with parents U1,…,Um, children Y1,…,Yn, and the
   children’s other parents Z1j,…,Znj)

  * Markov blanket: parents, children, and children’s other parents




Probabilistic Inference In Bayesian Networks

• Notation
    X: the query variable
    E: the set of evidence variables, E1,…,Em
    e: a particular observed assignment to the evidence variables
• Compute the posterior probability distribution P(X | e)
• Exact inference
    Inference by enumeration
    Variable elimination algorithm
• Approximate inference
    Direct sampling methods
    Markov chain Monte Carlo (MCMC) algorithm

Exact Inference In Bayesian Networks

Inference By Enumeration
• P(X | e) = α P(X, e) = α Σ_y P(X, e, y), where y ranges over the hidden variables
• Recall,
    P(x1, …, xn) = ∏_{i=1}^{n} P(xi | parents(Xi))
• Computing sums of products of conditional probabilities
• In the Burglary example (B → A ← E, A → J, A → M):
    P(B | j, m) = α P(B, j, m) = α Σ_e Σ_a P(B, e, a, j, m)
    P(b | j, m) = α Σ_e Σ_a P(b) P(e) P(a | b, e) P(j | a) P(m | a)
                = α P(b) Σ_e P(e) Σ_a P(a | b, e) P(j | a) P(m | a)
• O(2^n) time complexity for n Boolean variables
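The enumeration above is a handful of lines in Python; a sketch using the CPT values from the burglary slide (helper names are my own):

```python
from itertools import product

# CPTs of the burglary network.
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def joint(b, e, a, j, m):
    def pr(p, v):                 # P(var = v) from P(var = true)
        return p if v else 1 - p
    return (pr(P_B, b) * pr(P_E, e) * pr(P_A[(b, e)], a)
            * pr(P_J[a], j) * pr(P_M[a], m))

# P(B | j, m) = alpha * sum over the hidden variables e and a.
unnorm = [sum(joint(b, e, a, True, True)
              for e, a in product((True, False), repeat=2))
          for b in (True, False)]
p_b = unnorm[0] / sum(unnorm)
print(round(p_b, 3))   # 0.284
```

The answer P(b | j, m) ≈ 0.284 is the standard result for this query; the nested sums are exactly the Σ_e Σ_a in the slide's formula.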


Exact Inference In Bayesian Networks

Variable Elimination Algorithm
• Eliminates the repeated calculations of enumeration
    P(B | j, m) = α P(B) Σ_e P(e) Σ_a P(a | B, e) P(j | a) P(m | a)

    (Figure: the enumeration tree evaluates the products P(j | a) P(m | a)
     once for every value of e; these are the repeated calculations)


Exact Inference In Bayesian Networks

Variable Elimination Algorithm (cont’d)
• Evaluating the expression in right-to-left order (bottom-up)
    P(B | j, m) = α P(B) Σ_e P(e) Σ_a P(a | B, e) P(j | a) P(m | a)
• Each part of the expression makes a factor
    f_J(A) = ( P(j | a)  )      f_M(A) = ( P(m | a)  )
             ( P(j | ¬a) )               ( P(m | ¬a) )
• Pointwise product
    f_JM(A) = ( P(j | a) P(m | a)   )
              ( P(j | ¬a) P(m | ¬a) )

    f_ĀJM(B, E) = Σ_a f_A(a, B, E) × f_J(a) × f_M(a)
    f_ĒĀJM(B)   = Σ_e f_E(e) × f_ĀJM(B, e)
    P(B | j, m) = α f_B(B) × f_ĒĀJM(B)
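A minimal factor representation makes the pointwise products and the summing-out concrete. This is a sketch, not the textbook's data structures: a factor is a (variable-list, table) pair, with evidence j = m = true already absorbed into f_J and f_M:

```python
from itertools import product

def pointwise_product(f, g):
    fv, ft = f
    gv, gt = g
    vs = list(dict.fromkeys(fv + gv))          # union of variables, order kept
    table = {}
    for vals in product((True, False), repeat=len(vs)):
        env = dict(zip(vs, vals))
        table[vals] = (ft[tuple(env[v] for v in fv)]
                       * gt[tuple(env[v] for v in gv)])
    return (vs, table)

def sum_out(var, f):
    fv, ft = f
    vs = [v for v in fv if v != var]
    table = {}
    for vals, p in ft.items():
        key = tuple(v for v, name in zip(vals, fv) if name != var)
        table[key] = table.get(key, 0.0) + p
    return (vs, table)

# Factors for P(B | j, m) on the burglary network.
fB = (['B'], {(True,): 0.001, (False,): 0.999})
fE = (['E'], {(True,): 0.002, (False,): 0.998})
fA = (['A', 'B', 'E'], {(a, b, e): (p if a else 1 - p)
      for (b, e), p in {(True, True): 0.95, (True, False): 0.94,
                        (False, True): 0.29, (False, False): 0.001}.items()
      for a in (True, False)})
fJ = (['A'], {(True,): 0.90, (False,): 0.05})    # P(j | A)
fM = (['A'], {(True,): 0.70, (False,): 0.01})    # P(m | A)

f = sum_out('A', pointwise_product(pointwise_product(fA, fJ), fM))  # f_AJM(B, E)
f = sum_out('E', pointwise_product(fE, f))                          # f_EAJM(B)
_, table = pointwise_product(fB, f)
total = sum(table.values())
print(round(table[(True,)] / total, 3))   # 0.284
```

Each sum_out shrinks the factor before the next product, which is what avoids the repeated P(j | a) P(m | a) work in plain enumeration.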

Exact Inference In Bayesian Networks

Variable Elimination Algorithm (cont’d)
• Repeatedly remove any leaf node that is not a query variable or an
  evidence variable
• In the Burglary example, for P(J | B = true):
    P(J | b) = α P(b) Σ_e P(e) Σ_a P(a | b, e) P(J | a) Σ_m P(m | a)
             = α P(b) Σ_e P(e) Σ_a P(a | b, e) P(J | a),  since Σ_m P(m | a) = 1
    (M is irrelevant to this query, so its factor sums out to 1)
• Time and space complexity
    Dominated by the size of the largest factor
    In the worst case, exponential time and space complexity





Approximate Inference In Bayesian Networks

Direct Sampling Methods
• Generates samples from a known probability distribution
• Samples each variable in topological order
• function Prior-Sample(bn) returns an event sampled from the prior specified by bn
     inputs: bn, a Bayesian network specifying joint distribution P(X1,…,Xn)

     x ← an event with n elements
     for i = 1 to n do
        xi ← a random sample from P(Xi | parents(Xi))
     return x

• S_PS(x1,…,xn): the probability that Prior-Sample generates the specific event x1,…,xn
    S_PS(x1,…,xn) = ∏_{i=1}^{n} P(xi | parents(Xi)) = P(x1,…,xn)
    lim_{N→∞} N_PS(x1,…,xn) / N = S_PS(x1,…,xn) = P(x1,…,xn)   (consistent estimate)
      where N_PS(x1,…,xn) is the frequency of the event x1,…,xn among N samples
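Prior-Sample for the burglary network is just sampling in topological order B, E, A, J, M; a sketch with a frequency check of the consistency property (the choice of checking P(m) is mine):

```python
import random

random.seed(0)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}

def prior_sample():
    # Sample each variable given its already-sampled parents.
    b = random.random() < 0.001
    e = random.random() < 0.002
    a = random.random() < P_A[(b, e)]
    j = random.random() < (0.90 if a else 0.05)
    m = random.random() < (0.70 if a else 0.01)
    return b, e, a, j, m

# Consistency: the frequency of an event approaches its joint probability.
N = 200_000
n_m = sum(1 for _ in range(N) if prior_sample()[4])
print(n_m / N)   # should be close to P(m) ~= 0.0117
```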

Approximate Inference In Bayesian Networks

Rejection Sampling Methods
• Rejects samples that are inconsistent with the evidence
• Estimates by counting how often X = x occurs among the remaining samples
    P̂(X | e) = α N_PS(X, e) = N_PS(X, e) / N_PS(e)
             ≈ P(X, e) / P(e) = P(X | e)   (consistent estimate)
• The fraction of samples rejected grows exponentially as the number of
  evidence variables grows, so the method is unusable for large evidence sets
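A sketch of rejection sampling on the burglary network, estimating P(J | m = true) by discarding every sample in which MaryCalls is false (the query choice is mine, picked so enough samples survive):

```python
import random

random.seed(1)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}

def prior_sample():
    b = random.random() < 0.001
    e = random.random() < 0.002
    a = random.random() < P_A[(b, e)]
    j = random.random() < (0.90 if a else 0.05)
    m = random.random() < (0.70 if a else 0.01)
    return b, e, a, j, m

N = 200_000
kept = [s for s in (prior_sample() for _ in range(N)) if s[4]]  # keep m = true
estimate = sum(1 for s in kept if s[3]) / len(kept)             # P(j | m)
print(len(kept), round(estimate, 2))
```

Only about 1% of the samples survive here because P(m) is small; with several evidence variables the survival rate shrinks multiplicatively, which is the exponential-rejection problem stated above.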





Approximate Inference In Bayesian Networks

Likelihood weighting
• Generates only events consistent with the evidence
    Fixes the values of the evidence variables E
    Samples only the remaining variables X and Y
• function Likelihood-Weighting(X, e, bn, N) returns an estimate of P(X|e)
    local variables: W, a vector of weighted counts over X, initially zero
    for j = 1 to N do
       x, w ← Weighted-Sample(bn, e)
       W[x] ← W[x] + w, where x is the value of X in x
    return Normalize(W[X])

  function Weighted-Sample(bn, e) returns an event and a weight
    x ← an event with n elements; w ← 1
    for i = 1 to n do
        if Xi has a value xi in e
             then w ← w × P(Xi = xi | parents(Xi))
             else xi ← a random sample from P(Xi | parents(Xi))
    return x, w



Approximate Inference In Bayesian Networks

Likelihood weighting (cont’d)
• Sampling distribution S_WS of Weighted-Sample
    S_WS(z, e) = ∏_{i=1}^{l} P(zi | parents(Zi)), where Z = {X} ∪ Y
• The likelihood weight w(z, e)
    w(z, e) = ∏_{i=1}^{m} P(ei | parents(Ei))
• Weighted probability of a sample
    S_WS(z, e) w(z, e) = ∏_{i=1}^{l} P(zi | parents(Zi)) ∏_{i=1}^{m} P(ei | parents(Ei))
                       = P(z, e)
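Putting Weighted-Sample and the weighted counts together for the burglary query P(B | j, m); a sketch in which the evidence weights are hard-coded rather than looked up from a generic network structure:

```python
import random

random.seed(2)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}

def weighted_sample():
    # Sample only the nonevidence variables; weight by the evidence likelihood.
    w = 1.0
    b = random.random() < 0.001
    e = random.random() < 0.002
    a = random.random() < P_A[(b, e)]
    w *= 0.90 if a else 0.05        # evidence J = true
    w *= 0.70 if a else 0.01        # evidence M = true
    return b, w

N = 500_000
W = {True: 0.0, False: 0.0}         # weighted counts over the query variable B
for _ in range(N):
    b, w = weighted_sample()
    W[b] += w
estimate = W[True] / (W[True] + W[False])
print(round(estimate, 2))   # close to the exact posterior P(b | j, m) ~= 0.284
```

Every sample is kept, unlike rejection sampling, but most weight is carried by the rare samples with Alarm = true, so the estimate is noisier than its sample count suggests.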





Approximate Inference In Bayesian Networks

Markov Chain Monte Carlo Algorithm

• Generates each event by a random change to one of the nonevidence variables Zi
• Zi is sampled conditioned on the current values of the variables in the
  Markov blanket of Zi
• A state specifies a value for every variable
• The long-run fraction of time spent in each state is proportional to its
  posterior probability, so the counts converge to P(X | e)
• function MCMC-Ask(X, e, bn, N) returns an estimate of P(X|e)
    local variables: N[X], a vector of counts over X, initially zero
                     Z, the nonevidence variables in bn
                     x, the current state of the network, initially copied from e
    initialize x with random values for the variables in Z
    for j = 1 to N do
       for each Zi in Z do
          sample the value of Zi in x from P(Zi | mb(Zi)), given the values of mb(Zi) in x
          N[x] ← N[x] + 1, where x is the value of X in x
    return Normalize(N[X])



Approximate Inference In Bayesian Networks

Markov Chain Monte Carlo Algorithm (cont’d)

• Markov chain on the state space
    q(x → x′): the probability of a transition from state x to state x′
• Consistency
    Let x̄i be all the hidden variables other than Xi
    q(x → x′) = q((xi, x̄i) → (xi′, x̄i)) = P(xi′ | x̄i, e), called the Gibbs sampler
    A distribution is stationary for the Markov chain if the chain satisfies
      detailed balance with respect to it
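A Gibbs sampler for P(B | j, m) on the burglary network; a sketch in which each nonevidence variable is resampled from its full conditional, computed by renormalizing the joint with everything else fixed (equivalent, in this small network, to conditioning on the Markov blanket):

```python
import random

random.seed(3)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}

def joint(b, e, a):
    # Joint with the evidence j = m = true folded in.
    pb = 0.001 if b else 0.999
    pe = 0.002 if e else 0.998
    pa = P_A[(b, e)] if a else 1 - P_A[(b, e)]
    pj = 0.90 if a else 0.05
    pm = 0.70 if a else 0.01
    return pb * pe * pa * pj * pm

state = {'B': False, 'E': False, 'A': False}
counts = {True: 0, False: 0}
for _ in range(100_000):
    for var in ('B', 'E', 'A'):
        vals = {**state}
        p = {}
        for v in (True, False):
            vals[var] = v
            p[v] = joint(vals['B'], vals['E'], vals['A'])
        # Sample var from its full conditional P(var | everything else, e).
        state[var] = random.random() < p[True] / (p[True] + p[False])
    counts[state['B']] += 1
print(counts[True] / 100_000)   # long-run fraction, close to P(b | j, m) ~= 0.284
```

The state space here has only eight states, so the chain mixes quickly; no burn-in is discarded in this sketch, which introduces only a negligible bias at this run length.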







Summary

• Bayesian network
    Directed acyclic graph expressing causal relationship
• Conditional independence
    D-separation property
• Inference in Bayesian network
    Enumeration: intractable
    Variable elimination: efficient, but sensitive to topology
    Direct sampling: estimate posterior probabilities
    MCMC algorithm: powerful method for computing with
     probability models





References

[1] S. Russell and P. Norvig, “Probabilistic Reasoning”, Artificial Intelligence:
     A Modern Approach, Chapter 14, pp. 492–519
[2] E. Charniak, “Bayesian Networks without Tears”, AI Magazine, 1991
[3] C. M. Bishop, “Graphical Models”, Pattern Recognition and Machine Learning,
     Chapter 8, pp. 359–418







Q&A

• Thank you







Appendix 1. Example of Bad Node Ordering

• Two more links and unnatural probability judgments result

    (Figure: nodes added in the order ① MaryCalls, ② JohnCalls, ③ Alarm,
     ④ Burglary, ⑤ Earthquake)







Appendix 2. Consistency of Likelihood Weighting

• P̂(x | e) = α Σ_y N_WS(x, y, e) w(x, y, e)    from Likelihood-Weighting
           ≈ α′ Σ_y S_WS(x, y, e) w(x, y, e)    for large N
           = α′ Σ_y P(x, y, e)
           = α′ P(x, e)
           = P(x | e)                           (consistent estimate)







Appendix 3. State Distribution of MCMC

• Detailed balance
    Let πt(x) be the probability of the system being in state x at time t
    π(x) q(x → x′) = π(x′) q(x′ → x) for all x, x′
• For the Gibbs sampler, q(x → x′) = q((xi, x̄i) → (xi′, x̄i)) = P(xi′ | x̄i, e):
    π(x) q(x → x′) = P(x | e) P(xi′ | x̄i, e) = P(xi, x̄i | e) P(xi′ | x̄i, e)
                   = P(xi | x̄i, e) P(x̄i | e) P(xi′ | x̄i, e)   by the chain rule on P(xi, x̄i | e)
                   = P(xi | x̄i, e) P(xi′, x̄i | e)              by the chain rule backwards
                   = q(x′ → x) π(x′)
• π is a stationary distribution if πt = πt+1
    πt+1(x′) = Σ_x πt(x) q(x → x′) = Σ_x π(x′) q(x′ → x)   by detailed balance
             = π(x′) Σ_x q(x′ → x) = π(x′)



MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamUiPathCommunity
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistandanishmna97
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesrafiqahmad00786416
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfOrbitshub
 

Último (20)

EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 

Jylee probabilistic reasoning with bayesian networks

  • 1. Probabilistic Reasoning in Bayesian Networks
       KAIST AIPR Lab.
       Jung-Yeol Lee
       17th June 2010
  • 2. Contents
       - Backgrounds
       - Bayesian Network
       - Semantics of Bayesian Network
       - D-Separation
       - Conditional Independence Relations
       - Probabilistic Inference in Bayesian Networks
       - Summary
  • 3. Backgrounds
       - Bayes' rule
         From the product rule, P(X ∧ Y) = P(X|Y) P(Y) = P(Y|X) P(X)
         P(Y|X) = P(X|Y) P(Y) / P(X) = α P(X|Y) P(Y), where α is the normalization constant
         Combining evidence e: P(Y|X, e) = P(X|Y, e) P(Y|e) / P(X|e)
       - Conditional independence
         P(X, Y | Z) = P(X|Z) P(Y|Z) when X ⊥ Y | Z
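The normalization step above can be sketched in a few lines of Python. The numbers for the prior and likelihood are made up for illustration; only the α-normalization pattern comes from the slide.

```python
# A minimal sketch of Bayes' rule with a normalization constant alpha.
# The prior and likelihood values below are illustrative, not from the slides.
p_y = {True: 0.01, False: 0.99}          # prior P(Y)
p_x_given_y = {True: 0.9, False: 0.1}    # likelihood P(X=true | Y)

# Unnormalized posterior: P(X|Y) P(Y) for each value of Y
unnorm = {y: p_x_given_y[y] * p_y[y] for y in (True, False)}
alpha = 1.0 / sum(unnorm.values())       # normalization constant
posterior = {y: alpha * v for y, v in unnorm.items()}
print(posterior[True])                   # ≈ 0.0833
```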
  • 4. Bayesian Network
       - Causal relationships among random variables
       - Directed acyclic graph
         Nodes Xi: random variables
         Directed links: probabilistic relationships between variables
         Acyclic: no links from any node back to an earlier node
       - A link from node X to node Y makes X a Parent(Y)
       - Conditional probability distribution of Xi
         P(Xi | Parents(Xi))
         Quantifies the effect of the parents on the node Xi
  • 5. Example of Bayesian Network
       - Burglary network
         Graph: Burglary → Alarm ← Earthquake; Alarm → JohnCalls; Alarm → MaryCalls
       - Conditional Probability Tables (CPTs)
         P(B) = 0.001, P(E) = 0.002

         B  E  P(A|B,E)        A  P(J|A)        A  P(M|A)
         T  T    0.95          T   0.90         T   0.70
         T  F    0.94          F   0.05         F   0.01
         F  T    0.29
         F  F    0.001
       - JohnCalls is directly influenced only by Alarm:
         P(J | M ∧ A ∧ E ∧ B) = P(J | A)
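The burglary network's CPTs can be written as plain Python dicts; the dict layout (mapping parent values to P(var=true | parents)) is this sketch's own convention. As a check, conditioning J on all the other variables numerically reduces to conditioning on A alone, as the slide claims.

```python
# CPTs of the burglary network, as P(var=True | parents).
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def joint(b, e, a, j, m):
    """Full joint as the product of CPT entries."""
    return ((P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
            * (P_A[(b, e)] if a else 1 - P_A[(b, e)])
            * (P_J[a] if j else 1 - P_J[a])
            * (P_M[a] if m else 1 - P_M[a]))

# P(J | M, A, E, B) = P(J | A): the other variables cancel in the ratio.
b, e, a, m = True, False, True, True
p_j_given_all = joint(b, e, a, True, m) / (joint(b, e, a, True, m)
                                           + joint(b, e, a, False, m))
print(round(p_j_given_all, 6))   # 0.9, i.e. exactly P_J[True]
```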
  • 6. Semantics of Bayesian Network
       - Full joint probability distribution
         Notation: P(x1, …, xn) abbreviates P(X1 = x1 ∧ … ∧ Xn = xn)
         P(x1, …, xn) = ∏_{i=1..n} P(xi | parents(Xi)),
         where parents(Xi) denotes the specific values of the variables in Parents(Xi)
       - Constructing Bayesian networks
         P(x1, …, xn) = ∏_{i=1..n} P(xi | xi−1, …, x1) by the chain rule
         For every variable Xi in the network:
           P(Xi | Xi−1, …, X1) = P(Xi | Parents(Xi)) provided that Parents(Xi) ⊆ {Xi−1, …, X1}
       - Correctness
         Choose parents for each node such that this property holds
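The factorization above can be evaluated directly. A small sketch, computing the joint probability of the event (j, m, a, ¬b, ¬e) in the burglary network as the product of its CPT entries:

```python
# P(x1,...,xn) = prod_i P(xi | parents(Xi)), evaluated for (j, m, a, ¬b, ¬e).
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

b, e, a, j, m = False, False, True, True, True
joint = ((P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
         * (P_A[(b, e)] if a else 1 - P_A[(b, e)])
         * (P_J[a] if j else 1 - P_J[a])
         * (P_M[a] if m else 1 - P_M[a]))
print(round(joint, 8))   # 0.00062811
```

That is 0.999 · 0.998 · 0.001 · 0.9 · 0.7: both callers "ring", the alarm sounded, and there was neither a burglary nor an earthquake.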
  • 7. Semantics of Bayesian Network (cont'd)
       - Compactness
         Locally structured system: each component interacts directly with only a bounded number of others
         A complete network is specified by n·2^k conditional probabilities when each node has at most k parents
       - Node ordering
         Add "root causes" first
         Then add the variables they influence, and so on
         Until reaching the "leaves": nodes with no direct causal influence on the others
  • 8. Three Examples of 3-Node Graphs: Tail-to-Tail Connection
       - Node c is said to be tail-to-tail (a ← c → b)
         P(a, b) = Σ_c P(a|c) P(b|c) P(c), so a ⊥̸ b | ∅
         P(a, b | c) = P(a, b, c) / P(c) = P(a|c) P(b|c), so a ⊥ b | c
       - When node c is observed:
         Node c blocks the path from a to b
         Variables a and b are conditionally independent
  • 9. Three Examples of 3-Node Graphs: Head-to-Tail Connection
       - Node c is said to be head-to-tail (a → c → b)
         P(a, b) = P(a) Σ_c P(c|a) P(b|c) = P(a) P(b|a), so a ⊥̸ b | ∅
         P(a, b | c) = P(a, b, c) / P(c) = P(a) P(c|a) P(b|c) / P(c) = P(a|c) P(b|c), so a ⊥ b | c
       - When node c is observed:
         Node c blocks the path from a to b
         Variables a and b are conditionally independent
  • 10. Three Examples of 3-Node Graphs: Head-to-Head Connection
        - Node c is said to be head-to-head (a → c ← b)
          P(a, b, c) = P(a) P(b) P(c|a, b)
          Σ_c P(a, b, c) = P(a, b) = P(a) P(b), so a ⊥ b | ∅
          P(a, b | c) = P(a, b, c) / P(c) = P(a) P(b) P(c|a, b) / P(c) ≠ P(a|c) P(b|c) in general, so a ⊥̸ b | c
        - When node c is unobserved:
          Node c blocks the path from a to b
          Variables a and b are independent
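The head-to-head behavior can be checked numerically. A sketch with made-up CPT numbers: a and b come out exactly independent marginally, but dependent once c is observed.

```python
from itertools import product

# Head-to-head check: joint is P(a) P(b) P(c|a,b). CPT numbers are illustrative.
Pa, Pb = 0.3, 0.6
Pc = {(True, True): 0.9, (True, False): 0.2,
      (False, True): 0.5, (False, False): 0.1}

def joint(a, b, c):
    pa = Pa if a else 1 - Pa
    pb = Pb if b else 1 - Pb
    pc = Pc[(a, b)] if c else 1 - Pc[(a, b)]
    return pa * pb * pc

# Marginally, summing out c factorizes P(a, b) as P(a) P(b):
p_ab = sum(joint(True, True, c) for c in (True, False))
print(abs(p_ab - Pa * Pb) < 1e-12)        # True: a ⊥ b | ∅

# Conditioning on c couples them: P(a, b | c) ≠ P(a | c) P(b | c)
pc_true = sum(joint(a, b, True) for a, b in product((True, False), repeat=2))
p_ab_c = joint(True, True, True) / pc_true
p_a_c = sum(joint(True, b, True) for b in (True, False)) / pc_true
p_b_c = sum(joint(a, True, True) for a in (True, False)) / pc_true
print(abs(p_ab_c - p_a_c * p_b_c) > 1e-3) # True: a ⊥̸ b | c
```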
  • 11. D-Separation
        - Let A, B, and C be arbitrary nonintersecting sets of nodes
        - A path from A to B is blocked if it includes either:
          a head-to-tail or tail-to-tail node that is in C, or
          a head-to-head node such that neither the node nor any of its descendants is in C
        - A is d-separated from B by C if every path from A to B is blocked
        - Example (graph a → e ← f → b, with e → c):
          Given c, the head-to-head node e is unblocked (its descendant c is observed), so a ⊥̸ b | c
          Given f, the tail-to-tail node f blocks the path, so a ⊥ b | f
  • 12. Conditional Independence Relations
        - A node is conditionally independent of its non-descendants, given its parents
        - A node is conditionally independent of all other nodes, given its Markov blanket*
        - In general, d-separation is used for deciding independence
        (Figures: node X with parents U1…Um, children Y1…Yn, and children's other parents Z1j…Znj)
        * Markov blanket: parents, children, and children's other parents
  • 13. Probabilistic Inference in Bayesian Networks
        - Notation
          X: the query variable
          E: the set of evidence variables E1, …, Em
          e: a particular observed event
        - Goal: compute the posterior probability distribution P(X | e)
        - Exact inference
          Inference by enumeration
          Variable elimination algorithm
        - Approximate inference
          Direct sampling methods
          Markov chain Monte Carlo (MCMC) algorithm
  • 14. Exact Inference in Bayesian Networks: Inference by Enumeration
        - P(X | e) = α P(X, e) = α Σ_y P(X, e, y), where y ranges over the hidden variables
        - Recall: P(x1, …, xn) = ∏_{i=1..n} P(xi | parents(Xi))
        - Computing sums of products of conditional probabilities
        - In the burglary example:
          P(B | j, m) = α P(B, j, m) = α Σ_e Σ_a P(B, e, a, j, m)
          P(b | j, m) = α Σ_e Σ_a P(b) P(e) P(a|b, e) P(j|a) P(m|a)
                      = α P(b) Σ_e P(e) Σ_a P(a|b, e) P(j|a) P(m|a)
        - O(2^n) time complexity for n Boolean variables
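The enumeration above can be sketched directly in Python. The helper `pt` and the dict-based CPTs are this sketch's own conventions, not from the slides; the sums follow the expression for P(B | j, m).

```python
from itertools import product

# Enumeration over the hidden variables E and A for P(B | j, m).
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def pt(p, v):
    """P(var = v) given p = P(var = True)."""
    return p if v else 1 - p

unnorm = {}
for b in (True, False):
    total = 0.0
    for e, a in product((True, False), repeat=2):
        total += (pt(P_B, b) * pt(P_E, e) * pt(P_A[(b, e)], a)
                  * P_J[a] * P_M[a])      # j and m are observed true
    unnorm[b] = total
alpha = 1.0 / sum(unnorm.values())
print(round(alpha * unnorm[True], 3))     # 0.284
```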
  • 15. Exact Inference in Bayesian Networks: Variable Elimination Algorithm
        - Eliminates the repeated calculations of enumeration
          P(B | j, m) = α P(B) Σ_e P(e) Σ_a P(a|B, e) P(j|a) P(m|a)
        (The slide's evaluation tree shows the P(j|a) P(m|a) subexpressions recomputed
         for every value of e; variable elimination evaluates each factor only once)
  • 16. Variable Elimination Algorithm (cont'd)
        - Evaluate in right-to-left order (bottom-up):
          P(B | j, m) = α P(B) Σ_e P(e) Σ_a P(a|B, e) P(j|a) P(m|a)
        - Each part of the expression becomes a factor:
          f_M(A) = [P(m|a), P(m|¬a)],  f_J(A) = [P(j|a), P(j|¬a)]
        - Pointwise product
          f_JM(A) = f_J(A) × f_M(A) = [P(j|a) P(m|a), P(j|¬a) P(m|¬a)]
          f_ĀJM(B, E) = Σ_a f_A(a, B, E) × f_J(a) × f_M(a)
          f_ĒĀJM(B) = Σ_e f_E(e) × f_ĀJM(B, e)
          P(B | j, m) = α f_B(B) × f_ĒĀJM(B)
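The factor operations above can be sketched with plain dicts standing in for factors; the names `f_AJM` and `f_EAJM` mirror the slide's factors, while the dict layout is this sketch's own. The result matches enumeration.

```python
from itertools import product

# Variable elimination for P(B | j, m) with dict-based factors.
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}
T_F = (True, False)

# f_AJM(B, E) = sum_a P(a | B, E) P(j | a) P(m | a)  (pointwise product, sum out A)
f_AJM = {(b, e): sum((P_A[(b, e)] if a else 1 - P_A[(b, e)]) * P_J[a] * P_M[a]
                     for a in T_F)
         for b, e in product(T_F, repeat=2)}

# f_EAJM(B) = sum_e P(e) f_AJM(B, e)  (sum out E)
f_EAJM = {b: sum((P_E if e else 1 - P_E) * f_AJM[(b, e)] for e in T_F)
          for b in T_F}

# P(B | j, m) = alpha * P(B) * f_EAJM(B)
unnorm = {b: (P_B if b else 1 - P_B) * f_EAJM[b] for b in T_F}
alpha = 1.0 / sum(unnorm.values())
print(round(alpha * unnorm[True], 3))   # 0.284, the same answer as enumeration
```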
  • 17. Variable Elimination Algorithm (cont'd)
        - Repeatedly remove any leaf node that is neither a query variable nor an evidence variable
        - In the burglary example, for P(J | B = true):
          P(J | b) = α P(b) Σ_e P(e) Σ_a P(a|b, e) P(J|a) Σ_m P(m|a)
                   = α P(b) Σ_e P(e) Σ_a P(a|b, e) P(J|a)    (since Σ_m P(m|a) = 1, M is irrelevant)
        - Time and space complexity
          Dominated by the size of the largest factor
          In the worst case, exponential time and space
  • 18. Approximate Inference in Bayesian Networks: Direct Sampling Methods
        - Generate samples from a known probability distribution
        - Sample each variable in topological order

          function Prior-Sample(bn) returns an event sampled from the prior specified by bn
            inputs: bn, a Bayesian network specifying joint distribution P(X1, …, Xn)
            x ← an event with n elements
            for i = 1 to n do
              xi ← a random sample from P(Xi | parents(Xi))
            return x

        - S_PS(x1, …, xn): the probability of a specific event generated by Prior-Sample
          S_PS(x1, …, xn) = ∏_{i=1..n} P(xi | parents(Xi)) = P(x1, …, xn)
          lim_{N→∞} N_PS(x1, …, xn) / N = S_PS(x1, …, xn) = P(x1, …, xn)   (consistent estimate)
          where N_PS(x1, …, xn) is the frequency of the event x1, …, xn
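Prior-Sample can be sketched for the burglary network, where topological order is simply B, E, A, J, M. The seed and sample size are arbitrary choices of this sketch.

```python
import random

# Prior-Sample: draw each variable in topological order from P(Xi | parents(Xi)).
random.seed(0)
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def prior_sample():
    b = random.random() < P_B
    e = random.random() < P_E
    a = random.random() < P_A[(b, e)]
    j = random.random() < P_J[a]
    m = random.random() < P_M[a]
    return b, e, a, j, m

# Long-run event frequencies converge to the joint; estimate P(j) by counting.
n = 100_000
count_j = sum(prior_sample()[3] for _ in range(n))
print(count_j / n)   # the exact marginal P(j) is ≈ 0.0521
```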
  • 19. Approximate Inference in Bayesian Networks: Rejection Sampling
        - Reject samples that are inconsistent with the evidence
        - Estimate by counting how often X = x occurs:
          P̂(X | e) = α N_PS(X, e) = N_PS(X, e) / N_PS(e) ≈ P(X, e) / P(e) = P(X | e)   (consistent estimate)
        - Rejects samples exponentially as the number of evidence variables grows
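A sketch of rejection sampling for P(B | j, m): draw prior samples and keep only those consistent with the evidence J = true, M = true. The seed and N are arbitrary; note how few samples survive.

```python
import random

random.seed(1)
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def prior_sample():
    b = random.random() < P_B
    e = random.random() < P_E
    a = random.random() < P_A[(b, e)]
    j = random.random() < P_J[a]
    m = random.random() < P_M[a]
    return b, e, a, j, m

N = 200_000
# Keep only samples where both J and M came out true.
accepted = [s for s in (prior_sample() for _ in range(N)) if s[3] and s[4]]
estimate = sum(s[0] for s in accepted) / len(accepted)
print(len(accepted), round(estimate, 3))
```

Only about 0.2% of the samples are accepted, since P(j, m) ≈ 0.002: this is the exponential degradation the slide warns about as evidence variables accumulate.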
  • 20. Approximate Inference in Bayesian Networks: Likelihood Weighting
        - Generate only events consistent with the evidence
          Fix the values of the evidence variables E
          Sample only the remaining variables X and Y

          function Likelihood-Weighting(X, e, bn, N) returns an estimate of P(X|e)
            local variables: W, a vector of weighted counts over X, initially zero
            for i = 1 to N do
              x, w ← Weighted-Sample(bn, e)
              W[x] ← W[x] + w, where x is the value of X in x
            return Normalize(W[X])

          function Weighted-Sample(bn, e) returns an event and a weight
            x ← an event with n elements; w ← 1
            for i = 1 to n do
              if Xi has a value xi in e
                then w ← w · P(Xi = xi | parents(Xi))
                else xi ← a random sample from P(Xi | parents(Xi))
            return x, w
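The pseudocode above can be sketched for the burglary network with evidence J = true, M = true: the evidence variables contribute weight factors instead of being sampled. Seed and N are arbitrary choices of this sketch.

```python
import random

# Likelihood weighting for P(B | j, m).
random.seed(2)
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def weighted_sample():
    w = 1.0
    b = random.random() < P_B            # nonevidence: sampled
    e = random.random() < P_E
    a = random.random() < P_A[(b, e)]
    w *= P_J[a]                          # evidence J=true: weighted, not sampled
    w *= P_M[a]                          # evidence M=true
    return b, w

W = {True: 0.0, False: 0.0}
for _ in range(100_000):
    b, w = weighted_sample()
    W[b] += w
estimate = W[True] / (W[True] + W[False])
print(round(estimate, 3))   # the exact posterior P(b | j, m) is ≈ 0.284
```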
  • 21. Likelihood Weighting (cont'd)
        - Sampling distribution S_WS of Weighted-Sample:
          S_WS(z, e) = ∏_{i=1..l} P(zi | parents(Zi)), where Z = {X} ∪ Y
        - The likelihood weight:
          w(z, e) = ∏_{i=1..m} P(ei | parents(Ei))
        - Weighted probability of a sample:
          S_WS(z, e) w(z, e) = ∏_{i=1..l} P(zi | parents(Zi)) · ∏_{i=1..m} P(ei | parents(Ei)) = P(z, e)
  • 22. Approximate Inference in Bayesian Networks: Markov Chain Monte Carlo Algorithm
        - Generate each event by a random change to one of the nonevidence variables Zi
        - Zi is sampled conditioned on the current values of the variables in the Markov blanket of Zi
        - A state specifies a value for every variable
        - The long-run fraction of time spent in each state is proportional to its posterior probability, yielding P(X | e)

          function MCMC-Ask(X, e, bn, N) returns an estimate of P(X|e)
            local variables: N[X], a vector of counts over X, initially zero
              Z, the nonevidence variables in bn
              x, the current state of the network, initially copied from e
            initialize x with random values for the variables in Z
            for j = 1 to N do
              for each Zi in Z do
                sample the value of Zi in x from P(Zi | mb(Zi)), given the values of mb(Zi) in x
                N[x] ← N[x] + 1, where x is the value of X in x
            return Normalize(N[X])
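A sketch of the Gibbs-sampling variant of MCMC-Ask for P(B | j, m): each nonevidence variable is resampled from its distribution given its Markov blanket (here derived by hand from the burglary CPTs). Seed, initial state, and sweep count are arbitrary choices of this sketch; counts are kept per sweep rather than per variable update.

```python
import random

random.seed(3)
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def pt(p, v):
    return p if v else 1 - p

def bernoulli_from_odds(p_true, p_false):
    """Sample True with probability p_true / (p_true + p_false)."""
    return random.random() < p_true / (p_true + p_false)

b, e, a = False, False, True             # arbitrary initial state; J = M = true fixed
counts = {True: 0, False: 0}
for _ in range(50_000):
    # P(B | mb(B)) ∝ P(B) P(a | B, e)
    b = bernoulli_from_odds(P_B * pt(P_A[(True, e)], a),
                            (1 - P_B) * pt(P_A[(False, e)], a))
    # P(E | mb(E)) ∝ P(E) P(a | b, E)
    e = bernoulli_from_odds(P_E * pt(P_A[(b, True)], a),
                            (1 - P_E) * pt(P_A[(b, False)], a))
    # P(A | mb(A)) ∝ P(A | b, e) P(j | A) P(m | A)
    a = bernoulli_from_odds(P_A[(b, e)] * P_J[True] * P_M[True],
                            (1 - P_A[(b, e)]) * P_J[False] * P_M[False])
    counts[b] += 1
estimate = counts[True] / 50_000
print(round(estimate, 3))   # long-run fraction; the exact P(b | j, m) is ≈ 0.284
```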
  • 23. Markov Chain Monte Carlo Algorithm (cont'd)
        - Markov chain on the state space
          q(x → x′): the probability of a transition from state x to state x′
        - Consistency
          Let x̄i be the values of all the hidden variables other than Xi
          q(x → x′) = q((xi, x̄i) → (xi′, x̄i)) = P(xi′ | x̄i, e), called the Gibbs sampler
          The Markov chain has reached its stationary distribution if it satisfies detailed balance
  • 24. Summary
        - Bayesian network
          Directed acyclic graph expressing causal relationships
        - Conditional independence
          The d-separation property
        - Inference in Bayesian networks
          Enumeration: intractable
          Variable elimination: efficient, but sensitive to topology
          Direct sampling: estimates posterior probabilities
          MCMC: a powerful method for computing with probability models
  • 25. References
        [1] Stuart Russell and Peter Norvig, "Probabilistic Reasoning", Artificial Intelligence: A Modern Approach, Chapter 14, pp. 492–519
        [2] Eugene Charniak, "Bayesian Networks without Tears", 1991
        [3] C. Bishop, "Graphical Models", Pattern Recognition and Machine Learning, Chapter 8, pp. 359–418
  • 26. Q&A
        - Thank you
  • 27. Appendix 1. Example of Bad Node Ordering
        - Ordering: ① MaryCalls, ② JohnCalls, ③ Alarm, ④ Burglary, ⑤ Earthquake
        - Requires two more links and unnatural probability judgments
  • 28. Appendix 2. Consistency of Likelihood Weighting
          P̂(x | e) = α Σ_y N_WS(x, y, e) w(x, y, e)     from Likelihood-Weighting
                    ≈ α′ Σ_y S_WS(x, y, e) w(x, y, e)    for large N
                    = α′ Σ_y P(x, y, e)
                    = α′ P(x, e)
                    = P(x | e)                           (consistent estimate)
  • 29. Appendix 3. State Distribution of MCMC
        - Detailed balance
          Let πt(x) be the probability of the system being in state x at time t
          π(x) q(x → x′) = π(x′) q(x′ → x) for all x, x′
        - The Gibbs sampler q(x → x′) = q((xi, x̄i) → (xi′, x̄i)) = P(xi′ | x̄i, e) satisfies it:
          π(x) q(x → x′) = P(x | e) P(xi′ | x̄i, e)
                         = P(xi, x̄i | e) P(xi′ | x̄i, e)
                         = P(xi | x̄i, e) P(x̄i | e) P(xi′ | x̄i, e)   by the chain rule on P(xi, x̄i | e)
                         = P(xi | x̄i, e) P(xi′, x̄i | e)             by the backwards chain rule
                         = q(x′ → x) π(x′)
        - Stationary distribution: if πt = πt+1 = π, then
          πt+1(x′) = Σ_x πt(x) q(x → x′) = Σ_x π(x′) q(x′ → x)   by detailed balance
                   = π(x′) Σ_x q(x′ → x) = π(x′)