The Sum-Product Algorithm
        PRML 8.4.4

         Xuebin Ma
•   Factor graph

    •   undirected tree, directed tree, polytree (F8.43)

•   Goal:

    •   Obtain an efficient, exact inference algorithm for
        finding marginals

    •   Compute efficiently when several marginals are
        required
•   Calculate the marginal for a particular variable x

    Later we shall see how to modify the algorithm to incorporate evidence
    corresponding to observed variables. By definition, the marginal is obtained
    by summing the joint distribution over all variables except x, so that

        $p(x) = \sum_{\mathbf{x} \setminus x} p(\mathbf{x})$                      (8.61)

    where $\mathbf{x} \setminus x$ denotes the set of variables in $\mathbf{x}$
    with variable x omitted. The idea is to substitute for $p(\mathbf{x})$ using
    the factor graph expression (8.59) and then interchange summations and
    products in order to obtain an efficient algorithm. Consider the fragment of
    graph shown in Figure 8.46, in which we see that the tree structure of the
    graph allows us to partition the factors in the joint distribution into
    groups, with one group associated with each of the factor nodes that is a
    neighbour of the variable node x.

•   Joint distribution in the form of a product of factors

    We see that the joint distribution can be written as a product of the form

        $p(\mathbf{x}) = \prod_{s \in \mathrm{ne}(x)} F_s(x, X_s)$                (8.62)

    where ne(x) denotes the set of factor nodes that are neighbours of x, $X_s$
    denotes the set of all variables in the subtree connected to the variable
    node x via the factor node $f_s$, and $F_s(x, X_s)$ represents the product
    of all the factors in the group associated with factor $f_s$.

    Figure 8.46   A fragment of a factor graph illustrating the evaluation of
                  the marginal p(x).
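As a baseline for comparison, (8.61) can be evaluated directly by enumerating every joint configuration. The sketch below is illustrative only: the function name `brute_force_marginal` and the representation of each factor as a `(variables, table)` pair are assumptions, not something taken from the slides. Its cost grows exponentially with the number of variables, which is the cost the sum-product algorithm is designed to avoid.

```python
from itertools import product

def brute_force_marginal(factors, cardinalities, target):
    """Unnormalized marginal of `target` by direct summation, as in (8.61).

    factors: list of (variables, table) pairs, where table maps a tuple of
             states (ordered like `variables`) to a non-negative number.
    cardinalities: dict mapping each variable name to its number of states.
    """
    names = list(cardinalities)
    marginal = [0.0] * cardinalities[target]
    # Enumerate every joint configuration of all variables.
    for states in product(*(range(cardinalities[n]) for n in names)):
        assignment = dict(zip(names, states))
        weight = 1.0
        for variables, table in factors:
            weight *= table[tuple(assignment[v] for v in variables)]
        marginal[assignment[target]] += weight
    return marginal

# Hypothetical two-variable example with a single factor f(a, b).
f = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 3.0, (1, 1): 4.0}
marg_a = brute_force_marginal([(("a", "b"), f)], {"a": 2, "b": 2}, "a")
```

Here `marg_a[k]` is the sum of f(k, b) over both states of b.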
•   Sum and product (F 8.61, F 8.62 -> F 8.63)

    Substituting (8.62) into (8.61) and interchanging the sums and products, we
    obtain

        $p(x) = \prod_{s \in \mathrm{ne}(x)} \Big[ \sum_{X_s} F_s(x, X_s) \Big]
              = \prod_{s \in \mathrm{ne}(x)} \mu_{f_s \to x}(x)$                  (8.63)

    Here we have introduced a set of functions $\mu_{f_s \to x}(x)$, defined by

        $\mu_{f_s \to x}(x) \equiv \sum_{X_s} F_s(x, X_s)$                        (8.64)

    which can be viewed as messages from the factor nodes $f_s$ to the variable
    node x. We see that the required marginal p(x) is given by the product of
    all the incoming messages arriving at node x.

    •   F 8.64: message from factor node $f_s$ to variable node x
    •   F 8.63: product of all incoming messages arriving at node x
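A minimal worked instance of the interchange may help. Assume, purely for illustration, a variable x with just two neighbouring factors $f_1(x, a)$ and $f_2(x, b)$, so that the joint is $f_1(x, a) f_2(x, b)$ and each subtree contains a single leaf variable (the names $f_1$, $f_2$, a, b are hypothetical):

```latex
\begin{align*}
p(x) &= \sum_{a} \sum_{b} f_1(x, a)\, f_2(x, b) \\
     &= \Big[ \sum_{a} f_1(x, a) \Big] \Big[ \sum_{b} f_2(x, b) \Big]
      = \mu_{f_1 \to x}(x)\, \mu_{f_2 \to x}(x)
\end{align*}
```

If a and b each have K states, the first line sums K^2 terms for every value of x, whereas the second needs two sums of K terms and one multiplication.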
•   Evaluate these messages

    In order to evaluate these messages, we again turn to Figure 8.46 and note
    that each factor $F_s(x, X_s)$ is described by a factor (sub-)graph and so
    can itself be factorized. In particular, we can write

        $F_s(x, X_s) = f_s(x, x_1, \ldots, x_M) G_1(x_1, X_{s1}) \cdots G_M(x_M, X_{sM})$    (8.65)

    where, for convenience, we have denoted the variables associated with factor
    $f_s$, in addition to x, by $x_1, \ldots, x_M$. This factorization is
    illustrated in Figure 8.47. Note that the set of variables
    $\{x, x_1, \ldots, x_M\}$ is the set of variables on which the factor $f_s$
    depends, and so it can also be denoted $\mathbf{x}_s$, using the notation of
    (8.59).

    Figure 8.47   Illustration of the factorization of the subgraph associated
                  with factor node $f_s$.

    Substituting (8.65) into (8.64) we obtain

        $\mu_{f_s \to x}(x) = \sum_{x_1} \cdots \sum_{x_M} f_s(x, x_1, \ldots, x_M)
              \prod_{m \in \mathrm{ne}(f_s) \setminus x} \Big[ \sum_{X_{sm}} G_m(x_m, X_{sm}) \Big]$

        $\phantom{\mu_{f_s \to x}(x)} = \sum_{x_1} \cdots \sum_{x_M} f_s(x, x_1, \ldots, x_M)
              \prod_{m \in \mathrm{ne}(f_s) \setminus x} \mu_{x_m \to f_s}(x_m)$              (8.66)

    where $\mathrm{ne}(f_s)$ denotes the set of variable nodes that are
    neighbours of the factor node $f_s$, and $\mathrm{ne}(f_s) \setminus x$
    denotes the same set but with node x removed. Here we have defined the
    following messages from variable nodes to factor nodes

        $\mu_{x_m \to f_s}(x_m) \equiv \sum_{X_{sm}} G_m(x_m, X_{sm})$                      (8.67)

    •   F 8.67: message from variable node $x_m$ to factor node $f_s$

    We have therefore introduced two distinct kinds of message, those that go
    from factor nodes to variable nodes, denoted $\mu_{f \to x}(x)$, and those
    that go from variable nodes to factor nodes, denoted $\mu_{x \to f}(x)$. In
    each case, we see that messages passed along a link are always a function of
    the variable associated with the variable node that link connects to.
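The factor-to-variable update (8.66) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the presenter's implementation: the function name `factor_to_variable`, the dict-based factor table, and the representation of a message as a list indexed by state are all choices made here for the example.

```python
from itertools import product

def factor_to_variable(table, variables, cardinalities, target, incoming):
    """Message mu_{f -> target}(target) following (8.66).

    table: dict mapping a tuple of states (ordered like `variables`) to the
           value of the factor f.
    incoming: dict mapping every neighbour variable except `target` to its
              message mu_{x_m -> f}, given as a list indexed by state.
    """
    message = [0.0] * cardinalities[target]
    for states in product(*(range(cardinalities[v]) for v in variables)):
        assignment = dict(zip(variables, states))
        term = table[states]
        # Multiply by the incoming messages from all other neighbours.
        for v in variables:
            if v != target:
                term *= incoming[v][assignment[v]]
        # Accumulating over all configurations marginalizes x_1, ..., x_M.
        message[assignment[target]] += term
    return message

# Hypothetical factor f(x, x1) with a non-uniform incoming message from x1.
f = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 3.0, (1, 1): 4.0}
msg = factor_to_variable(f, ("x", "x1"), {"x": 2, "x1": 2}, "x",
                         {"x1": [0.5, 2.0]})
```

For x = 0 the message is 1.0 * 0.5 + 2.0 * 2.0, and for x = 1 it is 3.0 * 0.5 + 4.0 * 2.0.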
    The result (8.66) says that to evaluate the message sent by a factor node to
    a variable node along the link connecting them, take the product of the
    incoming messages along all other links coming into the factor node,
    multiply by the factor associated with that node, and then marginalize over
    all of the variables associated with the incoming messages. This is
    illustrated in Figure 8.47. It is important to note that a factor node can
    send a message to a variable node once it has received incoming messages
    from all other neighbouring variable nodes.

•   Evaluate messages from variable to factor by using the sub-graph
    factorization

    Figure 8.48   Illustration of the evaluation of the message sent by a
                  variable node to an adjacent factor node.

    Finally, we derive an expression for evaluating the messages from variable
    nodes to factor nodes, again by making use of the (sub-)graph factorization.
    From Figure 8.48, we see that the term $G_m(x_m, X_{sm})$ associated with
    node $x_m$ is given by a product of terms $F_l(x_m, X_{ml})$, each
    associated with one of the factor nodes $f_l$ that is linked to node $x_m$
    (excluding node $f_s$), so that

        $G_m(x_m, X_{sm}) = \prod_{l \in \mathrm{ne}(x_m) \setminus f_s} F_l(x_m, X_{ml})$    (8.68)

    where the product is taken over all neighbours of node $x_m$ except for node
    $f_s$. Note that each of the factors $F_l(x_m, X_{ml})$ represents a subtree
    of the original graph of precisely the same kind as introduced in (8.62).
    Substituting (8.68) into (8.67), we then obtain

        $\mu_{x_m \to f_s}(x_m) = \prod_{l \in \mathrm{ne}(x_m) \setminus f_s} \Big[ \sum_{X_{ml}} F_l(x_m, X_{ml}) \Big]
              = \prod_{l \in \mathrm{ne}(x_m) \setminus f_s} \mu_{f_l \to x_m}(x_m)$          (8.69)

    where we have used the definition (8.64) of the messages passed from factor
    nodes to variable nodes.

    •   F 8.68: product over the neighbours of node $x_m$ except for node $f_s$
    •   F 8.67 + F 8.68 -> F 8.69
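The variable-to-factor update (8.69) involves no summation at all, only an elementwise product. The following sketch assumes messages are stored as lists indexed by the state of $x_m$; the function name `variable_to_factor` and the factor names in the example are hypothetical.

```python
def variable_to_factor(incoming, exclude, n_states):
    """Message mu_{x_m -> f_s}(x_m) following (8.69).

    incoming: dict mapping each neighbouring factor name to its message
              mu_{f_l -> x_m}, given as a list indexed by the state of x_m.
    exclude: name of the recipient factor f_s, whose own message is left out.
    """
    # Start from the empty product, so a variable whose only neighbour is the
    # recipient sends a message of all ones.
    message = [1.0] * n_states
    for factor_name, mu in incoming.items():
        if factor_name == exclude:
            continue
        message = [m * v for m, v in zip(message, mu)]
    return message

# Hypothetical messages arriving at a binary variable from factors fa, fb, fc.
arriving = {"fa": [1.0, 2.0], "fb": [3.0, 4.0], "fc": [0.5, 0.5]}
to_fb = variable_to_factor(arriving, exclude="fb", n_states=2)
```

The message to `fb` combines only the messages from `fa` and `fc`, which is what prevents a factor's own information from being fed back to it.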
     •    Messages sent by leaf nodes (variable node and factor node)

          If the leaf node is a variable node, the message that it sends is

              $\mu_{x \to f}(x) = 1$

          and if the leaf node is a factor node, we see from (8.66) that the
          message sent should take the form

              $\mu_{f \to x}(x) = f(x)$                                           (8.71)

          Figure 8.49   The sum-product algorithm begins with messages sent by
                        the leaf nodes, which depend on whether the leaf node is
                        (a) a variable node, or (b) a factor node.

     •    Find marginals for every variable node introduced by John-san

     •    Sum-product algorithm

          Figure 8.50   The sum-product algorithm can be viewed purely in terms
                        of messages sent out by factor nodes to other factor
                        nodes. In this example, the outgoing message shown by
                        the blue arrow is obtained by taking the product of all
                        the incoming messages shown by green arrows, multiplying
                        by the factor $f_s$, and marginalizing over the
                        variables $x_1$ and $x_2$.
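Putting the two message types and the leaf cases together gives a recursive procedure for a single marginal. The sketch below is one possible arrangement under stated assumptions: the name `sum_product_marginal` and the `name -> (variables, table)` factor format are invented for this example, the graph is assumed to be a tree, and messages are recomputed on every call rather than cached, so it does not yet share work between different marginals.

```python
from itertools import product

def sum_product_marginal(factors, cardinalities, target):
    """Unnormalized marginal of `target` on a tree-structured factor graph.

    factors: dict name -> (variables, table), where variables is a tuple of
             variable names and table maps state tuples to factor values.
    """
    def var_to_factor(var, factor_name):
        # (8.69); a leaf variable has no other factors and sends all ones.
        message = [1.0] * cardinalities[var]
        for name, (variables, _) in factors.items():
            if name != factor_name and var in variables:
                incoming = factor_to_var(name, var)
                message = [m * v for m, v in zip(message, incoming)]
        return message

    def factor_to_var(factor_name, var):
        # (8.66); a leaf factor has no other variables and sends f(x), (8.71).
        variables, table = factors[factor_name]
        incoming = {v: var_to_factor(v, factor_name)
                    for v in variables if v != var}
        message = [0.0] * cardinalities[var]
        for states in product(*(range(cardinalities[v]) for v in variables)):
            term = table[states]
            for v, s in zip(variables, states):
                if v != var:
                    term *= incoming[v][s]
            message[states[variables.index(var)]] += term
        return message

    # (8.63): the marginal is the product of all messages arriving at target.
    return var_to_factor(target, factor_name=None)

# Hypothetical chain a - f1 - b with an extra leaf factor f2 attached to b.
example = {
    "f1": (("a", "b"), {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 3.0, (1, 1): 4.0}),
    "f2": (("b",), {(0,): 5.0, (1,): 6.0}),
}
sizes = {"a": 2, "b": 2}
marg_a = sum_product_marginal(example, sizes, "a")
marg_b = sum_product_marginal(example, sizes, "b")
```

Both marginals sum to the same normalization constant, which is a quick sanity check on any implementation.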




                   and indeed the notion of one node having a special status was introduced only as a
•    Normalization (undirected graph)

      • to get the normalization coefficient 1/Z:  $p(\mathbf{x}) = \tilde{p}(\mathbf{x}) / Z$

      • use sum-product to find the unnormalized marginals $\tilde{p}(x_i)$

      • the coefficient 1/Z can be obtained by normalizing any one of these
        marginals

      • efficient, as the normalization is calculated over only a single
        variable

        Figure 8.51   A simple factor graph used to illustrate the sum-product
                      algorithm.

        The graph in Figure 8.51 has an unnormalized joint distribution given by

            $\tilde{p}(\mathbf{x}) = f_a(x_1, x_2) f_b(x_2, x_3) f_c(x_2, x_4)$        (8.73)

      • F 8.73: unnormalized joint distribution

    In order to apply the sum-product algorithm to this graph, let us designate
    node $x_3$ as the root, in which case there are two leaf nodes $x_1$ and
    $x_4$. Starting with the leaf nodes, we then have the following sequence of
    six messages

        $\mu_{x_1 \to f_a}(x_1) = 1$                                                  (8.74)
        $\mu_{f_a \to x_2}(x_2) = \sum_{x_1} f_a(x_1, x_2)$                           (8.75)
        $\mu_{x_4 \to f_c}(x_4) = 1$                                                  (8.76)
        $\mu_{f_c \to x_2}(x_2) = \sum_{x_4} f_c(x_2, x_4)$                           (8.77)
        $\mu_{x_2 \to f_b}(x_2) = \mu_{f_a \to x_2}(x_2) \mu_{f_c \to x_2}(x_2)$      (8.78)
        $\mu_{f_b \to x_3}(x_3) = \sum_{x_2} f_b(x_2, x_3) \mu_{x_2 \to f_b}(x_2)$    (8.79)

    The direction of flow of these messages is illustrated in Figure 8.52. Once
    this message propagation is complete, we can then propagate messages from
    the root node out to the leaf nodes, and these are given by

        $\mu_{x_3 \to f_b}(x_3) = 1$                                                  (8.80)
        $\mu_{f_b \to x_2}(x_2) = \sum_{x_3} f_b(x_2, x_3)$                           (8.81)
        $\mu_{x_2 \to f_a}(x_2) = \mu_{f_b \to x_2}(x_2) \mu_{f_c \to x_2}(x_2)$      (8.82)
        $\mu_{f_a \to x_1}(x_1) = \sum_{x_2} f_a(x_1, x_2) \mu_{x_2 \to f_a}(x_2)$    (8.83)
        $\mu_{x_2 \to f_c}(x_2) = \mu_{f_a \to x_2}(x_2) \mu_{f_b \to x_2}(x_2)$      (8.84)
        $\mu_{f_c \to x_4}(x_4) = \sum_{x_2} f_c(x_2, x_4) \mu_{x_2 \to f_c}(x_2)$    (8.85)

    Figure 8.52   Flow of messages for the sum-product algorithm applied to the
                  example graph in Figure 8.51. (a) From the leaf nodes $x_1$
                  and $x_4$ towards the root node $x_3$. (b) From the root node
                  towards the leaf nodes.
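The twelve messages (8.74) to (8.85) translate almost line for line into code. The sketch below assumes binary variables and uses made-up factor tables; only the graph structure comes from Figure 8.51.

```python
# Hypothetical binary factor tables, indexed as fa[x1][x2], fb[x2][x3], fc[x2][x4].
fa = [[1.0, 2.0], [3.0, 4.0]]
fb = [[1.0, 0.5], [2.0, 1.0]]
fc = [[2.0, 1.0], [1.0, 3.0]]
S = range(2)  # states of every variable

# Leaves towards the root x3, (8.74) to (8.79).
mu_x1_fa = [1.0 for x1 in S]
mu_fa_x2 = [sum(fa[x1][x2] * mu_x1_fa[x1] for x1 in S) for x2 in S]
mu_x4_fc = [1.0 for x4 in S]
mu_fc_x2 = [sum(fc[x2][x4] * mu_x4_fc[x4] for x4 in S) for x2 in S]
mu_x2_fb = [mu_fa_x2[x2] * mu_fc_x2[x2] for x2 in S]
mu_fb_x3 = [sum(fb[x2][x3] * mu_x2_fb[x2] for x2 in S) for x3 in S]

# Root back out to the leaves, (8.80) to (8.85).
mu_x3_fb = [1.0 for x3 in S]
mu_fb_x2 = [sum(fb[x2][x3] * mu_x3_fb[x3] for x3 in S) for x2 in S]
mu_x2_fa = [mu_fb_x2[x2] * mu_fc_x2[x2] for x2 in S]
mu_fa_x1 = [sum(fa[x1][x2] * mu_x2_fa[x2] for x2 in S) for x1 in S]
mu_x2_fc = [mu_fa_x2[x2] * mu_fb_x2[x2] for x2 in S]
mu_fc_x4 = [sum(fc[x2][x4] * mu_x2_fc[x2] for x2 in S) for x4 in S]
```

With these tables, `mu_fb_x3`, `mu_fa_x1` and `mu_fc_x4` are the unnormalized marginals of the three leaf-side variables $x_3$, $x_1$ and $x_4$, since each has a single neighbouring factor, and all three sum to the same constant Z.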


     •    Marginal p(x2) can be calculated

          One message has now passed in each direction across each link, and we
          can now evaluate the marginals. As a simple check, let us verify that
          the marginal $p(x_2)$ is given by the correct expression. Using (8.63)
          and substituting for the messages using the above results, we have

              $\tilde{p}(x_2) = \mu_{f_a \to x_2}(x_2) \mu_{f_b \to x_2}(x_2) \mu_{f_c \to x_2}(x_2)$

              $\phantom{\tilde{p}(x_2)} = \Big[ \sum_{x_1} f_a(x_1, x_2) \Big] \Big[ \sum_{x_3} f_b(x_2, x_3) \Big] \Big[ \sum_{x_4} f_c(x_2, x_4) \Big]$

              $\phantom{\tilde{p}(x_2)} = \sum_{x_1} \sum_{x_3} \sum_{x_4} f_a(x_1, x_2) f_b(x_2, x_3) f_c(x_2, x_4)$

              $\phantom{\tilde{p}(x_2)} = \sum_{x_1} \sum_{x_3} \sum_{x_4} \tilde{p}(\mathbf{x})$            (8.86)

          as required.

          So far, we have assumed that all of the variables in the graph are
          hidden. In most practical applications, a subset of the variables will
          be observed, and we wish to calculate posterior distributions
          conditioned on these observations. Observed nodes are easily handled
          within the sum-product algorithm as follows. Suppose we partition
          $\mathbf{x}$ into hidden variables $\mathbf{h}$ and observed variables
          $\mathbf{v}$, and that the observed value of $\mathbf{v}$ is denoted
          $\hat{\mathbf{v}}$. Then we simply multiply the joint distribution
          $p(\mathbf{x})$ by $\prod_i I(v_i, \hat{v}_i)$, where
          $I(v, \hat{v}) = 1$ if $v = \hat{v}$ and $I(v, \hat{v}) = 0$
          otherwise. This product corresponds to
          $p(\mathbf{h}, \mathbf{v} = \hat{\mathbf{v}})$ and hence is an
          unnormalized version of $p(\mathbf{h} | \mathbf{v} = \hat{\mathbf{v}})$.
          By running the sum-product algorithm, we can efficiently calculate the
          posterior marginals $p(h_i | \mathbf{v} = \hat{\mathbf{v}})$ up to a
          normalization coefficient whose value can be found efficiently.

     •    references: @n_shuyo @sleepy_yoshi @nokuno
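Clamping an observed variable amounts to multiplying one factor by the indicator $I(v, \hat{v})$ before passing messages. The sketch below reuses the graph of Figure 8.51 with made-up binary tables and assumes, purely for illustration, that $x_3$ is observed in state 1; the names `indicator`, `fb_clamped` and `observed_x3` are hypothetical.

```python
# Hypothetical binary factor tables, indexed as fa[x1][x2], fb[x2][x3], fc[x2][x4].
fa = [[1.0, 2.0], [3.0, 4.0]]
fb = [[1.0, 0.5], [2.0, 1.0]]
fc = [[2.0, 1.0], [1.0, 3.0]]
S = range(2)
observed_x3 = 1  # assumed observation, chosen only for illustration

def indicator(v, v_hat):
    """I(v, v_hat): 1 if the state matches the observed value, else 0."""
    return 1.0 if v == v_hat else 0.0

# Absorb the indicator into the factor that touches the observed variable.
fb_clamped = [[fb[x2][x3] * indicator(x3, observed_x3) for x3 in S] for x2 in S]

# Messages into the hidden variable x2, as in (8.75), (8.77) and (8.81).
mu_fa_x2 = [sum(fa[x1][x2] for x1 in S) for x2 in S]
mu_fc_x2 = [sum(fc[x2][x4] for x4 in S) for x2 in S]
mu_fb_x2 = [sum(fb_clamped[x2][x3] for x3 in S) for x2 in S]

# Unnormalized p(x2, x3 = observed), then normalize over x2 alone.
joint = [mu_fa_x2[x2] * mu_fb_x2[x2] * mu_fc_x2[x2] for x2 in S]
z = sum(joint)
posterior_x2 = [value / z for value in joint]
```

The normalizer `z` is obtained from a sum over the two states of $x_2$ only, which is the local computation the text refers to.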

On Clustering Financial Time Series - Beyond CorrelationGautier Marti
 
Boolean Matching in Logic Synthesis
Boolean Matching in Logic SynthesisBoolean Matching in Logic Synthesis
Boolean Matching in Logic SynthesisIffat Anjum
 
Numarical values
Numarical valuesNumarical values
Numarical valuesAmanSaeed11
 
Numarical values highlighted
Numarical values highlightedNumarical values highlighted
Numarical values highlightedAmanSaeed11
 
R command cheatsheet.pdf
R command cheatsheet.pdfR command cheatsheet.pdf
R command cheatsheet.pdfNgcnh947953
 
Short Reference Card for R users.
Short Reference Card for R users.Short Reference Card for R users.
Short Reference Card for R users.Dr. Volkan OBAN
 
Random variable, distributive function lect3a.ppt
Random variable, distributive function lect3a.pptRandom variable, distributive function lect3a.ppt
Random variable, distributive function lect3a.pptsadafshahbaz7777
 

Similar a The Sum-Product Algorithm for Exact Inference on Tree Graphs (20)

02-Random Variables.ppt
02-Random Variables.ppt02-Random Variables.ppt
02-Random Variables.ppt
 
(DL hacks輪読) Deep Kernel Learning
(DL hacks輪読) Deep Kernel Learning(DL hacks輪読) Deep Kernel Learning
(DL hacks輪読) Deep Kernel Learning
 
this materials is useful for the students who studying masters level in elect...
this materials is useful for the students who studying masters level in elect...this materials is useful for the students who studying masters level in elect...
this materials is useful for the students who studying masters level in elect...
 
Module 2 lesson 4 notes
Module 2 lesson 4 notesModule 2 lesson 4 notes
Module 2 lesson 4 notes
 
Matlab cheatsheet
Matlab cheatsheetMatlab cheatsheet
Matlab cheatsheet
 
Moment-Generating Functions and Reproductive Properties of Distributions
Moment-Generating Functions and Reproductive Properties of DistributionsMoment-Generating Functions and Reproductive Properties of Distributions
Moment-Generating Functions and Reproductive Properties of Distributions
 
IJSRED-V2I5P56
IJSRED-V2I5P56IJSRED-V2I5P56
IJSRED-V2I5P56
 
Reformulation of Nash Equilibrium with an Application to Interchangeability
Reformulation of Nash Equilibrium with an Application to InterchangeabilityReformulation of Nash Equilibrium with an Application to Interchangeability
Reformulation of Nash Equilibrium with an Application to Interchangeability
 
On Clustering Financial Time Series - Beyond Correlation
On Clustering Financial Time Series - Beyond CorrelationOn Clustering Financial Time Series - Beyond Correlation
On Clustering Financial Time Series - Beyond Correlation
 
Boolean Matching in Logic Synthesis
Boolean Matching in Logic SynthesisBoolean Matching in Logic Synthesis
Boolean Matching in Logic Synthesis
 
Numarical values
Numarical valuesNumarical values
Numarical values
 
Numarical values highlighted
Numarical values highlightedNumarical values highlighted
Numarical values highlighted
 
R command cheatsheet.pdf
R command cheatsheet.pdfR command cheatsheet.pdf
R command cheatsheet.pdf
 
@ R reference
@ R reference@ R reference
@ R reference
 
Reference card for R
Reference card for RReference card for R
Reference card for R
 
Short Reference Card for R users.
Short Reference Card for R users.Short Reference Card for R users.
Short Reference Card for R users.
 
1807.02591v3.pdf
1807.02591v3.pdf1807.02591v3.pdf
1807.02591v3.pdf
 
Introduction to R
Introduction to RIntroduction to R
Introduction to R
 
Random variable, distributive function lect3a.ppt
Random variable, distributive function lect3a.pptRandom variable, distributive function lect3a.ppt
Random variable, distributive function lect3a.ppt
 
Statistics lab 1
Statistics lab 1Statistics lab 1
Statistics lab 1
 

Último

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 

Último (20)

The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 

The Sum-Product Algorithm for Exact Inference on Tree Graphs

  • 1. The Sum-Product Algorithm PRML 8.4.4 Xuebin Ma
  • 2. Factor graph • undirected tree, directed tree, polytree (F8.43) • Goal: • Obtain an efficient, exact inference algorithm for finding marginals • Compute efficiently where several marginals are required
  • 3. Calculate the marginal for a particular variable node $x$ • Joint distribution in the form of a product of factors
      All variables are assumed hidden for now. Later we shall see how to modify the algorithm to incorporate evidence corresponding to observed variables. By definition, the marginal is obtained by summing the joint distribution over all variables except $x$, so that
      $p(x) = \sum_{\mathbf{x} \setminus x} p(\mathbf{x})$  (8.61)
      where $\mathbf{x} \setminus x$ denotes the set of variables in $\mathbf{x}$ with variable $x$ omitted. The idea is to substitute for $p(\mathbf{x})$ using the factor graph expression (8.59) and then interchange summations and products in order to obtain an efficient algorithm.
      Consider the fragment of graph shown in Figure 8.46, in which we see that the tree structure of the graph allows us to partition the factors in the joint distribution into groups, with one group associated with each of the factor nodes that is a neighbour of the variable node $x$. We see that the joint distribution can be written as a product of the form
      $p(\mathbf{x}) = \prod_{s \in \mathrm{ne}(x)} F_s(x, X_s)$  (8.62)
      where $\mathrm{ne}(x)$ denotes the set of factor nodes that are neighbours of $x$, $X_s$ denotes the set of all variables in the subtree connected to the variable node $x$ via the factor node $f_s$, and $F_s(x, X_s)$ represents the product of all the factors in the group associated with factor $f_s$.
      Figure 8.46: A fragment of a factor graph illustrating the evaluation of the marginal $p(x)$.
  • 4. Interchange sum and product (F 8.61, F 8.62 -> F 8.63) • F 8.64: message from factor node $f_s$ to variable node $x$ • $p(x)$ is the product of all incoming messages arriving at node $x$
      Substituting (8.62) into (8.61) and interchanging the sums and products, we obtain
      $p(x) = \prod_{s \in \mathrm{ne}(x)} \Bigl[ \sum_{X_s} F_s(x, X_s) \Bigr] = \prod_{s \in \mathrm{ne}(x)} \mu_{f_s \to x}(x)$  (8.63)
      Here we have introduced a set of functions $\mu_{f_s \to x}(x)$, defined by
      $\mu_{f_s \to x}(x) \equiv \sum_{X_s} F_s(x, X_s)$  (8.64)
      which can be viewed as messages from the factor nodes $f_s$ to the variable node $x$. We see that the required marginal $p(x)$ is given by the product of all the incoming messages arriving at node $x$.
      In order to evaluate these messages, we again turn to Figure 8.46 and note that each factor $F_s(x, X_s)$ is described by a factor (sub-)graph and so can itself be factorized. In particular, we can write
      $F_s(x, X_s) = f_s(x, x_1, \ldots, x_M)\, G_1(x_1, X_{s1}) \cdots G_M(x_M, X_{sM})$  (8.65)
      where, for convenience, we have denoted the variables associated with factor $f_s$, in addition to $x$, by $x_1, \ldots, x_M$.
  • 5. Evaluate these messages
      The factorization (8.65) is illustrated in Figure 8.47. Note that the set of variables $\{x, x_1, \ldots, x_M\}$ is the set of variables on which the factor $f_s$ depends, and so it can also be denoted $\mathbf{x}_s$, using the notation of (8.59). Substituting (8.65) into (8.64) we obtain
      $\mu_{f_s \to x}(x) = \sum_{x_1} \cdots \sum_{x_M} f_s(x, x_1, \ldots, x_M) \prod_{m \in \mathrm{ne}(f_s) \setminus x} \Bigl[ \sum_{X_{sm}} G_m(x_m, X_{sm}) \Bigr]$
      $\phantom{\mu_{f_s \to x}(x)} = \sum_{x_1} \cdots \sum_{x_M} f_s(x, x_1, \ldots, x_M) \prod_{m \in \mathrm{ne}(f_s) \setminus x} \mu_{x_m \to f_s}(x_m)$  (8.66)
      where $\mathrm{ne}(f_s)$ denotes the set of variable nodes that are neighbours of the factor node $f_s$, and $\mathrm{ne}(f_s) \setminus x$ denotes the same set but with node $x$ removed. Here we have defined the following messages from variable nodes to factor nodes (F 8.67):
      $\mu_{x_m \to f_s}(x_m) \equiv \sum_{X_{sm}} G_m(x_m, X_{sm})$  (8.67)
      We have therefore introduced two distinct kinds of message: those that go from factor nodes to variable nodes, denoted $\mu_{f \to x}(x)$, and those that go from variable nodes to factor nodes, denoted $\mu_{x \to f}(x)$. In each case, messages passed along a link are always a function of the variable associated with the variable node that the link connects to.
      Figure 8.47: Illustration of the factorization of the subgraph associated with factor node $f_s$.
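For discrete variables, the factor-to-variable message (8.66) reduces to a weighted sum over a table. The following is a minimal sketch under that assumption; the function name `factor_to_variable`, the axis-indexed dictionary of incoming messages, and the example numbers are illustrative choices, not part of the slides.

```python
import numpy as np

def factor_to_variable(factor, incoming, target_axis):
    """Sketch of (8.66) for a discrete factor stored as an ndarray.

    factor      : table f_s with one axis per neighbouring variable
    incoming    : dict {axis: message} holding the variable-to-factor
                  messages for every axis except target_axis
    target_axis : axis of the variable x that receives the message
    """
    result = np.asarray(factor, dtype=float)
    for axis, msg in incoming.items():
        # Broadcast the 1-D message along its own axis and multiply it in.
        shape = [1] * result.ndim
        shape[axis] = -1
        result = result * np.asarray(msg, dtype=float).reshape(shape)
    # Marginalize over every variable except the target.
    other_axes = tuple(a for a in range(result.ndim) if a != target_axis)
    return result.sum(axis=other_axes)

# Hypothetical two-variable factor f_s(x, x_1) and one incoming message.
f_s = np.array([[1.0, 2.0],
                [3.0, 4.0]])
mu_x1_to_fs = np.array([0.5, 2.0])
message = factor_to_variable(f_s, {1: mu_x1_to_fs}, target_axis=0)
```

Here `message[x]` equals $\sum_{x_1} f_s(x, x_1)\, \mu_{x_1 \to f_s}(x_1)$, which is (8.66) with $M = 1$.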
  • 6. Evaluate messages from variable nodes to factor nodes using the sub-graph factorization
      The result (8.66) says that to evaluate the message sent by a factor node to a variable node along the link connecting them, take the product of the incoming messages along all other links coming into the factor node, multiply by the factor associated with that node, and then marginalize over all of the variables associated with the incoming messages. This is illustrated in Figure 8.47. It is important to note that a factor node can send a message to a variable node once it has received incoming messages from all other neighbouring variable nodes.
      Finally, we derive an expression for evaluating the messages from variable nodes to factor nodes, again by making use of the (sub-)graph factorization. From Figure 8.48, we see that the term $G_m(x_m, X_{sm})$ associated with node $x_m$ is given by a product of terms $F_l(x_m, X_{ml})$, each associated with one of the factor nodes $f_l$ that is linked to node $x_m$ (excluding node $f_s$), so that
      $G_m(x_m, X_{sm}) = \prod_{l \in \mathrm{ne}(x_m) \setminus f_s} F_l(x_m, X_{ml})$  (8.68)
      where the product is taken over all neighbours of node $x_m$ except for node $f_s$. Note that each of the factors $F_l(x_m, X_{ml})$ represents a subtree of the original graph of precisely the same kind as introduced in (8.62). Substituting (8.68) into (8.67), we then obtain (F 8.67 + F 8.68 -> F 8.69)
      $\mu_{x_m \to f_s}(x_m) = \prod_{l \in \mathrm{ne}(x_m) \setminus f_s} \Bigl[ \sum_{X_{ml}} F_l(x_m, X_{ml}) \Bigr] = \prod_{l \in \mathrm{ne}(x_m) \setminus f_s} \mu_{f_l \to x_m}(x_m)$  (8.69)
      where we have used the definition (8.64) of the messages passed from factor nodes to variable nodes.
      Figure 8.48: Illustration of the evaluation of the message sent by a variable node to an adjacent factor node.
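The variable-to-factor message (8.69) involves no summation at all: it is an elementwise product of the messages arriving from the other neighbouring factors. A small sketch for discrete variables follows; `variable_to_factor` and the example vectors are hypothetical names and values chosen for illustration.

```python
import numpy as np

def variable_to_factor(incoming, num_states):
    """Sketch of (8.69): product of the factor-to-variable messages that
    arrive at x_m from all neighbouring factors except the recipient f_s."""
    message = np.ones(num_states)
    for msg in incoming:
        message = message * np.asarray(msg, dtype=float)
    return message

# Two hypothetical incoming messages for a binary variable.
mu = variable_to_factor([np.array([0.2, 0.8]), np.array([3.0, 1.0])],
                        num_states=2)

# With no other neighbours the empty product is a vector of ones.
leaf = variable_to_factor([], num_states=2)
```

The empty-product case corresponds to a variable node whose only neighbour is the recipient factor, so it sends a message of all ones.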
  • 7. Messages sent by leaf nodes (variable node, factor node) • Find marginals for every variable node (introduced by John-san) • Sum-product algorithm
      If a leaf node is a variable node, the message that it sends along its one and only link is $\mu_{x \to f}(x) = 1$ (8.70). If the leaf node is a factor node, we see from (8.66) that the message sent should take the form $\mu_{f \to x}(x) = f(x)$ (8.71).
      Figure 8.49: The sum-product algorithm begins with messages sent by the leaf nodes, which depend on whether the leaf node is (a) a variable node, or (b) a factor node.
      Figure 8.50: The sum-product algorithm can be viewed purely in terms of messages sent out by factor nodes to other factor nodes. In this example, the outgoing message shown by the blue arrow is obtained by taking the product of all the incoming messages shown by green arrows, multiplying by the factor $f_s$, and marginalizing over the variables $x_1$ and $x_2$.
      ... and indeed the notion of one node having a special status was introduced only as a ...
  • 8. Normalization (undirected graph) • to get the normalization coefficient $1/Z$, where $p(\mathbf{x}) = \tilde{p}(\mathbf{x})/Z$ • use sum-product to find the unnormalized marginals $\tilde{p}(x_i)$ • the coefficient $1/Z$ can be obtained by normalizing any one of these marginals • efficient, as the normalization is calculated over only a single variable
      Figure 8.51: A simple factor graph used to illustrate the sum-product algorithm (variable nodes $x_1, x_2, x_3, x_4$; factor nodes $f_a, f_b, f_c$).
      The unnormalized joint distribution of this graph is given by (F 8.73)
      $\tilde{p}(\mathbf{x}) = f_a(x_1, x_2)\, f_b(x_2, x_3)\, f_c(x_2, x_4)$  (8.73)
      In order to apply the sum-product algorithm to this graph, let us designate node $x_3$ as the root, in which case there are two leaf nodes $x_1$ and $x_4$.
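The normalization step can be checked numerically on the graph of (8.73). The sketch below assumes discrete variables and random non-negative factor tables (the table sizes and the seed are arbitrary). For brevity it obtains the unnormalized marginal of $x_2$ by brute-force summation rather than by message passing, and then shows that summing this single marginal already yields $Z$.

```python
import numpy as np

rng = np.random.default_rng(0)
fa = rng.random((2, 3))  # f_a(x1, x2), hypothetical table
fb = rng.random((3, 2))  # f_b(x2, x3), hypothetical table
fc = rng.random((3, 2))  # f_c(x2, x4), hypothetical table

# Unnormalized joint (8.73) with axes ordered (x1, x2, x3, x4).
joint = np.einsum('ab,bc,bd->abcd', fa, fb, fc)

# Unnormalized marginal of x2, then Z from that one marginal alone.
p2_tilde = joint.sum(axis=(0, 2, 3))
Z = p2_tilde.sum()
p2 = p2_tilde / Z
```

`Z` agrees with the sum over the full joint table, but computing it from `p2_tilde` only requires a sum over the states of one variable.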
  • 9. Sum-product on the example graph
      $\tilde{p}(\mathbf{x}) = f_a(x_1, x_2)\, f_b(x_2, x_3)\, f_c(x_2, x_4)$  (8.73)
      In order to apply the sum-product algorithm to this graph, let us designate node $x_3$ as the root, in which case there are two leaf nodes $x_1$ and $x_4$. Starting with the leaf nodes, we then have the following sequence of six messages:
      $\mu_{x_1 \to f_a}(x_1) = 1$  (8.74)
      $\mu_{f_a \to x_2}(x_2) = \sum_{x_1} f_a(x_1, x_2)$  (8.75)
      $\mu_{x_4 \to f_c}(x_4) = 1$  (8.76)
      $\mu_{f_c \to x_2}(x_2) = \sum_{x_4} f_c(x_2, x_4)$  (8.77)
      $\mu_{x_2 \to f_b}(x_2) = \mu_{f_a \to x_2}(x_2)\, \mu_{f_c \to x_2}(x_2)$  (8.78)
      $\mu_{f_b \to x_3}(x_3) = \sum_{x_2} f_b(x_2, x_3)\, \mu_{x_2 \to f_b}(x_2)$  (8.79)
      The direction of flow of these messages is illustrated in Figure 8.52. Once this message propagation is complete, we can then propagate messages from the root node out to the leaf nodes, and these are given by
      $\mu_{x_3 \to f_b}(x_3) = 1$  (8.80)
      $\mu_{f_b \to x_2}(x_2) = \sum_{x_3} f_b(x_2, x_3)$  (8.81)
      $\mu_{x_2 \to f_a}(x_2) = \mu_{f_b \to x_2}(x_2)\, \mu_{f_c \to x_2}(x_2)$  (8.82)
      $\mu_{f_a \to x_1}(x_1) = \sum_{x_2} f_a(x_1, x_2)\, \mu_{x_2 \to f_a}(x_2)$  (8.83)
      $\mu_{x_2 \to f_c}(x_2) = \mu_{f_a \to x_2}(x_2)\, \mu_{f_b \to x_2}(x_2)$  (8.84)
      $\mu_{f_c \to x_4}(x_4) = \sum_{x_2} f_c(x_2, x_4)\, \mu_{x_2 \to f_c}(x_2)$  (8.85)
      One message has now passed in each direction across each link, and we can now evaluate the marginals.
      Figure 8.52: Flow of messages for the sum-product algorithm applied to the example graph in Figure 8.51. (a) From the leaf nodes $x_1$ and $x_4$ towards the root node $x_3$. (b) From the root node towards the leaf nodes.
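The twelve messages (8.74) to (8.85) can be written out directly for discrete variables. This is a sketch under stated assumptions: the factors are random non-negative NumPy tables of arbitrary size, and the variable names such as `mu_fa_x2` are ad hoc labels for $\mu_{f_a \to x_2}$. Each sum over a variable becomes a matrix-vector product.

```python
import numpy as np

rng = np.random.default_rng(0)
fa = rng.random((2, 3))  # f_a(x1, x2), hypothetical table
fb = rng.random((3, 2))  # f_b(x2, x3), hypothetical table
fc = rng.random((3, 2))  # f_c(x2, x4), hypothetical table

# Leaves towards the root x3: (8.74) to (8.79).
mu_x1_fa = np.ones(2)              # (8.74)
mu_fa_x2 = fa.T @ mu_x1_fa         # (8.75) sum over x1
mu_x4_fc = np.ones(2)              # (8.76)
mu_fc_x2 = fc @ mu_x4_fc           # (8.77) sum over x4
mu_x2_fb = mu_fa_x2 * mu_fc_x2     # (8.78)
mu_fb_x3 = fb.T @ mu_x2_fb         # (8.79) sum over x2

# Root back out to the leaves: (8.80) to (8.85).
mu_x3_fb = np.ones(2)              # (8.80)
mu_fb_x2 = fb @ mu_x3_fb           # (8.81) sum over x3
mu_x2_fa = mu_fb_x2 * mu_fc_x2     # (8.82)
mu_fa_x1 = fa @ mu_x2_fa           # (8.83) sum over x2
mu_x2_fc = mu_fa_x2 * mu_fb_x2     # (8.84)
mu_fc_x4 = fc.T @ mu_x2_fc         # (8.85) sum over x2

# Unnormalized marginals via (8.63): product of incoming messages.
p1 = mu_fa_x1
p2 = mu_fa_x2 * mu_fb_x2 * mu_fc_x2
p3 = mu_fb_x3
p4 = mu_fc_x4

# Brute-force joint with axes (x1, x2, x3, x4) for comparison.
joint = np.einsum('ab,bc,bd->abcd', fa, fb, fc)
```

Comparing `p1` to `p4` against sums over `joint` confirms that two sweeps of local messages reproduce every single-variable marginal of (8.73).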
  • 10. The marginal $\tilde{p}(x_2)$ can be calculated • References: @n_shuyo @sleepy_yoshi @nokuno
      One message has now passed in each direction across each link, and we can now evaluate the marginals. As a simple check, let us verify that the marginal $\tilde{p}(x_2)$ is given by the correct expression. Using (8.63) and substituting for the messages using the above results, we have
      $\tilde{p}(x_2) = \mu_{f_a \to x_2}(x_2)\, \mu_{f_b \to x_2}(x_2)\, \mu_{f_c \to x_2}(x_2)$
      $\phantom{\tilde{p}(x_2)} = \Bigl[ \sum_{x_1} f_a(x_1, x_2) \Bigr] \Bigl[ \sum_{x_3} f_b(x_2, x_3) \Bigr] \Bigl[ \sum_{x_4} f_c(x_2, x_4) \Bigr]$
      $\phantom{\tilde{p}(x_2)} = \sum_{x_1} \sum_{x_3} \sum_{x_4} f_a(x_1, x_2)\, f_b(x_2, x_3)\, f_c(x_2, x_4)$
      $\phantom{\tilde{p}(x_2)} = \sum_{x_1} \sum_{x_3} \sum_{x_4} \tilde{p}(\mathbf{x})$  (8.86)
      as required.
      So far, we have assumed that all of the variables in the graph are hidden. In most practical applications, a subset of the variables will be observed, and we wish to calculate posterior distributions conditioned on these observations. Observed nodes are easily handled within the sum-product algorithm as follows. Suppose we partition $\mathbf{x}$ into hidden variables $\mathbf{h}$ and observed variables $\mathbf{v}$, and that the observed value of $\mathbf{v}$ is denoted $\hat{\mathbf{v}}$. Then we simply multiply the joint distribution $p(\mathbf{x})$ by $\prod_i I(v_i, \hat{v}_i)$, where $I(v, \hat{v}) = 1$ if $v = \hat{v}$ and $I(v, \hat{v}) = 0$ otherwise. This product corresponds to $p(\mathbf{h}, \mathbf{v} = \hat{\mathbf{v}})$ and hence is an unnormalized version of $p(\mathbf{h} \mid \mathbf{v} = \hat{\mathbf{v}})$. By running the sum-product algorithm, we can efficiently calculate the posterior marginals $p(h_i \mid \mathbf{v} = \hat{\mathbf{v}})$ up to a normalization coefficient whose value can be found efficiently.
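The indicator construction for observed variables can be illustrated on the same example graph. The sketch assumes discrete variables, random factor tables, and a hypothetical observation $x_4 = 1$. Multiplying the joint by $I(x_4, \hat{x}_4)$ is implemented here by using the indicator vector as the message sent from $x_4$ to $f_c$, which is one convenient way to absorb the extra factor and not the only one.

```python
import numpy as np

rng = np.random.default_rng(1)
fa = rng.random((2, 3))  # f_a(x1, x2), hypothetical table
fb = rng.random((3, 2))  # f_b(x2, x3), hypothetical table
fc = rng.random((3, 2))  # f_c(x2, x4), hypothetical table

observed_x4 = 1                    # hypothetical observed value of x4
indicator = np.zeros(2)
indicator[observed_x4] = 1.0       # I(x4, observed value) as a vector

# Messages into x2; only the one passing through x4 changes.
mu_fa_x2 = fa.sum(axis=0)          # sum over x1
mu_fb_x2 = fb.sum(axis=1)          # sum over x3
mu_fc_x2 = fc @ indicator          # sum over x4 of f_c * indicator

unnormalized = mu_fa_x2 * mu_fb_x2 * mu_fc_x2
posterior_x2 = unnormalized / unnormalized.sum()

# Brute-force check: slice the joint at the observed value, then normalize.
joint = np.einsum('ab,bc,bd->abcd', fa, fb, fc)
expected = joint[:, :, :, observed_x4].sum(axis=(0, 2))
expected = expected / expected.sum()
```

`posterior_x2` is the posterior marginal of $x_2$ given the observation, and the final division is the local normalization over the states of a single variable.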