Prediction Problem                                            G: Generality     G: Optimality




                      Part 6: Structured Prediction and Energy
                                 Minimization (1/2)

                                 Sebastian Nowozin and Christoph H. Lampert



                                             Colorado Springs, 25th June 2011




Sebastian Nowozin and Christoph H. Lampert
Part 6: Structured Prediction and Energy Minimization (1/2)
Prediction Problem

                 $y^* = f(x) = \operatorname{argmax}_{y \in \mathcal{Y}} g(x, y)$

                 $g(x, y) = p(y \mid x)$, factor graphs/MRF/CRF,
                 $g(x, y) = -E(y; x, w)$, factor graphs/MRF/CRF,
                 $g(x, y) = \langle w, \psi(x, y) \rangle$, linear model (e.g. multiclass SVM),

        → difficulty: $\mathcal{Y}$ finite but very large




Prediction Problem (cont)

        Definition (Optimization Problem)
        Given $(g, \mathcal{Y}, \mathcal{G}, x)$, with feasible set $\mathcal{Y} \subseteq \mathcal{G}$ over decision domain $\mathcal{G}$, and
        given an input instance $x \in \mathcal{X}$ and an objective function $g : \mathcal{X} \times \mathcal{G} \to \mathbb{R}$,
        find the optimal value

                 $\alpha = \sup_{y \in \mathcal{Y}} g(x, y)$,

        and, if the supremum exists, find an optimal solution $y^* \in \mathcal{Y}$ such that
        $g(x, y^*) = \alpha$.
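
For a finite feasible set, the definition can be read directly as exhaustive search. The following is a minimal sketch under that assumption; the function name `solve_by_enumeration` and the toy instance are hypothetical and only illustrate the roles of the decision domain, the feasible set, and the objective.

```python
from itertools import product

def solve_by_enumeration(g, x, feasible_set):
    """Return (alpha, y_star) by exhaustive search over a finite feasible set.

    For an empty (infeasible) set this returns (-inf, None).
    """
    best_value, best_y = float("-inf"), None
    for y in feasible_set:
        value = g(x, y)
        if value > best_value:
            best_value, best_y = value, y
    return best_value, best_y

# Hypothetical toy instance: decision domain {0,1}^3,
# feasible set = vectors with at most two ones.
decision_domain = list(product((0, 1), repeat=3))
feasible_set = [y for y in decision_domain if sum(y) <= 2]

def toy_objective(x, y):
    """Linear objective <x, y>, chosen only for illustration."""
    return sum(xi * yi for xi, yi in zip(x, y))

alpha, y_star = solve_by_enumeration(toy_objective, (3.0, -1.0, 2.0), feasible_set)
```

The loop visits every element of the feasible set, which is exactly what becomes impractical when the set is finite but very large.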




The feasible set

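        Ingredients
                 Decision domain $\mathcal{G}$,
                 typically simple ($\mathcal{G} = \mathbb{R}^d$, $\mathcal{G} = 2^V$, etc.)
                 Feasible set $\mathcal{Y} \subseteq \mathcal{G}$,
                 defining the problem-specific structure
                 Objective function $g : \mathcal{X} \times \mathcal{G} \to \mathbb{R}$.

        Terminology
                 $\mathcal{Y} = \mathcal{G}$: unconstrained optimization problem,
                 $\mathcal{G}$ finite: discrete optimization problem,
                 $\mathcal{G} = 2^\Sigma$ for ground set $\Sigma$: combinatorial optimization problem,
                 $\mathcal{Y} = \emptyset$: infeasible problem.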


Example: Feasible Sets (cont)
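                 [Figure: chain of three variables $Y_i$, $Y_j$, $Y_k$ in states (+1), (−1), (−1), with unary terms $h_i y_i$, $h_j y_j$, $h_k y_k$ and pairwise terms $J_{ij} y_i y_j$, $J_{jk} y_j y_k$]

                 Ising model with external field
                 Graph $G = (V, E)$
                 “External field”: $h \in \mathbb{R}^V$
                 Interaction matrix: $J \in \mathbb{R}^{V \times V}$
                 Objective, defined on $y_i \in \{-1, 1\}$

                 $g(y) = h_i y_i + h_j y_j + h_k y_k + \frac{1}{2} J_{ij} y_i y_j + \frac{1}{2} J_{jk} y_j y_k$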
        Ising model with external field

                 $\mathcal{Y} = \mathcal{G} = \{-1, +1\}^V$
                 $g(y) = \frac{1}{2} \sum_{(i,j) \in E} J_{i,j} y_i y_j + \sum_{i \in V} h_i y_i$


                 Unconstrained
                 Objective function contains quadratic terms
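
As a small illustration of the unconstrained formulation, the sketch below evaluates the Ising objective and maximizes it by enumerating all of {−1, +1}^V. The three-node chain and the values of `h` and `J` are made up for this example, and enumeration is only feasible for very small graphs.

```python
from itertools import product

def ising_objective(y, h, J, edges):
    """g(y) = 1/2 * sum_{(i,j) in E} J[i][j] * y_i * y_j + sum_i h[i] * y_i."""
    pairwise = 0.5 * sum(J[i][j] * y[i] * y[j] for i, j in edges)
    unary = sum(h_i * y_i for h_i, y_i in zip(h, y))
    return pairwise + unary

def brute_force_argmax(h, J, edges):
    """Enumerate all 2^|V| spin configurations and return a maximizer."""
    states = product((-1, +1), repeat=len(h))
    return max(states, key=lambda y: ising_objective(y, h, J, edges))

# Hypothetical chain i - j - k with illustrative parameters.
h = [1.0, -0.5, 0.25]
J = [[0.0, 2.0, 0.0],
     [2.0, 0.0, 2.0],
     [0.0, 2.0, 0.0]]
edges = [(0, 1), (1, 2)]

y_star = brute_force_argmax(h, J, edges)
```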




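                 $\mathcal{G} = \{0, 1\}^{(V \times \{-1,+1\}) \cup (E \times \{-1,+1\} \times \{-1,+1\})}$,
                 $\mathcal{Y} = \{ y \in \mathcal{G} : \forall i \in V : y_{i,-1} + y_{i,+1} = 1,$
                 $\qquad \forall (i,j) \in E : y_{i,j,+1,+1} + y_{i,j,+1,-1} = y_{i,+1},$
                 $\qquad \forall (i,j) \in E : y_{i,j,-1,+1} + y_{i,j,-1,-1} = y_{i,-1} \}$,
                 $g(y) = \frac{1}{2} \sum_{(i,j) \in E} J_{i,j} (y_{i,j,+1,+1} + y_{i,j,-1,-1})$
                 $\qquad - \frac{1}{2} \sum_{(i,j) \in E} J_{i,j} (y_{i,j,+1,-1} + y_{i,j,-1,+1})$
                 $\qquad + \sum_{i \in V} h_i (y_{i,+1} - y_{i,-1})$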


                 Constrained, more variables
                 Objective function contains linear terms only
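
To see that the constrained, linear formulation describes the same problem, one can map each spin configuration to its 0/1 indicator variables and compare objective values. This is a sketch with hypothetical helper names (`to_indicators`, `linear_objective`, `is_feasible`) and a made-up three-node chain; it checks the correspondence on indicator vectors that come from a spin configuration.

```python
from itertools import product

def ising_objective(y, h, J, edges):
    """Quadratic form: 1/2 * sum J_ij y_i y_j + sum h_i y_i."""
    return (0.5 * sum(J[i][j] * y[i] * y[j] for i, j in edges)
            + sum(h[i] * y[i] for i in range(len(y))))

def to_indicators(y, edges):
    """Map y in {-1,+1}^V to node and edge indicator variables."""
    node = {(i, s): int(y[i] == s) for i in range(len(y)) for s in (-1, +1)}
    edge = {(i, j, s, t): int(y[i] == s and y[j] == t)
            for (i, j) in edges for s in (-1, +1) for t in (-1, +1)}
    return node, edge

def is_feasible(node, edge, n, edges):
    """Check the three families of equality constraints from the slide."""
    return (all(node[i, -1] + node[i, +1] == 1 for i in range(n))
            and all(edge[i, j, +1, +1] + edge[i, j, +1, -1] == node[i, +1] for i, j in edges)
            and all(edge[i, j, -1, +1] + edge[i, j, -1, -1] == node[i, -1] for i, j in edges))

def linear_objective(node, edge, h, J, edges):
    """Objective that is linear in the indicator variables."""
    agree = sum(J[i][j] * (edge[i, j, +1, +1] + edge[i, j, -1, -1]) for i, j in edges)
    differ = sum(J[i][j] * (edge[i, j, +1, -1] + edge[i, j, -1, +1]) for i, j in edges)
    field = sum(h[i] * (node[i, +1] - node[i, -1]) for i in range(len(h)))
    return 0.5 * agree - 0.5 * differ + field

# Hypothetical chain with illustrative parameters.
h = [1.0, -0.5, 0.25]
J = [[0.0, 2.0, 0.0], [2.0, 0.0, 2.0], [0.0, 2.0, 0.0]]
edges = [(0, 1), (1, 2)]

values_match = all(
    linear_objective(*to_indicators(y, edges), h, J, edges) == ising_objective(y, h, J, edges)
    for y in product((-1, +1), repeat=len(h))
)
```

The price of linearity is visible in the dictionaries: two variables per node and four per edge, tied together by equality constraints.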
Evaluating f : what do we want?
                 $f(x) = \operatorname{argmax}_{y \in \mathcal{Y}} g(x, y)$

        For evaluating $f(x)$ we want an algorithm that
           1. is general: applicable to all instances of the problem,
           2. is optimal: provides an optimal $y^*$,
           3. has good worst-case complexity: for all instances the runtime and
              space are acceptably bounded,
           4. is integral: its solutions are restricted to $\mathcal{Y}$,
           5. is deterministic: its results and runtime are reproducible and depend
              on the input data only.




                                             wanting all of them → impossible

Giving up some properties



        [Figure: a hard problem and the five desired properties: generality, optimality, worst-case complexity, integrality, determinism]

        → giving up one or more properties
                 allows us to design algorithms satisfying the remaining properties
                 might be sufficient for the task at hand




G: Generality




Giving up Generality



                 Identify an interesting and tractable subset of instances


                 [Figure: the tractable subset drawn inside the set of all instances]




Example: MAP Inference in Markov Random Fields


        Although NP-hard in general, it is tractable...
                 with low tree-width (Lauritzen, Spiegelhalter, 1988)
                 with binary states, pairwise submodular interactions (Boykov, Jolly, 2001)
                 with binary states, pairwise interactions (only), planar graph structure (Globerson, Jaakkola, 2006)
                 with submodular pairwise interactions (Schlesinger, 2006)
                 with $P^n$-Potts higher order factors (Kohli, Kumar, Torr, 2007)
                 with perfect graph structure (Jebara, 2009)




Binary Graph-Cuts

     Energy function: unary and pairwise

     $E(y; x, w) = \sum_{F \in \mathcal{F}_1} E_F(y_F; x, w_{t_F}) + \sum_{F \in \mathcal{F}_2} E_F(y_F; x, w_{t_F})$

     Restriction 1 (wlog)

     $E_F(y_i; x, w_{t_F}) \geq 0$

     Restriction 2
     (regular/submodular/attractive)

     $E_F(y_i, y_j; x, w_{t_F}) = 0$, if $y_i = y_j$,
     $E_F(y_i, y_j; x, w_{t_F}) = E_F(y_j, y_i; x, w_{t_F}) \geq 0$, otherwise.


Binary Graph-Cuts (cont)

                 Construct auxiliary undirected graph
                 One node {i}, i ∈ V, per variable
                 Two extra nodes: source s, sink t
                 Edges
                      Edge        Graph cut weight
                      {i, j}      $E_F(y_i = 0, y_j = 1; x, w_{t_F})$
                      {i, s}      $E_F(y_i = 1; x, w_{t_F})$
                      {i, t}      $E_F(y_i = 0; x, w_{t_F})$
                 Find linear s-t-mincut
                 Solution defines optimal binary labeling
                 of the original energy minimization
                 problem

                 [Figure: auxiliary graph with variable nodes i, j, k, l, m, n, source s, sink t, and example edges {i, s} and {i, t}]
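
The construction can be sketched in a few lines. The slides do not name a max-flow algorithm, so this example swaps in a plain Edmonds-Karp (breadth-first augmenting paths) on a dense capacity matrix; the function names, the integer weights, and the three-variable instance are assumptions made for illustration. Nodes left on the source side of the cut get label 0, because their edge to t (weight E(y_i = 0)) is the one being cut.

```python
from collections import deque
from itertools import product

def energy(y, unary, pairwise):
    """Unary terms plus a symmetric penalty for each edge whose labels differ."""
    return (sum(unary[i][y[i]] for i in range(len(y)))
            + sum(w for (i, j), w in pairwise.items() if y[i] != y[j]))

def min_cut_labeling(unary, pairwise):
    """unary[i] = (E_i(0), E_i(1)), pairwise[(i, j)] = E_ij(0, 1); all non-negative integers."""
    n = len(unary)
    s, t = n, n + 1
    cap = [[0] * (n + 2) for _ in range(n + 2)]

    def add_undirected(u, v, w):
        cap[u][v] += w
        cap[v][u] += w

    for i, (e0, e1) in enumerate(unary):
        add_undirected(i, s, e1)  # cut if i ends on the sink side (y_i = 1)
        add_undirected(i, t, e0)  # cut if i ends on the source side (y_i = 0)
    for (i, j), w in pairwise.items():
        add_undirected(i, j, w)

    def reachable_from_source():
        """Breadth-first search in the residual graph; returns parent pointers."""
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in range(n + 2):
                if v not in parent and cap[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = reachable_from_source()
        if t not in parent:
            break  # no augmenting path left: the flow is maximal
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(cap[a][b] for a, b in path)
        for a, b in path:
            cap[a][b] -= bottleneck
            cap[b][a] += bottleneck

    # Nodes still reachable from s form the source side of a minimum cut.
    return [0 if i in parent else 1 for i in range(n)]

def brute_force_minimum(unary, pairwise):
    """Reference solution by enumeration, for checking small instances only."""
    return min(product((0, 1), repeat=len(unary)), key=lambda y: energy(y, unary, pairwise))

# Hypothetical three-variable chain.
unary = [(0, 5), (4, 1), (3, 2)]
pairwise = {(0, 1): 2, (1, 2): 1}
labels = min_cut_labeling(unary, pairwise)
```

On small instances the labeling can be compared against `brute_force_minimum`; on realistic grids only the cut-based routine is practical.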


Example: Figure-Ground Segmentation




                                         Input image (http://pdphoto.org)




                                                       Color model log-odds




                                                       Independent decisions




                 $g(x, y, w) = \sum_{i \in V} \log p(y_i \mid x_i) + w \sum_{(i,j) \in E} C(x_i, x_j) I(y_i = y_j)$

                 Gradient strength
                 $C(x_i, x_j) = \exp(\gamma \|x_i - x_j\|^2)$
                 $\gamma$ estimated from mean edge strength (Blake et al, 2004)
                 $w \geq 0$ controls smoothing
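
The objective above can be evaluated directly for a given labeling. The sketch below follows the formula as written on the slide; the pixel features, the log-probability table, and the negative value of `gamma` are assumptions chosen only to make the example concrete (the slide says that γ is estimated from the mean edge strength, but not how).

```python
import math

def contrast(xi, xj, gamma):
    """C(x_i, x_j) = exp(gamma * ||x_i - x_j||^2) for feature tuples xi, xj."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(gamma * squared_distance)

def segmentation_objective(x, y, log_prob, edges, w, gamma):
    """Sum of per-pixel log-probabilities plus w times the pairwise term."""
    unary = sum(log_prob[i][y[i]] for i in range(len(y)))
    pairwise = sum(contrast(x[i], x[j], gamma) for i, j in edges if y[i] == y[j])
    return unary + w * pairwise

# Hypothetical three-pixel "image" with one-dimensional features.
x = [(0.0,), (0.0,), (1.0,)]
log_prob = [[-0.1, -2.3], [-0.7, -0.7], [-2.3, -0.1]]  # log p(y_i | x_i) for y_i in {0, 1}
edges = [(0, 1), (1, 2)]
gamma = -2.0  # assumed value for illustration

no_smoothing = segmentation_objective(x, [0, 0, 1], log_prob, edges, w=0.0, gamma=gamma)
with_smoothing = segmentation_objective(x, [0, 0, 1], log_prob, edges, w=1.0, gamma=gamma)
```

With `w = 0` the pairwise term vanishes and each pixel can be decided independently; larger `w` gives the neighbor term more influence.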
                                                              w = 0



                                                              Small w > 0



                                                              Medium w > 0



                                                              Large w > 0




General Binary Case
                 Is there a larger class of energies for which binary graph cuts are
                 applicable?
                 (Kolmogorov and Zabih, 2004), (Freedman and Drineas, 2005)

        Theorem (Regular Binary Energies)
        Let

                 $E(y; x, w) = \sum_{F \in \mathcal{F}_1} E_F(y_F; x, w_{t_F}) + \sum_{F \in \mathcal{F}_2} E_F(y_F; x, w_{t_F})$

        be an energy function of binary variables containing only unary and pairwise
        factors. The discrete energy minimization problem $\operatorname{argmin}_y E(y; x, w)$ is
        representable as a graph cut problem if and only if all pairwise energy
        functions $E_F$ for $F \in \mathcal{F}_2$ with $F = \{i, j\}$ satisfy

                 $E_{i,j}(0, 0) + E_{i,j}(1, 1) \leq E_{i,j}(0, 1) + E_{i,j}(1, 0)$.
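
The regularity condition is cheap to test before attempting a graph cut. A minimal sketch, assuming each pairwise factor is stored as a 2×2 table indexed by the two binary states (the storage layout and function names are hypothetical):

```python
def is_regular(table):
    """table[a][b] = E_ij(a, b); checks E(0,0) + E(1,1) <= E(0,1) + E(1,0)."""
    return table[0][0] + table[1][1] <= table[0][1] + table[1][0]

def all_regular(pairwise_tables):
    """True if every pairwise factor satisfies the regularity condition."""
    return all(is_regular(table) for table in pairwise_tables.values())

# Illustrative factors: the first rewards agreement, the second rewards disagreement.
attractive = [[0.0, 1.0], [1.0, 0.0]]
repulsive = [[1.0, 0.0], [0.0, 1.0]]

attractive_only = all_regular({(0, 1): attractive})
mixed = all_regular({(0, 1): attractive, (1, 2): repulsive})
```

A single non-regular factor is enough to take the energy outside the class covered by the theorem.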

Example: Class-independent Object Hypotheses
                 (Carreira and Sminchisescu, 2010)
                 PASCAL VOC 2009/2010 segmentation winner
                 Generate class-independent object hypotheses
                 Energy (almost) as before

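                 $g(x, y, w) = \sum_{i \in V} E_i(y_i) + w \sum_{(i,j) \in E} C(x_i, x_j) I(y_i = y_j)$

                 Fixed unaries

                 $E_i(y_i) = \begin{cases} \infty & \text{if } i \in V_{fg} \text{ and } y_i = 0 \\ \infty & \text{if } i \in V_{bg} \text{ and } y_i = 1 \\ 0 & \text{otherwise} \end{cases}$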

                 Test all w ≥ 0 using parametric max-flow
                 (Picard and Queyranne, 1980), (Kolmogorov et al., 2007)
Example: Class-independent Object Hypotheses (cont)




                                         Input image (http://pdphoto.org)



                CPMC proposal segmentations (Carreira and Sminchisescu, 2010)



G: Optimality




Giving up Optimality


        Solving for $y^*$ is hard, but is it necessary?
                 pragmatic motivation: in many applications a close-to-optimal
                 solution is good enough
                 computational motivation: set of “good” solutions might be large
                 and finding just one element can be easy

        For machine learning models
                 modeling error: we always use the wrong model
                 estimation error: preference for $y^*$ might be an artifact




Local Search



                 [Figure: local search in $\mathcal{Y}$: iterates $y^0, y^1, y^2, y^3$ with their neighborhoods $N(y^0), \dots, N(y^3)$, ending at $y^*$ with neighborhood $N(y^*)$]

                 $N_t : \mathcal{Y} \to 2^{\mathcal{Y}}$, neighborhood system
                 Optimization with respect to $N_t(y)$ must be tractable:

                 $y^{t+1} = \operatorname{argmax}_{y \in N_t(y^t)} g(x, y)$
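
The update rule can be sketched as a generic loop. The stopping rule (stop when the best neighbor does not strictly improve), the signature `neighborhood(t, y)`, and the single-flip toy problem are assumptions for this example and are not specified on the slides.

```python
def local_search(g, x, y0, neighborhood, max_iters=1000):
    """Repeatedly move to the best element of N_t(y^t) until no strict improvement."""
    y = y0
    for t in range(max_iters):
        candidate = max(neighborhood(t, y), key=lambda z: g(x, z))
        if g(x, candidate) <= g(x, y):
            return y  # local optimum with respect to the neighborhood system
        y = candidate
    return y

def single_flip_neighborhood(t, y):
    """All binary tuples that differ from y in exactly one position."""
    return [y[:i] + (1 - y[i],) + y[i + 1:] for i in range(len(y))]

def negative_hamming(x, y):
    """Toy objective: higher is better, maximal when y equals x."""
    return -sum(a != b for a, b in zip(x, y))

result = local_search(negative_hamming, (1, 0, 1, 1), (0, 0, 0, 0), single_flip_neighborhood)
```

For this toy objective every local optimum is global; in general the loop only guarantees that no element of the final neighborhood is better.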


Example: Iterated Conditional Modes (ICM)
        Iterated Conditional Modes (ICM), (Besag, 1986)




                 $g(x, y) = \log p(y \mid x)$
                 $y^* = \operatorname{argmax}_{y \in \mathcal{Y}} \log p(y \mid x)$
                 Neighborhoods $N_s(y) = \{(y_1, \dots, y_{s-1}, z_s, y_{s+1}, \dots, y_S) \mid z_s \in \mathcal{Y}_s\}$

                 $y^{t+1} = \operatorname{argmax}_{y_1 \in \mathcal{Y}_1} \log p(y_1, y_2^t, \dots, y_V^t \mid x)$




                 $y^{t+1} = \operatorname{argmax}_{y_2 \in \mathcal{Y}_2} \log p(y_1^t, y_2, y_3^t, \dots, y_V^t \mid x)$
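
The coordinate-wise updates can be sketched as follows. The sweep order, the stopping rule, and the toy unnormalized log-probability `toy_log_p` are assumptions for illustration; only the single-variable argmax step is taken from the slides.

```python
def icm(log_p, y0, state_spaces, max_sweeps=100):
    """Iterated Conditional Modes: maximize over one variable at a time."""
    y = list(y0)
    for _ in range(max_sweeps):
        changed = False
        for s, states in enumerate(state_spaces):
            # Best state for variable s with all other variables held fixed.
            best = max(states, key=lambda z: log_p(y[:s] + [z] + y[s + 1:]))
            if log_p(y[:s] + [best] + y[s + 1:]) > log_p(y):
                y[s] = best
                changed = True
        if not changed:
            break  # no single-variable change improves the objective
    return y

# Hypothetical three-variable chain: unary scores plus a reward for equal neighbors.
UNARY = [[0.0, 2.0], [1.0, 0.0], [0.0, 1.0]]
COUPLING = 1.5

def toy_log_p(y):
    """Unnormalized log-probability of a labeling y (illustrative only)."""
    unary = sum(UNARY[i][yi] for i, yi in enumerate(y))
    pairwise = COUPLING * sum(a == b for a, b in zip(y, y[1:]))
    return unary + pairwise

y_icm = icm(toy_log_p, [0, 0, 0], [(0, 1)] * 3)
```

On this toy instance ICM stops at a labeling that no single change can improve, although the labeling `[1, 1, 1]` scores higher; this is the optimality that the method gives up.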




Neighborhood Size




                 ICM neighborhood $N_t(y^t)$: all states reachable from $y^t$ by changing
                 a single variable (Besag, 1986)
                 Neighborhood size: in general, larger is better (very large-scale neighborhood search, VLSN; Ahuja, 2000)
                 Example: neighborhood along chains




Example: Block ICM
        Block Iterated Conditional Modes (ICM)
        (Kelm et al., 2006), (Kittler and Föglein, 1984)

                 $y^{t+1} = \operatorname{argmax}_{y_{C_1} \in \mathcal{Y}_{C_1}} \log p(y_{C_1}, y^t_{V \setminus C_1} \mid x)$



                 $y^{t+1} = \operatorname{argmax}_{y_{C_2} \in \mathcal{Y}_{C_2}} \log p(y_{C_2}, y^t_{V \setminus C_2} \mid x)$
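
A block update replaces the single-variable argmax by a joint argmax over a subset of variables. In the sketch below each block is optimized by enumeration, which is an assumption made to keep the example short (for chain-shaped blocks, dynamic programming would be the natural choice). The blocks and the toy score are hypothetical.

```python
from itertools import product

def block_icm(log_p, y0, state_spaces, blocks, max_sweeps=100):
    """Block ICM: jointly maximize over the variables of one block at a time."""
    y = list(y0)
    for _ in range(max_sweeps):
        changed = False
        for block in blocks:
            best_y, best_value = y, log_p(y)
            for values in product(*(state_spaces[i] for i in block)):
                candidate = list(y)
                for i, v in zip(block, values):
                    candidate[i] = v
                value = log_p(candidate)
                if value > best_value:
                    best_y, best_value = candidate, value
            if best_y is not y:
                y, changed = best_y, True
        if not changed:
            break  # no block move improves the objective
    return y

# Hypothetical three-variable chain: unary scores plus a reward for equal neighbors.
UNARY = [[0.0, 2.0], [1.0, 0.0], [0.0, 1.0]]
COUPLING = 1.5

def toy_log_p(y):
    """Unnormalized log-probability of a labeling y (illustrative only)."""
    unary = sum(UNARY[i][yi] for i, yi in enumerate(y))
    pairwise = COUPLING * sum(a == b for a, b in zip(y, y[1:]))
    return unary + pairwise

states = [(0, 1)] * 3
y_single = block_icm(toy_log_p, [0, 0, 0], states, blocks=[(0,), (1,), (2,)])
y_blocks = block_icm(toy_log_p, [0, 0, 0], states, blocks=[(0, 1), (1, 2)])
```

With singleton blocks the routine behaves like plain ICM; with the two overlapping pair blocks it reaches a higher-scoring labeling on this toy instance, at the cost of a larger search per step.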



Example: Multilabel Graph-Cut




                 Binary graph-cuts are not applicable to multilabel energy
                 minimization problems
                 (Boykov et al., 2001): two local search algorithms for multilabel
                 problems
                 Sequence of binary directed s-t-mincut problems
                 Iteratively improve multilabel solution




α-β Swap Neighborhood



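                 Select two different labels α and β
                 Fix all variables i for which y_i ∉ {α, β}
                 Optimize over remaining i with y_i ∈ {α, β}


                 $$N_{\alpha,\beta} : \mathcal{Y} \times \mathbb{N} \times \mathbb{N} \to 2^{\mathcal{Y}},$$

                 $$N_{\alpha,\beta}(y, \alpha, \beta) := \{ z \in \mathcal{Y} : z_i = y_i \text{ if } y_i \notin \{\alpha, \beta\}, \text{ otherwise } z_i \in \{\alpha, \beta\} \}.$$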
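The neighborhood definition can be made concrete by enumerating it for a small labeling. This is a sketch for illustration only: the function name `swap_neighborhood` and the example labeling are assumptions, and labels are taken to be integers.

```python
from itertools import product

def swap_neighborhood(y, alpha, beta):
    """Yield every z in N_{alpha,beta}(y, alpha, beta).

    Variables whose label is neither alpha nor beta keep their label;
    all other variables may take either alpha or beta.
    """
    free = [i for i, label in enumerate(y) if label in (alpha, beta)]
    for labels in product((alpha, beta), repeat=len(free)):
        z = list(y)
        for i, label in zip(free, labels):
            z[i] = label
        yield tuple(z)

y = (0, 1, 2, 1, 3)
neighbors = list(swap_neighborhood(y, alpha=1, beta=3))
```

With k variables currently labeled α or β the neighborhood has 2^k members, so explicit enumeration is only useful for small examples like this one.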







α-β-swap illustrated




                 [Figure: α-β-swap move illustrated on a 5-label problem]





α-β-swap derivation

        $$\begin{aligned}
        y^{t+1} &= \operatorname*{argmin}_{y \in N_{\alpha,\beta}(y^t,\alpha,\beta)} E(y; x) \\
        &= \operatorname*{argmin}_{y \in N_{\alpha,\beta}(y^t,\alpha,\beta)} \Big[ \sum_{i \in V} E_i(y_i; x) + \sum_{(i,j) \in \mathcal{E}} E_{i,j}(y_i, y_j; x) \Big] \\
        &= \operatorname*{argmin}_{y \in N_{\alpha,\beta}(y^t,\alpha,\beta)} \Big[
           \sum_{\substack{i \in V, \\ y_i^t \notin \{\alpha,\beta\}}} E_i(y_i^t; x)
         + \sum_{\substack{i \in V, \\ y_i^t \in \{\alpha,\beta\}}} E_i(y_i; x) \\
        &\qquad + \sum_{\substack{(i,j) \in \mathcal{E}, \\ y_i^t \notin \{\alpha,\beta\},\, y_j^t \notin \{\alpha,\beta\}}} E_{i,j}(y_i^t, y_j^t; x)
         + \sum_{\substack{(i,j) \in \mathcal{E}, \\ y_i^t \in \{\alpha,\beta\},\, y_j^t \notin \{\alpha,\beta\}}} E_{i,j}(y_i, y_j^t; x) \\
        &\qquad + \sum_{\substack{(i,j) \in \mathcal{E}, \\ y_i^t \notin \{\alpha,\beta\},\, y_j^t \in \{\alpha,\beta\}}} E_{i,j}(y_i^t, y_j; x)
         + \sum_{\substack{(i,j) \in \mathcal{E}, \\ y_i^t \in \{\alpha,\beta\},\, y_j^t \in \{\alpha,\beta\}}} E_{i,j}(y_i, y_j; x) \Big].
        \end{aligned}$$

                 Constant: terms that involve only fixed labels y^t drop out of the argmin
                 Unary: terms with exactly one free variable combine into one unary energy per variable
                 Pairwise: terms with two free variables form a binary pairwise energy
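The split into constant, unary and pairwise parts can be checked numerically. This is a sketch on a made-up chain model: `split_energy`, `total_energy` and all energy values are assumptions for illustration, and the pairwise function is applied to each edge as listed.

```python
def total_energy(y, unary, edges, pairwise):
    """E(y) = sum_i E_i(y_i) + sum_(i,j) E_ij(y_i, y_j)."""
    return (sum(unary[i][y[i]] for i in range(len(y)))
            + sum(pairwise(y[i], y[j]) for i, j in edges))

def split_energy(z, y_t, alpha, beta, unary, edges, pairwise):
    """Return (constant, unary_part, pairwise_part) of E(z) for z in N_{alpha,beta}(y_t)."""
    free = [label in (alpha, beta) for label in y_t]
    const = unary_part = pairwise_part = 0
    for i, is_free in enumerate(free):
        if is_free:
            unary_part += unary[i][z[i]]      # E_i(y_i), depends on the move
        else:
            const += unary[i][y_t[i]]         # E_i(y_i^t), fixed
    for i, j in edges:
        e = pairwise(z[i], z[j])
        if free[i] and free[j]:
            pairwise_part += e                # both ends free: binary pairwise
        elif free[i] or free[j]:
            unary_part += e                   # one end free: acts as a unary term
        else:
            const += e                        # both ends fixed: constant
    return const, unary_part, pairwise_part

# Toy 5-variable chain with 4 labels (assumed values).
unary = [[0, 3, 1, 2], [2, 0, 2, 1], [1, 2, 0, 2], [2, 1, 2, 0], [3, 2, 1, 0]]
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
pairwise = lambda a, b: abs(a - b)
y_t = [0, 1, 2, 1, 3]
z_a = [0, 3, 2, 1, 1]   # two members of N_{1,3}(y_t)
z_b = [0, 1, 2, 3, 3]
parts_a = split_energy(z_a, y_t, 1, 3, unary, edges, pairwise)
parts_b = split_energy(z_b, y_t, 1, 3, unary, edges, pairwise)
```

The constant part is identical for every member of the neighborhood, which is why it can be ignored when searching for the best swap.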



α-β-swap graph construction
                Directed graph G' = (V', \mathcal{E}')

                $$V' = \{\alpha, \beta\} \cup \{ i \in V : y_i \in \{\alpha, \beta\} \},$$

                $$\begin{aligned}
                \mathcal{E}' = {} & \{ (\alpha, i, t_i^{\alpha}) : \forall i \in V : y_i \in \{\alpha, \beta\} \} \; \cup \\
                & \{ (i, \beta, t_i^{\beta}) : \forall i \in V : y_i \in \{\alpha, \beta\} \} \; \cup \\
                & \{ (i, j, n_{i,j}) : \forall (i,j), (j,i) \in \mathcal{E} : y_i, y_j \in \{\alpha, \beta\} \}.
                \end{aligned}$$

                [Figure: s-t graph with terminal nodes α and β and variable nodes i, j, ..., k; terminal edges weighted t^α and t^β, neighbor edges weighted n_{i,j}]

                Edge weights t_i^α, t_i^β, and n_{i,j}

                $$n_{i,j} = E_{i,j}(\alpha, \beta; x)$$

                $$t_i^{\alpha} = E_i(\alpha; x) + \sum_{\substack{(i,j) \in \mathcal{E}, \\ y_j \notin \{\alpha,\beta\}}} E_{i,j}(\alpha, y_j; x)$$

                $$t_i^{\beta} = E_i(\beta; x) + \sum_{\substack{(i,j) \in \mathcal{E}, \\ y_j \notin \{\alpha,\beta\}}} E_{i,j}(\beta, y_j; x)$$
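A sketch of the construction on a toy chain model follows. It assumes non-negative integer energies and a symmetric pairwise term with E(a, a) = 0. The helper names (`swap_graph`, `min_cut`, `swap_move`) and the terminal names "alpha" and "beta" are illustrative. The min-cut is computed with a plain Edmonds-Karp max-flow, which stands in for whatever solver would be used in practice. With the edge weights above, a node that ends up separated from terminal α has its edge (α, i) cut, pays t_i^α, and therefore takes label α.

```python
from collections import defaultdict, deque

def swap_graph(y, alpha, beta, unary, edges, pairwise):
    """Capacities of the s-t graph: t-links t_i^alpha, t_i^beta and n-links n_ij."""
    free = {i for i, label in enumerate(y) if label in (alpha, beta)}
    t_alpha = {i: unary[i][alpha] for i in free}
    t_beta = {i: unary[i][beta] for i in free}
    cap = defaultdict(int)
    for i, j in edges:
        if i in free and j in free:
            cap[(i, j)] += pairwise(alpha, beta)   # n-link in both directions
            cap[(j, i)] += pairwise(alpha, beta)
        elif i in free:                            # j is fixed: fold into t-links
            t_alpha[i] += pairwise(alpha, y[j])
            t_beta[i] += pairwise(beta, y[j])
        elif j in free:                            # i is fixed: fold into t-links
            t_alpha[j] += pairwise(alpha, y[i])
            t_beta[j] += pairwise(beta, y[i])
    for i in free:
        cap[("alpha", i)] = t_alpha[i]
        cap[(i, "beta")] = t_beta[i]
    return cap, free

def min_cut(cap, source, sink):
    """Edmonds-Karp max-flow; returns (cut value, nodes on the source side)."""
    residual = defaultdict(int, cap)
    adj = defaultdict(set)
    for u, v in cap:
        adj[u].add(v)
        adj[v].add(u)
    flow = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:      # breadth-first augmenting path
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow, set(parent)             # no augmenting path is left
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[e] for e in path)
        for u, v in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
        flow += bottleneck

def swap_move(y, alpha, beta, unary, edges, pairwise):
    """Best labeling in the alpha-beta-swap neighborhood of y."""
    cap, free = swap_graph(y, alpha, beta, unary, edges, pairwise)
    _, source_side = min_cut(cap, "alpha", "beta")
    z = list(y)
    for i in free:
        # Still connected to terminal alpha: edge (i, beta) is cut, cost t_i^beta.
        z[i] = beta if i in source_side else alpha
    return z

# Toy 4-variable chain with 3 labels (assumed values).
unary = [[0, 5, 5], [4, 1, 5], [1, 3, 5], [5, 5, 0]]
edges = [(0, 1), (1, 2), (2, 3)]
pairwise = lambda a, b: 0 if a == b else 2
y = [0, 1, 1, 2]
z = swap_move(y, 0, 1, unary, edges, pairwise)
```

In this toy example the swap between labels 0 and 1 lowers the energy from 8 to 7, and the value of the minimum cut equals the energy of the new labeling because the only fixed variable contributes a constant of zero.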





α-β-swap move

                 [Figure: cut C through the s-t graph, separating the variable nodes i, j, ..., k between the terminals α and β]

                 Side of cut determines y_i ∈ {α, β}
                 Iterate all possible (α, β) combinations
                 Semi-metric requirement on pairwise energies

                        $$E_{i,j}(y_i, y_j; x) = 0 \Leftrightarrow y_i = y_j$$

                        $$E_{i,j}(y_i, y_j; x) = E_{i,j}(y_j, y_i; x) \geq 0$$
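The outer loop over label pairs and the semi-metric check can be sketched as follows. To keep the example short, the binary subproblem is solved by exhaustive enumeration, which is a stand-in for the s-t min-cut and only works for tiny models. All function names and the toy energies are assumptions.

```python
from itertools import combinations, product

def is_semimetric(pairwise, labels):
    """Check E(a, b) = 0 <=> a = b and E(a, b) = E(b, a) >= 0 for all label pairs."""
    return all(
        (pairwise(a, b) == 0) == (a == b)
        and pairwise(a, b) == pairwise(b, a)
        and pairwise(a, b) >= 0
        for a in labels for b in labels
    )

def energy(y, unary, edges, pairwise):
    """E(y) = sum_i unary[i][y_i] + sum_(i,j) pairwise(y_i, y_j)."""
    return (sum(unary[i][y[i]] for i in range(len(y)))
            + sum(pairwise(y[i], y[j]) for i, j in edges))

def best_swap(y, alpha, beta, unary, edges, pairwise):
    """Best labeling in N_{alpha,beta}(y), by enumeration (stand-in for min-cut)."""
    free = [i for i, label in enumerate(y) if label in (alpha, beta)]
    best = list(y)
    for labels in product((alpha, beta), repeat=len(free)):
        z = list(y)
        for i, label in zip(free, labels):
            z[i] = label
        if energy(z, unary, edges, pairwise) < energy(best, unary, edges, pairwise):
            best = z
    return best

def alpha_beta_swap(y, labels, unary, edges, pairwise):
    """Cycle over all label pairs until no swap move lowers the energy."""
    y = list(y)
    improved = True
    while improved:
        improved = False
        for alpha, beta in combinations(labels, 2):
            z = best_swap(y, alpha, beta, unary, edges, pairwise)
            if energy(z, unary, edges, pairwise) < energy(y, unary, edges, pairwise):
                y, improved = z, True
    return y

def truncated_linear(a, b):
    """Assumed semi-metric pairwise energy."""
    return min(abs(a - b), 2)

# Toy 4-variable chain with 3 labels (assumed values).
labels = [0, 1, 2]
unary = [[0, 3, 3], [3, 0, 3], [3, 3, 0], [3, 3, 0]]
edges = [(0, 1), (1, 2), (2, 3)]
y0 = [0, 0, 0, 0]
assert is_semimetric(truncated_linear, labels)
y_final = alpha_beta_swap(y0, labels, unary, edges, truncated_linear)
```

Every accepted move strictly lowers the energy and the label space is finite, so the loop terminates at a labeling that no single α-β-swap can improve.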



Example: Stereo Disparity Estimation




                 Infer depth from two images
                 Discretized multi-label problem
                 α-expansion solution close to optimal





Model Reduction




                 Energy minimization problem: many decisions to make jointly
                 Model reduction
                     1. Fix a subset of decisions
                     2. Optimize the smaller remaining model

                 Example: forcing yi = yj for pairs (i, j)
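Forcing y_i = y_j for groups of variables can be sketched as a contraction of the model. This is a minimal illustration under assumptions: `group_of` maps each variable to a group index, one pairwise function is shared by all edges, and the function names and toy energies are hypothetical.

```python
from collections import Counter

def energy(y, unary, edges, pairwise):
    """E(y) = sum_i unary[i][y_i] + sum_(i,j) pairwise(y_i, y_j)."""
    return (sum(unary[i][y[i]] for i in range(len(y)))
            + sum(pairwise(y[i], y[j]) for i, j in edges))

def reduce_model(unary, edges, pairwise, group_of):
    """Force y_i = y_j whenever group_of[i] == group_of[j]; return the smaller model."""
    num_groups = max(group_of) + 1
    num_labels = len(unary[0])
    reduced_unary = [[0] * num_labels for _ in range(num_groups)]
    for i, g in enumerate(group_of):
        for label in range(num_labels):
            reduced_unary[g][label] += unary[i][label]
    reduced_edges = Counter()
    for i, j in edges:
        g, h = group_of[i], group_of[j]
        if g == h:
            # Both ends share a label, so the edge becomes part of the unary term.
            for label in range(num_labels):
                reduced_unary[g][label] += pairwise(label, label)
        else:
            reduced_edges[(g, h)] += 1   # parallel edges between groups add up
    return reduced_unary, reduced_edges

def reduced_energy(labels, reduced_unary, reduced_edges, pairwise):
    """Energy of one label per group in the reduced model."""
    return (sum(reduced_unary[g][label] for g, label in enumerate(labels))
            + sum(w * pairwise(labels[g], labels[h])
                  for (g, h), w in reduced_edges.items()))

def potts(a, b):
    """Assumed pairwise term: unit penalty for differing labels."""
    return 0 if a == b else 1

# Toy 2-by-3 grid (variable index = 3 * row + column), 2 labels, 2 groups.
unary = [[0, 2], [1, 1], [2, 0], [0, 3], [2, 1], [3, 0]]
edges = [(0, 1), (1, 2), (3, 4), (4, 5), (0, 3), (1, 4), (2, 5)]
group_of = [0, 0, 1, 0, 1, 1]
reduced_unary, reduced_edges = reduce_model(unary, edges, potts, group_of)
```

The reduced model has one decision per group instead of one per variable. Its energy agrees with the original energy on every labeling that respects the grouping, so minimizing it solves the original problem restricted to such labelings.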







Example: Superpixels in Labeling Problems




                            Input image: 500-by-375 pixels (187,500 decisions)


                                    Image with 149 superpixels (149 decisions)


Sebastian Nowozin and Christoph H. Lampert
Part 6: Structured Prediction and Energy Minimization (1/2)

More Related Content

What's hot

Supersymmetric Q-balls and boson stars in (d + 1) dimensions - Jena Talk Mar ...
Supersymmetric Q-balls and boson stars in (d + 1) dimensions - Jena Talk Mar ...Supersymmetric Q-balls and boson stars in (d + 1) dimensions - Jena Talk Mar ...
Supersymmetric Q-balls and boson stars in (d + 1) dimensions - Jena Talk Mar ...Jurgen Riedel
 
Lecture on solving1
Lecture on solving1Lecture on solving1
Lecture on solving1NBER
 
2003 Ames.Models
2003 Ames.Models2003 Ames.Models
2003 Ames.Modelspinchung
 
Supersymmetric Q-balls and boson stars in (d + 1) dimensions
Supersymmetric Q-balls and boson stars in (d + 1) dimensionsSupersymmetric Q-balls and boson stars in (d + 1) dimensions
Supersymmetric Q-balls and boson stars in (d + 1) dimensionsJurgen Riedel
 
Introduction to inverse problems
Introduction to inverse problemsIntroduction to inverse problems
Introduction to inverse problemsDelta Pi Systems
 
Chapter 4 likelihood
Chapter 4 likelihoodChapter 4 likelihood
Chapter 4 likelihoodNBER
 
Neural network Algos formulas
Neural network Algos formulasNeural network Algos formulas
Neural network Algos formulasZarnigar Altaf
 
Dual Gravitons in AdS4/CFT3 and the Holographic Cotton Tensor
Dual Gravitons in AdS4/CFT3 and the Holographic Cotton TensorDual Gravitons in AdS4/CFT3 and the Holographic Cotton Tensor
Dual Gravitons in AdS4/CFT3 and the Holographic Cotton TensorSebastian De Haro
 
An order seven implicit symmetric sheme applied to second order initial value...
An order seven implicit symmetric sheme applied to second order initial value...An order seven implicit symmetric sheme applied to second order initial value...
An order seven implicit symmetric sheme applied to second order initial value...Alexander Decker
 
Chapter2: Likelihood-based approach
Chapter2: Likelihood-based approach Chapter2: Likelihood-based approach
Chapter2: Likelihood-based approach Jae-kwang Kim
 
6 adesh kumar tripathi -71-74
6 adesh kumar tripathi -71-746 adesh kumar tripathi -71-74
6 adesh kumar tripathi -71-74Alexander Decker
 
Principal component analysis and matrix factorizations for learning (part 3) ...
Principal component analysis and matrix factorizations for learning (part 3) ...Principal component analysis and matrix factorizations for learning (part 3) ...
Principal component analysis and matrix factorizations for learning (part 3) ...zukun
 
Introduction to FDA and linear models
 Introduction to FDA and linear models Introduction to FDA and linear models
Introduction to FDA and linear modelstuxette
 
Jackknife algorithm for the estimation of logistic regression parameters
Jackknife algorithm for the estimation of logistic regression parametersJackknife algorithm for the estimation of logistic regression parameters
Jackknife algorithm for the estimation of logistic regression parametersAlexander Decker
 

What's hot (20)

Supersymmetric Q-balls and boson stars in (d + 1) dimensions - Jena Talk Mar ...
Supersymmetric Q-balls and boson stars in (d + 1) dimensions - Jena Talk Mar ...Supersymmetric Q-balls and boson stars in (d + 1) dimensions - Jena Talk Mar ...
Supersymmetric Q-balls and boson stars in (d + 1) dimensions - Jena Talk Mar ...
 
Lecture on solving1
Lecture on solving1Lecture on solving1
Lecture on solving1
 
Holographic Cotton Tensor
Holographic Cotton TensorHolographic Cotton Tensor
Holographic Cotton Tensor
 
2003 Ames.Models
2003 Ames.Models2003 Ames.Models
2003 Ames.Models
 
Supersymmetric Q-balls and boson stars in (d + 1) dimensions
Supersymmetric Q-balls and boson stars in (d + 1) dimensionsSupersymmetric Q-balls and boson stars in (d + 1) dimensions
Supersymmetric Q-balls and boson stars in (d + 1) dimensions
 
Introduction to inverse problems
Introduction to inverse problemsIntroduction to inverse problems
Introduction to inverse problems
 
Chapter 4 likelihood
Chapter 4 likelihoodChapter 4 likelihood
Chapter 4 likelihood
 
Neural network Algos formulas
Neural network Algos formulasNeural network Algos formulas
Neural network Algos formulas
 
Dual Gravitons in AdS4/CFT3 and the Holographic Cotton Tensor
Dual Gravitons in AdS4/CFT3 and the Holographic Cotton TensorDual Gravitons in AdS4/CFT3 and the Holographic Cotton Tensor
Dual Gravitons in AdS4/CFT3 and the Holographic Cotton Tensor
 
Fdtd
FdtdFdtd
Fdtd
 
CMU_13
CMU_13CMU_13
CMU_13
 
An order seven implicit symmetric sheme applied to second order initial value...
An order seven implicit symmetric sheme applied to second order initial value...An order seven implicit symmetric sheme applied to second order initial value...
An order seven implicit symmetric sheme applied to second order initial value...
 
Chapter2: Likelihood-based approach
Chapter2: Likelihood-based approach Chapter2: Likelihood-based approach
Chapter2: Likelihood-based approach
 
6 adesh kumar tripathi -71-74
6 adesh kumar tripathi -71-746 adesh kumar tripathi -71-74
6 adesh kumar tripathi -71-74
 
Symmetrical2
Symmetrical2Symmetrical2
Symmetrical2
 
presentazione
presentazionepresentazione
presentazione
 
Principal component analysis and matrix factorizations for learning (part 3) ...
Principal component analysis and matrix factorizations for learning (part 3) ...Principal component analysis and matrix factorizations for learning (part 3) ...
Principal component analysis and matrix factorizations for learning (part 3) ...
 
Introduction to FDA and linear models
 Introduction to FDA and linear models Introduction to FDA and linear models
Introduction to FDA and linear models
 
Propensity albert
Propensity albertPropensity albert
Propensity albert
 
Jackknife algorithm for the estimation of logistic regression parameters
Jackknife algorithm for the estimation of logistic regression parametersJackknife algorithm for the estimation of logistic regression parameters
Jackknife algorithm for the estimation of logistic regression parameters
 

Viewers also liked

optimization simplex method introduction
optimization simplex method introductionoptimization simplex method introduction
optimization simplex method introductionKunal Shinde
 
Daa:Dynamic Programing
Daa:Dynamic ProgramingDaa:Dynamic Programing
Daa:Dynamic Programingrupali_2bonde
 
Baehyun min pareto_optimality
Baehyun min pareto_optimalityBaehyun min pareto_optimality
Baehyun min pareto_optimalityBaehyun Min
 
Mathematical modeling and parameter estimation for water quality management s...
Mathematical modeling and parameter estimation for water quality management s...Mathematical modeling and parameter estimation for water quality management s...
Mathematical modeling and parameter estimation for water quality management s...Kamal Pradhan
 
Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Progres...
Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Progres...Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Progres...
Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Progres...AtakanAral
 
Dynamic programming
Dynamic programmingDynamic programming
Dynamic programmingShakil Ahmed
 
Knapsack problem and Memory Function
Knapsack problem and Memory FunctionKnapsack problem and Memory Function
Knapsack problem and Memory FunctionBarani Tharan
 
Lecture 8 dynamic programming
Lecture 8 dynamic programmingLecture 8 dynamic programming
Lecture 8 dynamic programmingOye Tu
 
Simplex method - Maximisation Case
Simplex method - Maximisation CaseSimplex method - Maximisation Case
Simplex method - Maximisation CaseJoseph Konnully
 
Dynamic Programming
Dynamic ProgrammingDynamic Programming
Dynamic ProgrammingSahil Kumar
 
Simplex method
Simplex methodSimplex method
Simplex methodtatteya
 
Special Cases in Simplex Method
Special Cases in Simplex MethodSpecial Cases in Simplex Method
Special Cases in Simplex MethodDivyansh Verma
 
Linear programming using the simplex method
Linear programming using the simplex methodLinear programming using the simplex method
Linear programming using the simplex methodShivek Khurana
 
Operation Research (Simplex Method)
Operation Research (Simplex Method)Operation Research (Simplex Method)
Operation Research (Simplex Method)Shivani Gautam
 

Viewers also liked (20)

Amarish
AmarishAmarish
Amarish
 
optimization simplex method introduction
optimization simplex method introductionoptimization simplex method introduction
optimization simplex method introduction
 
A star
A starA star
A star
 
Daa:Dynamic Programing
Daa:Dynamic ProgramingDaa:Dynamic Programing
Daa:Dynamic Programing
 
Baehyun min pareto_optimality
Baehyun min pareto_optimalityBaehyun min pareto_optimality
Baehyun min pareto_optimality
 
Mathematical modeling and parameter estimation for water quality management s...
Mathematical modeling and parameter estimation for water quality management s...Mathematical modeling and parameter estimation for water quality management s...
Mathematical modeling and parameter estimation for water quality management s...
 
Linear programming
Linear programmingLinear programming
Linear programming
 
Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Progres...
Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Progres...Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Progres...
Modeling and Optimization of Resource Allocation in Cloud [PhD Thesis Progres...
 
aaoczc2252
aaoczc2252aaoczc2252
aaoczc2252
 
Dynamic programming
Dynamic programmingDynamic programming
Dynamic programming
 
Knapsack problem and Memory Function
Knapsack problem and Memory FunctionKnapsack problem and Memory Function
Knapsack problem and Memory Function
 
Lecture 8 dynamic programming
Lecture 8 dynamic programmingLecture 8 dynamic programming
Lecture 8 dynamic programming
 
Dynamic pgmming
Dynamic pgmmingDynamic pgmming
Dynamic pgmming
 
Simplex method - Maximisation Case
Simplex method - Maximisation CaseSimplex method - Maximisation Case
Simplex method - Maximisation Case
 
Dynamic Programming
Dynamic ProgrammingDynamic Programming
Dynamic Programming
 
Simplex method
Simplex methodSimplex method
Simplex method
 
Simplex method
Simplex methodSimplex method
Simplex method
 
Special Cases in Simplex Method
Special Cases in Simplex MethodSpecial Cases in Simplex Method
Special Cases in Simplex Method
 
Linear programming using the simplex method
Linear programming using the simplex methodLinear programming using the simplex method
Linear programming using the simplex method
 
Operation Research (Simplex Method)
Operation Research (Simplex Method)Operation Research (Simplex Method)
Operation Research (Simplex Method)
 

Similar to 04 structured prediction and energy minimization part 1

Slides2 130201091056-phpapp01
Slides2 130201091056-phpapp01Slides2 130201091056-phpapp01
Slides2 130201091056-phpapp01Deb Roy
 
Bayesian case studies, practical 2
Bayesian case studies, practical 2Bayesian case studies, practical 2
Bayesian case studies, practical 2Robin Ryder
 
A common fixed point of integral type contraction in generalized metric spacess
A  common fixed point of integral type contraction in generalized metric spacessA  common fixed point of integral type contraction in generalized metric spacess
A common fixed point of integral type contraction in generalized metric spacessAlexander Decker
 
Algebra 2 Unit 5 Lesson 7
Algebra 2 Unit 5 Lesson 7Algebra 2 Unit 5 Lesson 7
Algebra 2 Unit 5 Lesson 7Kate Nowak
 
Uncoupled Regression from Pairwise Comparison Data
Uncoupled Regression from Pairwise Comparison DataUncoupled Regression from Pairwise Comparison Data
Uncoupled Regression from Pairwise Comparison DataLiyuan Xu
 
Bayesian regression models and treed Gaussian process models
Bayesian regression models and treed Gaussian process modelsBayesian regression models and treed Gaussian process models
Bayesian regression models and treed Gaussian process modelsTommaso Rigon
 
Rainone - Groups St. Andrew 2013
Rainone - Groups St. Andrew 2013Rainone - Groups St. Andrew 2013
Rainone - Groups St. Andrew 2013Raffaele Rainone
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
5.1 Defining and visualizing functions. A handout.
5.1 Defining and visualizing functions. A handout.5.1 Defining and visualizing functions. A handout.
5.1 Defining and visualizing functions. A handout.Jan Plaza
 
10.11648.j.pamj.20170601.11
10.11648.j.pamj.20170601.1110.11648.j.pamj.20170601.11
10.11648.j.pamj.20170601.11DAVID GIKUNJU
 
An introduction to quantum stochastic calculus
An introduction to quantum stochastic calculusAn introduction to quantum stochastic calculus
An introduction to quantum stochastic calculusSpringer
 
Improved Trainings of Wasserstein GANs (WGAN-GP)
Improved Trainings of Wasserstein GANs (WGAN-GP)Improved Trainings of Wasserstein GANs (WGAN-GP)
Improved Trainings of Wasserstein GANs (WGAN-GP)Sangwoo Mo
 
5. cem granger causality ecm
5. cem granger causality  ecm 5. cem granger causality  ecm
5. cem granger causality ecm Quang Hoang
 
Applications of the surface finite element method
Applications of the surface finite element methodApplications of the surface finite element method
Applications of the surface finite element methodtr1987
 
The multilayer perceptron
The multilayer perceptronThe multilayer perceptron
The multilayer perceptronESCOM
 
Neutrosophic Soft Topological Spaces on New Operations
Neutrosophic Soft Topological Spaces on New OperationsNeutrosophic Soft Topological Spaces on New Operations
Neutrosophic Soft Topological Spaces on New OperationsIJSRED
 

04 structured prediction and energy minimization part 1

  • 1. Prediction Problem G: Generality G: Optimality Part 6: Structured Prediction and Energy Minimization (1/2) Sebastian Nowozin and Christoph H. Lampert Colorado Springs, 25th June 2011 Sebastian Nowozin and Christoph H. Lampert Part 6: Structured Prediction and Energy Minimization (1/2)
  • 2. Prediction Problem
    $$y^* = f(x) = \operatorname*{argmax}_{y \in \mathcal{Y}} g(x, y)$$
    - $g(x, y) = p(y \mid x)$, factor graphs/MRF/CRF,
    - $g(x, y) = -E(y; x, w)$, factor graphs/MRF/CRF,
    - $g(x, y) = \langle w, \psi(x, y) \rangle$, linear model (e.g. multiclass SVM),
    → difficulty: $\mathcal{Y}$ finite but very large
  • 4. Prediction Problem (cont)
    Definition (Optimization Problem). Given $(g, \mathcal{Y}, \mathcal{G}, x)$, with feasible set $\mathcal{Y} \subseteq \mathcal{G}$ over decision domain $\mathcal{G}$, and given an input instance $x \in \mathcal{X}$ and an objective function $g : \mathcal{X} \times \mathcal{G} \to \mathbb{R}$, find the optimal value
    $$\alpha = \sup_{y \in \mathcal{Y}} g(x, y),$$
    and, if the supremum exists, find an optimal solution $y^* \in \mathcal{Y}$ such that $g(x, y^*) = \alpha$.
  • 5. The feasible set
    Ingredients:
    - Decision domain $\mathcal{G}$, typically simple ($\mathcal{G} = \mathbb{R}^d$, $\mathcal{G} = 2^V$, etc.)
    - Feasible set $\mathcal{Y} \subseteq \mathcal{G}$, defining the problem-specific structure
    - Objective function $g : \mathcal{X} \times \mathcal{G} \to \mathbb{R}$.
    Terminology:
    - $\mathcal{Y} = \mathcal{G}$: unconstrained optimization problem,
    - $\mathcal{G}$ finite: discrete optimization problem,
    - $\mathcal{G} = 2^{\Sigma}$ for ground set $\Sigma$: combinatorial optimization problem,
    - $\mathcal{Y} = \emptyset$: infeasible problem.
  • 6. Example: Feasible Sets (cont)
    [Figure: chain of three variables $Y_i$, $Y_j$, $Y_k$ in states (+1), (−1), (−1), with unary terms $h_i y_i$, $h_j y_j$, $h_k y_k$ and pairwise terms $J_{ij} y_i y_j$, $J_{jk} y_j y_k$]
    Ising model with external field:
    - Graph $G = (V, E)$
    - "External field": $h \in \mathbb{R}^V$
    - Interaction matrix: $J \in \mathbb{R}^{V \times V}$
    - Objective, defined on $y_i \in \{-1, 1\}$:
    $$g(y) = h_i y_i + h_j y_j + h_k y_k + \tfrac{1}{2} J_{ij} y_i y_j + \tfrac{1}{2} J_{jk} y_j y_k$$
  • 7. Example: Feasible Sets (cont)
    Ising model with external field:
    $$\mathcal{Y} = \mathcal{G} = \{-1, +1\}^V$$
    $$g(y) = \frac{1}{2} \sum_{(i,j) \in E} J_{i,j}\, y_i y_j + \sum_{i \in V} h_i y_i$$
    - Unconstrained
    - Objective function contains quadratic terms
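To make the prediction problem concrete, the following minimal sketch evaluates the Ising objective above and finds $y^*$ by enumerating all $2^{|V|}$ labelings. The three-node chain, the values of `h` and `J`, and the function names are assumptions chosen for illustration; they are not taken from the slides.

```python
from itertools import product

def ising_objective(y, h, J):
    """g(y) = 1/2 * sum_{(i,j) in E} J[i,j] * y_i * y_j + sum_i h[i] * y_i."""
    pairwise = sum(J_ij * y[i] * y[j] for (i, j), J_ij in J.items())
    unary = sum(h_i * y_i for h_i, y_i in zip(h, y))
    return 0.5 * pairwise + unary

def brute_force_argmax(h, J):
    """Enumerate all 2^|V| labelings in {-1,+1}^V and return the best one."""
    best_y = max(product((-1, +1), repeat=len(h)),
                 key=lambda y: ising_objective(y, h, J))
    return best_y, ising_objective(best_y, h, J)

# Hypothetical three-node chain i - j - k (illustrative values only).
h = [1.0, -0.5, -0.5]
J = {(0, 1): 1.0, (1, 2): 2.0}
y_star, g_star = brute_force_argmax(h, J)
print(y_star, g_star)
```

Enumeration is only feasible for a handful of variables, which is the difficulty noted earlier: $\mathcal{Y}$ is finite but grows exponentially with $|V|$.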
  • 8. Example: Feasible Sets (cont)
    $$\mathcal{G} = \{0, 1\}^{(V \times \{-1,+1\}) \cup (E \times \{-1,+1\} \times \{-1,+1\})}$$
    $$\mathcal{Y} = \{ y \in \mathcal{G} : \forall i \in V : y_{i,-1} + y_{i,+1} = 1,$$
    $$\qquad \forall (i,j) \in E : y_{i,j,+1,+1} + y_{i,j,+1,-1} = y_{i,+1},$$
    $$\qquad \forall (i,j) \in E : y_{i,j,-1,+1} + y_{i,j,-1,-1} = y_{i,-1} \}$$
    $$g(y) = \frac{1}{2} \sum_{(i,j) \in E} J_{i,j} (y_{i,j,+1,+1} + y_{i,j,-1,-1}) - \frac{1}{2} \sum_{(i,j) \in E} J_{i,j} (y_{i,j,+1,-1} + y_{i,j,-1,+1}) + \sum_{i \in V} h_i (y_{i,+1} - y_{i,-1})$$
    - Constrained, more variables
    - Objective function contains linear terms only
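The relation between the quadratic and the linear formulation can be checked numerically. The sketch below maps a labeling in $\{-1,+1\}^V$ to 0/1 indicator variables, tests the constraints listed above, and compares both objectives on every labeling of a small chain. The mapping `to_indicators`, the example values, and all function names are assumptions made for illustration.

```python
from itertools import product

STATES = (-1, +1)

def quadratic_objective(y, h, J):
    """Ising objective on y in {-1,+1}^V."""
    pairwise = sum(J_ij * y[i] * y[j] for (i, j), J_ij in J.items())
    return 0.5 * pairwise + sum(h_i * y_i for h_i, y_i in zip(h, y))

def to_indicators(y, edges):
    """Map a labeling to the 0/1 indicator vector of the lifted domain."""
    z = {}
    for i, y_i in enumerate(y):
        for s in STATES:
            z[(i, s)] = int(y_i == s)
    for (i, j) in edges:
        for s, t in product(STATES, STATES):
            z[(i, j, s, t)] = int(y[i] == s and y[j] == t)
    return z

def is_feasible(z, num_nodes, edges):
    """Check the constraints that define the feasible set of the lifted problem."""
    ok = all(z[(i, -1)] + z[(i, +1)] == 1 for i in range(num_nodes))
    for (i, j) in edges:
        ok = ok and z[(i, j, +1, +1)] + z[(i, j, +1, -1)] == z[(i, +1)]
        ok = ok and z[(i, j, -1, +1)] + z[(i, j, -1, -1)] == z[(i, -1)]
    return ok

def linear_objective(z, h, J):
    """Objective of the lifted problem: linear in the indicator variables."""
    agree = sum(J_ij * (z[(i, j, +1, +1)] + z[(i, j, -1, -1)])
                for (i, j), J_ij in J.items())
    disagree = sum(J_ij * (z[(i, j, +1, -1)] + z[(i, j, -1, +1)])
                   for (i, j), J_ij in J.items())
    field = sum(h_i * (z[(i, +1)] - z[(i, -1)]) for i, h_i in enumerate(h))
    return 0.5 * agree - 0.5 * disagree + field

# Hypothetical three-node chain (illustrative values only).
h = [1.0, -0.5, -0.5]
J = {(0, 1): 1.0, (1, 2): 2.0}
edges = list(J)
all_match = all(
    is_feasible(to_indicators(y, edges), len(h), edges)
    and abs(linear_objective(to_indicators(y, edges), h, J)
            - quadratic_objective(y, h, J)) < 1e-12
    for y in product(STATES, repeat=len(h))
)
print(all_match)
```

The agreement follows because $y_i y_j$ equals $+1$ when the two labels agree and $-1$ otherwise, and $h_i y_i = h_i (y_{i,+1} - y_{i,-1})$.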
  • 9-10. Evaluating f: what do we want?
    $$f(x) = \operatorname*{argmax}_{y \in \mathcal{Y}} g(x, y)$$
    For evaluating $f(x)$ we want an algorithm that
    1. is general: applicable to all instances of the problem,
    2. is optimal: provides an optimal $y^*$,
    3. has good worst-case complexity: for all instances the runtime and space is acceptably bounded,
    4. is integral: its solutions are restricted to $\mathcal{Y}$,
    5. is deterministic: its results and runtime are reproducible and depend on the input data only.
    wanting all of them → impossible
  • 11. Giving up some properties
    [Diagram: the hard problem surrounded by the five properties: generality, optimality, worst-case complexity, integrality, determinism]
    → giving up one or more properties
    - allows us to design algorithms satisfying the remaining properties
    - might be sufficient for the task at hand
  • 12. G: Generality
    [Diagram: the hard problem surrounded by the five properties: generality, optimality, worst-case complexity, integrality, determinism]
  • 13. Giving up Generality
    Identify an interesting and tractable subset of instances
    [Figure: set of all instances, containing a tractable subset]
  • 14. Example: MAP Inference in Markov Random Fields
    Although NP-hard in general, it is tractable...
    - with low tree-width (Lauritzen, Spiegelhalter, 1988)
    - with binary states, pairwise submodular interactions (Boykov, Jolly, 2001)
    - with binary states, pairwise interactions (only), planar graph structure (Globerson, Jaakkola, 2006)
    - with submodular pairwise interactions (Schlesinger, 2006)
    - with $P^n$-Potts higher order factors (Kohli, Kumar, Torr, 2007)
    - with perfect graph structure (Jebara, 2009)
  • 15. Binary Graph-Cuts
    Energy function: unary and pairwise
    $$E(y; x, w) = \sum_{F \in \mathcal{F}_1} E_F(y_F; x, w_{t_F}) + \sum_{F \in \mathcal{F}_2} E_F(y_F; x, w_{t_F})$$
    Restriction 1 (wlog):
    $$E_F(y_i; x, w_{t_F}) \ge 0$$
    Restriction 2 (regular/submodular/attractive):
    $$E_F(y_i, y_j; x, w_{t_F}) = 0 \quad \text{if } y_i = y_j,$$
    $$E_F(y_i, y_j; x, w_{t_F}) = E_F(y_j, y_i; x, w_{t_F}) \ge 0 \quad \text{otherwise.}$$
  • 17. Binary Graph-Cuts (cont)
    Construct auxiliary undirected graph:
    - One node $\{i\}_{i \in V}$ per variable
    - Two extra nodes: source $s$, sink $t$
    - Edges and their graph cut weights:
        $\{i, j\}$: $E_F(y_i = 0, y_j = 1; x, w_{t_F})$
        $\{i, s\}$: $E_F(y_i = 1; x, w_{t_F})$
        $\{i, t\}$: $E_F(y_i = 0; x, w_{t_F})$
    [Figure: auxiliary graph with variable nodes i, j, k, l, m, n, source s, sink t, and example edges {i, s} and {i, t}]
    Find linear s-t-mincut
    Solution defines optimal binary labeling of the original energy minimization problem
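The construction above can be sketched end to end on a tiny problem. The code below builds the auxiliary graph with the listed edge weights, computes a minimum s-t cut with a plain Edmonds-Karp max-flow, and reads the labeling off the cut (source side means $y_i = 0$, sink side means $y_i = 1$). The max-flow routine, the data layout (`unary`, `pairwise`) and the example values are assumptions for illustration; the slides do not prescribe a particular min-cut algorithm.

```python
from collections import deque
from itertools import product

def add_edge(cap, u, v, c):
    """Undirected edge {u, v}: capacity c in both directions."""
    cap.setdefault(u, {})
    cap.setdefault(v, {})
    cap[u][v] = cap[u].get(v, 0.0) + c
    cap[v][u] = cap[v].get(u, 0.0) + c

def min_cut_source_side(cap, s, t):
    """Edmonds-Karp max-flow; returns the nodes reachable from s in the final residual graph."""
    res = {u: dict(nbrs) for u, nbrs in cap.items()}  # residual capacities
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:  # breadth-first search for an augmenting path
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return set(parent)  # no augmenting path left: this is the source side
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= bottleneck
            res[v][u] += bottleneck

def graph_cut_labeling(unary, pairwise):
    """unary[i] = (E_i(0), E_i(1)) >= 0; pairwise[(i, j)] >= 0 is paid when y_i != y_j."""
    cap = {}
    for i, (e0, e1) in enumerate(unary):
        add_edge(cap, "s", i, e1)  # cut when i ends on the sink side (y_i = 1)
        add_edge(cap, i, "t", e0)  # cut when i ends on the source side (y_i = 0)
    for (i, j), e in pairwise.items():
        add_edge(cap, i, j, e)     # cut when i and j are separated
    source_side = min_cut_source_side(cap, "s", "t")
    return tuple(0 if i in source_side else 1 for i in range(len(unary)))

def energy(y, unary, pairwise):
    return (sum(unary[i][y_i] for i, y_i in enumerate(y))
            + sum(e for (i, j), e in pairwise.items() if y[i] != y[j]))

# Hypothetical three-variable chain (illustrative values only).
unary = [(0.0, 3.0), (1.0, 0.0), (2.0, 0.0)]
pairwise = {(0, 1): 2.0, (1, 2): 0.5}
y_cut = graph_cut_labeling(unary, pairwise)
y_brute = min(product((0, 1), repeat=3), key=lambda y: energy(y, unary, pairwise))
print(y_cut, energy(y_cut, unary, pairwise))
```

In this example the middle variable prefers label 1 on its own, but the strong coupling to its left neighbor pulls it to label 0, so the cut solution differs from independent per-variable decisions.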
  • 18. Example: Figure-Ground Segmentation
    [Figure: input image (http://pdphoto.org)]
  • 19. Example: Figure-Ground Segmentation
    [Figure: color model log-odds]
  • 20. Example: Figure-Ground Segmentation
    [Figure: independent decisions]
  • 21. Example: Figure-Ground Segmentation
    $$g(x, y, w) = \sum_{i \in V} \log p(y_i \mid x_i) + w \sum_{(i,j) \in E} C(x_i, x_j)\, I(y_i = y_j)$$
    - Gradient strength:
    $$C(x_i, x_j) = \exp(\gamma \|x_i - x_j\|^2)$$
    - $\gamma$ estimated from mean edge strength (Blake et al, 2004)
    - $w \ge 0$ controls smoothing
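The effect of $w$ can be illustrated on a one-dimensional toy "image". The sketch below evaluates the objective exactly as written above and maximizes it by enumeration for two values of $w$. Everything specific here is an assumption for illustration: the four pixel intensities, the foreground probabilities, the chain neighborhood, and in particular the negative value of `gamma`, chosen so that similar neighboring pixels are coupled more strongly than dissimilar ones.

```python
import math
from itertools import product

def contrast(x_i, x_j, gamma):
    """C(x_i, x_j) = exp(gamma * |x_i - x_j|^2) for scalar intensities."""
    return math.exp(gamma * (x_i - x_j) ** 2)

def objective(y, x, log_p, edges, w, gamma):
    """g(x, y, w): per-pixel log-probabilities plus weighted agreement terms."""
    unary = sum(log_p[i][y_i] for i, y_i in enumerate(y))
    pairwise = sum(contrast(x[i], x[j], gamma) for (i, j) in edges if y[i] == y[j])
    return unary + w * pairwise

def best_labeling(x, log_p, edges, w, gamma):
    """Brute-force argmax over all binary labelings (toy sizes only)."""
    return max(product((0, 1), repeat=len(x)),
               key=lambda y: objective(y, x, log_p, edges, w, gamma))

# Hypothetical data: three dark pixels, one bright pixel, one noisy color-model output.
x = [0.1, 0.2, 0.15, 0.9]
p_fg = [0.2, 0.6, 0.3, 0.9]                      # assumed p(y_i = 1 | x_i)
log_p = [(math.log(1 - p), math.log(p)) for p in p_fg]
edges = [(0, 1), (1, 2), (2, 3)]
gamma = -10.0                                    # assumption: negative, see lead-in

y_independent = best_labeling(x, log_p, edges, w=0.0, gamma=gamma)
y_smoothed = best_labeling(x, log_p, edges, w=1.0, gamma=gamma)
print(y_independent, y_smoothed)
```

With $w = 0$ each pixel is labeled independently and the noisy second pixel flips to foreground; with $w = 1$ it is smoothed away, while the boundary before the bright pixel survives because the contrast term there is close to zero.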
  • 22. Example: Figure-Ground Segmentation
    [Figure: segmentation result for $w = 0$]
  • 23. Example: Figure-Ground Segmentation
    [Figure: segmentation result for small $w > 0$]
  • 24. Example: Figure-Ground Segmentation
    [Figure: segmentation result for medium $w > 0$]
  • 25. Example: Figure-Ground Segmentation
    [Figure: segmentation result for large $w > 0$]
  • 26. General Binary Case
    Is there a larger class of energies for which binary graph cuts are applicable? (Kolmogorov and Zabih, 2004), (Freedman and Drineas, 2005)
    Theorem (Regular Binary Energies).
    $$E(y; x, w) = \sum_{F \in \mathcal{F}_1} E_F(y_F; x, w_{t_F}) + \sum_{F \in \mathcal{F}_2} E_F(y_F; x, w_{t_F})$$
    is an energy function of binary variables containing only unary and pairwise factors. The discrete energy minimization problem $\operatorname{argmin}_y E(y; x, w)$ is representable as a graph cut problem if and only if all pairwise energy functions $E_F$ for $F \in \mathcal{F}_2$ with $F = \{i, j\}$ satisfy
    $$E_{i,j}(0, 0) + E_{i,j}(1, 1) \le E_{i,j}(0, 1) + E_{i,j}(1, 0).$$
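The regularity condition is easy to test for a given set of pairwise tables. The following minimal sketch checks the inequality $E_{i,j}(0,0) + E_{i,j}(1,1) \le E_{i,j}(0,1) + E_{i,j}(1,0)$ for each factor; the 2×2 table layout and the two example tables are assumptions for illustration.

```python
def is_regular(E_ij):
    """E_ij[a][b] is the pairwise energy for (y_i, y_j) = (a, b), with a, b in {0, 1}."""
    return E_ij[0][0] + E_ij[1][1] <= E_ij[0][1] + E_ij[1][0]

def all_pairwise_regular(pairwise):
    """True if every pairwise factor satisfies the regularity condition."""
    return all(is_regular(E_ij) for E_ij in pairwise.values())

# Hypothetical pairwise tables.
attractive = [[0.0, 1.0], [1.0, 0.0]]   # penalizes disagreement
repulsive = [[1.0, 0.0], [0.0, 1.0]]    # penalizes agreement

print(is_regular(attractive), is_regular(repulsive))
```

The attractive table satisfies the condition and the repulsive one violates it, so by the theorem an energy containing the repulsive factor is not representable as a graph cut problem.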
  • 28. Example: Class-independent Object Hypotheses
    (Carreira and Sminchisescu, 2010)
    - PASCAL VOC 2009/2010 segmentation winner
    - Generate class-independent object hypotheses
    - Energy (almost) as before:
    $$g(x, y, w) = \sum_{i \in V} E_i(y_i) + w \sum_{(i,j) \in E} C(x_i, x_j)\, I(y_i = y_j)$$
    - Fixed unaries:
    $$E_i(y_i) = \begin{cases} \infty & \text{if } i \in V_{fg} \text{ and } y_i = 0 \\ \infty & \text{if } i \in V_{bg} \text{ and } y_i = 1 \\ 0 & \text{otherwise} \end{cases}$$
    - Test all $w \ge 0$ using parametric max-flow (Picard and Queyranne, 1980), (Kolmogorov et al., 2007)
  • 31. Example: Class-independent Object Hypotheses (cont)
    [Figure: input image (http://pdphoto.org)]
  • 32. Example: Class-independent Object Hypotheses (cont)
    [Figure: CPMC proposal segmentations (Carreira and Sminchisescu, 2010)]
  • 33. G: Optimality
    [Diagram: the hard problem surrounded by the five properties: generality, optimality, worst-case complexity, integrality, determinism]
  • 34. G: Optimality. Giving up Optimality. Solving for $y^*$ is hard, but is it necessary? Pragmatic motivation: in many applications a close-to-optimal solution is good enough. Computational motivation: the set of "good" solutions might be large, and finding just one element can be easy. For machine learning models: modeling error (we always use the wrong model); estimation error (the preference for $y^*$ might be an artifact).
  • 36-39. G: Optimality. Local Search. [Figure, built up over four slides: search space $\mathcal{Y}$ with iterates $y^0, y^1, y^2, y^3$, their neighborhoods $N(y^0), N(y^1), N(y^2), N(y^3)$, and the optimum $y^*$ with its neighborhood $N(y^*)$]
  • 40. G: Optimality. Local Search. [Figure: iterates $y^0, \dots, y^3$ with neighborhoods $N(y^0), \dots, N(y^3)$, and the optimum $y^*$ with $N(y^*)$] $N_t : \mathcal{Y} \to 2^{\mathcal{Y}}$, neighborhood system. Optimization with respect to $N_t(y)$ must be tractable: $y^{t+1} = \operatorname{argmax}_{y \in N_t(y^t)} g(x, y)$.
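
The update rule on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the single-variable neighborhood, and the toy score `g` (the input `x` is folded into `g`) are assumptions made for the example.

```python
def local_search(g, y0, neighborhood, max_steps=1000):
    """Repeat y <- argmax_{z in N(y)} g(z) until no neighbor improves g."""
    y = y0
    for _ in range(max_steps):
        best = max(neighborhood(y), key=g)
        if g(best) <= g(y):
            break  # y is a local optimum with respect to the neighborhood
        y = best
    return y


def single_flip_neighborhood(y, num_labels=2):
    """All labelings that differ from y in at most one variable (hypothetical choice)."""
    for s in range(len(y)):
        for z_s in range(num_labels):
            yield y[:s] + (z_s,) + y[s + 1:]


# Toy score (assumed): per-variable preferences plus a bonus of 1 for each
# pair of adjacent variables that agree.
unary = [(2.0, 0.0), (0.0, 0.5), (0.0, 1.0)]


def g(y):
    score = sum(unary[i][label] for i, label in enumerate(y))
    score += sum(1.0 for a, b in zip(y, y[1:]) if a == b)
    return score


y_hat = local_search(g, (1, 1, 1), single_flip_neighborhood)
print(y_hat, g(y_hat))
```

The returned labeling is only guaranteed to be optimal within its own neighborhood; whether it is also globally optimal depends on the score and on the neighborhood system.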
  • 41. G: Optimality. Example: Iterated Conditional Modes (ICM) (Besag, 1986). $g(x, y) = \log p(y|x)$, $y^* = \operatorname{argmax}_{y \in \mathcal{Y}} \log p(y|x)$. Neighborhoods: $N_s(y) = \{(y_1, \dots, y_{s-1}, z_s, y_{s+1}, \dots, y_S) \mid z_s \in \mathcal{Y}_s\}$.
  • 42. G: Optimality. Example: Iterated Conditional Modes (ICM) (Besag, 1986). $y^{t+1} = \operatorname{argmax}_{y_1 \in \mathcal{Y}_1} \log p(y_1, y_2^t, \dots, y_V^t \mid x)$.
  • 43. G: Optimality. Example: Iterated Conditional Modes (ICM) (Besag, 1986). $y^{t+1} = \operatorname{argmax}_{y_2 \in \mathcal{Y}_2} \log p(y_1^t, y_2, y_3^t, \dots, y_V^t \mid x)$.
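
The coordinate-wise updates above can be sketched as follows. This is an illustrative sketch under stated assumptions: $\log p(y|x)$ is taken, up to a constant, to be a sum of unary and symmetric pairwise log-potentials, and the names `icm`, `log_score`, `pair_score` and the toy numbers are hypothetical.

```python
def log_score(y, unary, pair_score, edges):
    """Unnormalized log p(y|x): unary terms plus pairwise terms over the edges."""
    total = sum(unary[s][label] for s, label in enumerate(y))
    total += sum(pair_score(y[s], y[t]) for s, t in edges)
    return total


def icm(unary, pair_score, edges, y0, max_sweeps=100):
    """Update one variable at a time to its best label, all others held fixed."""
    y = list(y0)
    nbrs = {s: [] for s in range(len(y))}
    for s, t in edges:
        nbrs[s].append(t)
        nbrs[t].append(s)
    for _ in range(max_sweeps):
        changed = False
        for s in range(len(y)):
            def local(label):
                # Only terms that involve variable s matter for this update.
                return unary[s][label] + sum(pair_score(label, y[t]) for t in nbrs[s])
            best = max(range(len(unary[s])), key=local)
            if local(best) > local(y[s]):
                y[s] = best
                changed = True
        if not changed:
            break  # no single-variable change improves the score
    return y


# Toy chain with 4 variables and 3 labels (assumed values).
unary = [[0.0, 1.0, 0.0], [0.0, 0.0, 0.5], [2.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
edges = [(0, 1), (1, 2), (2, 3)]


def pair_score(a, b):
    return 1.0 if a == b else 0.0


y0 = [1, 2, 0, 2]
y_icm = icm(unary, pair_score, edges, y0)
print(y_icm, log_score(y_icm, unary, pair_score, edges))
```

Each accepted update strictly increases the score, so the sweeps terminate at a labeling that no single-variable change can improve.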
  • 44. G: Optimality. Neighborhood Size. ICM neighborhood $N_t(y^t)$: all states reachable from $y^t$ by changing a single variable (Besag, 1986). Neighborhood size: in general, larger is better (very large-scale neighborhood search, VLSN; Ahuja, 2000). Example: neighborhood along chains.
  • 45. G: Optimality. Example: Block ICM. Block Iterated Conditional Modes (ICM) (Kelm et al., 2006), (Kittler and Föglein, 1984). $y^{t+1} = \operatorname{argmax}_{y_{C_1} \in \mathcal{Y}_{C_1}} \log p(y_{C_1}, y^t_{V \setminus C_1} \mid x)$.
  • 46. G: Optimality. Example: Block ICM. Block Iterated Conditional Modes (ICM) (Kelm et al., 2006), (Kittler and Föglein, 1984). $y^{t+1} = \operatorname{argmax}_{y_{C_2} \in \mathcal{Y}_{C_2}} \log p(y_{C_2}, y^t_{V \setminus C_2} \mid x)$.
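
A small sketch of why larger blocks can help. The block maximization here is done by brute-force enumeration, which is an assumption made to keep the example short; it is only feasible for tiny blocks, and chain-shaped blocks would normally be optimized by dynamic programming. All names and numbers are hypothetical.

```python
from itertools import product


def block_icm(score, y0, blocks, num_labels, max_sweeps=100):
    """Jointly re-optimize one block of variables at a time, the rest held fixed."""
    y = list(y0)
    for _ in range(max_sweeps):
        changed = False
        for block in blocks:
            best, best_score = None, score(y)
            for labels in product(range(num_labels), repeat=len(block)):
                z = list(y)
                for i, label in zip(block, labels):
                    z[i] = label
                if score(z) > best_score:
                    best, best_score = z, score(z)
            if best is not None:
                y, changed = best, True
        if not changed:
            break
    return y


# Two strongly coupled binary variables (assumed toy score): each slightly
# prefers label 1, but agreeing with the other variable is worth more.
def score(y):
    return sum((0.0, 1.0)[label] for label in y) + (3.0 if y[0] == y[1] else 0.0)


y_single = block_icm(score, [0, 0], blocks=[[0], [1]], num_labels=2)  # plain ICM
y_block = block_icm(score, [0, 0], blocks=[[0, 1]], num_labels=2)     # one joint block
print(y_single, score(y_single))
print(y_block, score(y_block))
```

With single-variable blocks the search stays at `[0, 0]`, because changing either variable alone loses the agreement bonus; the joint block moves both variables at once.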
  • 47. G: Optimality. Example: Multilabel Graph-Cut. Binary graph-cuts are not applicable to multilabel energy minimization problems. (Boykov et al., 2001): two local search algorithms for multilabel problems. Sequence of binary directed s-t-mincut problems. Iteratively improve the multilabel solution.
  • 48. G: Optimality. α-β Swap Neighborhood. Select two different labels α and β. Fix all variables $i$ for which $y_i \notin \{\alpha, \beta\}$. Optimize over the remaining $i$ with $y_i \in \{\alpha, \beta\}$. $N_{\alpha,\beta} : \mathcal{Y} \times \mathbb{N} \times \mathbb{N} \to 2^{\mathcal{Y}}$, $N_{\alpha,\beta}(y, \alpha, \beta) := \{z \in \mathcal{Y} : z_i = y_i \text{ if } y_i \notin \{\alpha, \beta\}, \text{ otherwise } z_i \in \{\alpha, \beta\}\}$.
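
To make the set definition concrete, the following sketch enumerates $N_{\alpha,\beta}(y, \alpha, \beta)$ for a tiny labeling. Explicit enumeration is for illustration only (the neighborhood has $2^k$ elements when $k$ variables carry α or β), and the function name is hypothetical.

```python
from itertools import product


def alpha_beta_swap_neighborhood(y, alpha, beta):
    """Yield every z with z_i = y_i where y_i is neither alpha nor beta,
    and z_i in {alpha, beta} for all other i."""
    free = [i for i, label in enumerate(y) if label in (alpha, beta)]
    for labels in product((alpha, beta), repeat=len(free)):
        z = list(y)
        for i, label in zip(free, labels):
            z[i] = label
        yield tuple(z)


y = (0, 1, 2, 1)
neighbors = list(alpha_beta_swap_neighborhood(y, alpha=0, beta=1))
print(len(neighbors))  # three variables are free, so 2**3 labelings
```

The exponential size is the reason the swap move is solved as a binary min-cut problem rather than by enumeration.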
  • 49. G: Optimality. α-β-swap illustrated. [Figure: 5-label problem and its α-β-swap]
  • 54. G: Optimality. α-β-swap derivation. $y^{t+1} = \operatorname{argmin}_{y \in N_{\alpha,\beta}(y^t, \alpha, \beta)} E(y; x)$. Constant: drop out. Unary: combine. Pairwise: binary pairwise.
  • 55. G: Optimality. α-β-swap derivation. $y^{t+1} = \operatorname{argmin}_{y \in N_{\alpha,\beta}(y^t, \alpha, \beta)} \Big[ \sum_{i \in V} E_i(y_i; x) + \sum_{(i,j) \in E} E_{i,j}(y_i, y_j; x) \Big]$. Constant: drop out. Unary: combine. Pairwise: binary pairwise.
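  • 56. G: Optimality. α-β-swap derivation.
    $y^{t+1} = \operatorname{argmin}_{y \in N_{\alpha,\beta}(y^t, \alpha, \beta)} \Big[ \sum_{i \in V,\ y_i^t \notin \{\alpha,\beta\}} E_i(y_i^t; x) + \sum_{i \in V,\ y_i^t \in \{\alpha,\beta\}} E_i(y_i; x)$
    $\quad + \sum_{(i,j) \in E,\ y_i^t \notin \{\alpha,\beta\},\ y_j^t \notin \{\alpha,\beta\}} E_{i,j}(y_i^t, y_j^t; x) + \sum_{(i,j) \in E,\ y_i^t \in \{\alpha,\beta\},\ y_j^t \notin \{\alpha,\beta\}} E_{i,j}(y_i, y_j^t; x)$
    $\quad + \sum_{(i,j) \in E,\ y_i^t \notin \{\alpha,\beta\},\ y_j^t \in \{\alpha,\beta\}} E_{i,j}(y_i^t, y_j; x) + \sum_{(i,j) \in E,\ y_i^t \in \{\alpha,\beta\},\ y_j^t \in \{\alpha,\beta\}} E_{i,j}(y_i, y_j; x) \Big]$.
    Constant: drop out (terms that depend only on $y^t$). Unary: combine (terms that depend on a single free $y_i$). Pairwise: binary pairwise (the last sum, over two free variables).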
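  • 58. G: Optimality. α-β-swap graph construction. Directed graph $G' = (V', E')$, where $V$ and $E$ without primes denote the original model graph:
    $V' = \{\alpha, \beta\} \cup \{i \in V : y_i \in \{\alpha, \beta\}\}$,
    $E' = \{(\alpha, i, t_i^\alpha) : \forall i \in V : y_i \in \{\alpha, \beta\}\} \cup \{(i, \beta, t_i^\beta) : \forall i \in V : y_i \in \{\alpha, \beta\}\} \cup \{(i, j, n_{i,j}) : \forall (i,j), (j,i) \in E : y_i, y_j \in \{\alpha, \beta\}\}$.
    Edge weights $t_i^\alpha$, $t_i^\beta$, and $n_{i,j}$:
    $n_{i,j} = E_{i,j}(\alpha, \beta; x)$,
    $t_i^\alpha = E_i(\alpha; x) + \sum_{(i,j) \in E,\ y_j \notin \{\alpha,\beta\}} E_{i,j}(\alpha, y_j; x)$,
    $t_i^\beta = E_i(\beta; x) + \sum_{(i,j) \in E,\ y_j \notin \{\alpha,\beta\}} E_{i,j}(\beta, y_j; x)$.
    [Figure: s-t graph with terminals α and β, nodes $i, j, \dots, k$, terminal edges weighted $t_i^\alpha, t_j^\alpha, t_k^\alpha$ and $t_i^\beta, t_j^\beta, t_k^\beta$, and neighbor edges weighted $n_{i,j}$]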
  • 60. G: Optimality. α-β-swap move. [Figure: s-t graph with terminals α and β, nodes $i, j, \dots, k$, terminal edges $t^\alpha$, $t^\beta$ and neighbor edges $n_{i,j}$] Side of cut determines $y_i \in \{\alpha, \beta\}$. Iterate all possible $(\alpha, \beta)$ combinations. Semi-metric requirement on pairwise energies: $E_{i,j}(y_i, y_j; x) = 0 \Leftrightarrow y_i = y_j$, and $E_{i,j}(y_i, y_j; x) = E_{i,j}(y_j, y_i; x) \geq 0$.
  • 61. G: Optimality. α-β-swap move. [Figure: the same graph with a cut $C$ separating α from β] Side of cut determines $y_i \in \{\alpha, \beta\}$. Iterate all possible $(\alpha, \beta)$ combinations. Semi-metric requirement on pairwise energies: $E_{i,j}(y_i, y_j; x) = 0 \Leftrightarrow y_i = y_j$, and $E_{i,j}(y_i, y_j; x) = E_{i,j}(y_j, y_i; x) \geq 0$.
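
A single swap move can be sketched end to end as follows. This is a hedged illustration, not the reference implementation of Boykov et al.: the max-flow routine is a plain Edmonds-Karp written for the example, the node names `"s"` and `"t"` stand for the terminals α and β, the per-node shift that keeps terminal weights non-negative is an added assumption, and the toy energies are made up.

```python
from collections import defaultdict, deque


def min_cut_source_side(cap, s, t):
    """Edmonds-Karp max-flow; returns the nodes still reachable from s afterwards."""
    res = defaultdict(lambda: defaultdict(float))
    for (u, v), c in cap.items():
        res[u][v] += c
        res[v][u] += 0.0  # make sure the reverse residual edge exists
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return set(parent)  # no augmenting path left: this is the source side
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        flow = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= flow
            res[v][u] += flow


def energy(y, unary, pairwise, edges):
    return sum(unary[i][l] for i, l in enumerate(y)) + sum(pairwise(y[i], y[j]) for i, j in edges)


def alpha_beta_swap(y, alpha, beta, unary, pairwise, edges):
    """One swap move; pairwise is assumed to be a semi-metric."""
    active = {i for i, l in enumerate(y) if l in (alpha, beta)}
    t_a = {i: unary[i][alpha] for i in active}
    t_b = {i: unary[i][beta] for i in active}
    cap = defaultdict(float)
    for i, j in edges:
        if i in active and j in active:      # n_ij edges in both directions
            cap[(i, j)] += pairwise(alpha, beta)
            cap[(j, i)] += pairwise(alpha, beta)
        elif i in active:                    # fixed neighbor folds into the unaries
            t_a[i] += pairwise(alpha, y[j])
            t_b[i] += pairwise(beta, y[j])
        elif j in active:
            t_a[j] += pairwise(y[i], alpha)
            t_b[j] += pairwise(y[i], beta)
    for i in active:
        shift = min(t_a[i], t_b[i])          # constant per node, keeps weights >= 0
        cap[("s", i)] = t_a[i] - shift
        cap[(i, "t")] = t_b[i] - shift
    source_side = min_cut_source_side(cap, "s", "t")
    # A node left on the source side has its edge to "t" cut and pays t_b: label beta.
    return [(beta if i in source_side else alpha) if i in active else l
            for i, l in enumerate(y)]


# Toy 4-node chain with 3 labels and a Potts pairwise energy (assumed values).
unary = [[0, 3, 5], [4, 1, 5], [5, 5, 0], [3, 0, 5]]
edges = [(0, 1), (1, 2), (2, 3)]


def potts(a, b):
    return 0 if a == b else 2


y0 = [1, 0, 2, 0]
y1 = alpha_beta_swap(y0, 0, 1, unary, potts, edges)
print(y1, energy(y0, unary, potts, edges), energy(y1, unary, potts, edges))
```

In a full algorithm this move would be repeated over all label pairs until no swap lowers the energy; production code would typically call an optimized max-flow library instead of the routine above.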
  • 63. G: Optimality. Example: Stereo Disparity Estimation. Infer depth from two images. Discretized multi-label problem. α-expansion solution close to optimal.
  • 65. G: Optimality. Model Reduction. Energy minimization problem: many decisions to make jointly. Model reduction: 1. Fix a subset of decisions. 2. Optimize the smaller remaining model. Example: forcing $y_i = y_j$ for pairs $(i, j)$.
  • 66. G: Optimality. Example: Superpixels in Labeling Problems. [Figure: input image, 500-by-375 pixels (187,500 decisions)]
  • 67. G: Optimality. Example: Superpixels in Labeling Problems. [Figure: image with 149 superpixels (149 decisions)]
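
Forcing $y_i = y_j$ within each superpixel turns a per-pixel model into a much smaller per-group model. The sketch below shows one way this reduction could look for a Potts energy; the function names, the 6-pixel chain, and the grouping are assumptions for illustration, and the reduced model is solved by brute force only because it is tiny.

```python
from itertools import product


def potts_energy(labels, unary, edges):
    """Unary costs plus weight w for every edge (i, j, w) whose endpoints disagree."""
    total = sum(unary[i][label] for i, label in enumerate(labels))
    total += sum(w for i, j, w in edges if labels[i] != labels[j])
    return total


def reduce_potts_model(unary, edges, group_of):
    """Merge variables with the same group id into a single decision."""
    num_groups = max(group_of) + 1
    num_labels = len(unary[0])
    red_unary = [[0.0] * num_labels for _ in range(num_groups)]
    for i, costs in enumerate(unary):
        for label, cost in enumerate(costs):
            red_unary[group_of[i]][label] += cost   # unaries add up within a group
    merged = {}
    for i, j, w in edges:
        a, b = sorted((group_of[i], group_of[j]))
        if a != b:                                  # edges inside a group cost 0
            merged[(a, b)] = merged.get((a, b), 0.0) + w
    red_edges = [(a, b, w) for (a, b), w in merged.items()]
    return red_unary, red_edges


# Toy model: 6 pixels on a chain, 2 labels, grouped into 2 "superpixels".
unary = [[0, 2], [1, 1], [0, 3], [2, 0], [3, 1], [1, 0]]
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 4, 1.0), (4, 5, 1.0)]
group_of = [0, 0, 0, 1, 1, 1]

red_unary, red_edges = reduce_potts_model(unary, edges, group_of)
z = min(product(range(2), repeat=len(red_unary)),
        key=lambda labels: potts_energy(labels, red_unary, red_edges))
y = [z[g] for g in group_of]  # expand the group labels back to pixels
print(z, y, potts_energy(y, unary, edges))
```

The expanded labeling has the same energy under the full model as the group labeling has under the reduced model, but it is only optimal among labelings that respect the grouping, which is the price of the reduction.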