Course Program
9.30-10.00 Introduction (Andrew Blake)
10.00-11.00 Discrete Models in Computer Vision (Carsten Rother)
15min Coffee break
11.15-12.30 Message Passing: DP, TRW, LP relaxation (Pawan Kumar)
12.30-13.00 Quadratic pseudo-boolean optimization (Pushmeet Kohli)
1 hour Lunch break
14:00-15.00 Transformation and move-making methods (Pushmeet Kohli)
15:00-15.30 Speed and Efficiency (Pushmeet Kohli)
15min Coffee break
15:45-16.15 Comparison of Methods (Carsten Rother)
16:15-17.30 Recent Advances: Dual-decomposition, higher-order, etc.
             (Carsten Rother + Pawan Kumar)

   All course material will be available online (after the conference):
   http://research.microsoft.com/en-us/um/cambridge/projects/tutorial/
Discrete Models in Computer Vision

              Carsten Rother
       Microsoft Research Cambridge
Overview

• Introduce factor graph notation
• Categorization of models in Computer Vision:
  – 4-connected MRFs
  – Highly-connected MRFs
  – Higher-order MRFs
Markov Random Field Models for Computer Vision

Model:
   discrete or continuous variables?
   discrete or continuous space?
   dependence between variables?
   …

Inference:
   Graph Cut (GC)
   Belief Propagation (BP)
   Tree-Reweighted Message Passing (TRW)
   Iterated Conditional Modes (ICM)
   Cutting-plane
   Dual decomposition
   …

Applications:
   2D/3D image segmentation
   Object recognition
   3D reconstruction
   Stereo matching
   Image denoising
   Texture synthesis
   Pose estimation
   Panoramic stitching
   …

Learning:
   Exhaustive search (grid search)
   Pseudo-likelihood approximation
   Training in pieces
   Max-margin
   …
Recap: Image Segmentation


      Input z

      Maximum-a-posteriori (MAP):  x* = argmax_x P(x|z) = argmin_x E(x)


P(x|z) ~ P(z|x) P(x)               Posterior; Likelihood; Prior

P(x|z) ~ exp{-E(x)}                Gibbs distribution

E: {0,1}^n → R
E(x) = ∑_i θi (xi) + w ∑_(i,j)∈N4 θij (xi,xj)            Energy
       unary terms        pairwise terms
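To make the energy concrete, here is a minimal sketch (not from the tutorial) that simply evaluates E(x) = ∑_i θi (xi) + w ∑_(i,j)∈N4 θij (xi,xj) for a given binary labelling, assuming precomputed unary costs and the Ising pairwise term |xi-xj|; it evaluates the energy, it does not minimize it.

```python
import numpy as np

def segmentation_energy(x, unary, w=1.0):
    """x: (H, W) array of {0,1} labels; unary: (H, W, 2) costs theta_i(x_i = 0/1)."""
    H, W = x.shape
    e = unary[np.arange(H)[:, None], np.arange(W)[None, :], x].sum()  # unary terms
    # pairwise Ising term |x_i - x_j| over horizontal and vertical 4-neighbour edges
    e += w * np.abs(x[:, 1:] - x[:, :-1]).sum()
    e += w * np.abs(x[1:, :] - x[:-1, :]).sum()
    return e

rng = np.random.default_rng(0)
print(segmentation_energy(rng.integers(0, 2, (4, 4)), rng.random((4, 4, 2)), w=2.0))
```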
Min-Marginals
       (uncertainty of MAP-solution)

Definition: ψ_(v;i) = min_(x: xv=i) E(x)




  (figures: image; MAP solution; min-marginals of the foreground label, bright = very certain)
  Can be used in several ways:
  • Insights on the model
  • For optimization (TRW, comes later)
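A brute-force sketch (illustrative only, exponential in the number of variables) of the min-marginals ψ_(v;i) on a tiny chain model; TRW computes these quantities efficiently, as noted above.

```python
import itertools
import numpy as np

def min_marginals(n_vars, n_labels, energy):
    """psi[v, i] = min over all labelings x with x_v = i of energy(x)."""
    psi = np.full((n_vars, n_labels), np.inf)
    for x in itertools.product(range(n_labels), repeat=n_vars):
        e = energy(x)
        for v, i in enumerate(x):
            psi[v, i] = min(psi[v, i], e)
    return psi

# toy 5-variable chain: random unaries plus a Potts pairwise term
rng = np.random.default_rng(0)
unary = rng.random((5, 3))
energy = lambda x: sum(unary[v, xv] for v, xv in enumerate(x)) + \
                   sum(xa != xb for xa, xb in zip(x, x[1:]))
print(min_marginals(5, 3, energy))
```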
Introducing Factor Graphs
Write probability distributions as Graphical models:

       - Directed graphical model
       - Undirected graphical model (… what Andrew Blake used)
       - Factor graphs

References:
       - Pattern Recognition and Machine Learning [Bishop ‘08, book, chapter 8]
       - several lectures at the Machine Learning Summer School 2009
         (see video lectures)
Factor Graphs
P(x) ~ θ(x1,x2,x3) θ(x2,x4) θ(x3,x4) θ(x3,x5)          “4 factors”

P(x) ~ exp{-E(x)}                                       Gibbs distribution
E(x) = θ(x1,x2,x3) + θ(x2,x4) + θ(x3,x4) + θ(x3,x5)


     x1           x2          circles: unobserved/latent/hidden variables;
                              squares: factors, connecting the variables
                              that are in the same factor
     x3           x4



      x5

      Factor graph
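As a data structure, a factor graph is just a list of factors, each holding the variables it touches and a table of values. A small sketch (illustrative, with random tables) for the graph above:

```python
import numpy as np

rng = np.random.default_rng(0)
# one third-order factor and three pairwise factors over binary variables, as in the figure
factors = [
    (("x1", "x2", "x3"), rng.random((2, 2, 2))),
    (("x2", "x4"),       rng.random((2, 2))),
    (("x3", "x4"),       rng.random((2, 2))),
    (("x3", "x5"),       rng.random((2, 2))),
]

def energy(assignment):
    """E(x) = sum of the factor tables evaluated at the joint assignment."""
    return sum(table[tuple(assignment[v] for v in vars_)] for vars_, table in factors)

x = {"x1": 0, "x2": 1, "x3": 0, "x4": 1, "x5": 0}
print("E(x) =", energy(x), "  P(x) ~", np.exp(-energy(x)))
```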
Definition “Order”
Definition “Order”:
The arity (number of variables) of
the largest factor                                    x1           x2


      P(X) ~ θ(x1,x2,x3) θ(x2,x4) θ(x3,x4) θ(x3,x5)

                 arity 3             arity 2          x3           x4



                                                      x5

                                                           Factor graph
                                                           with order 3
Examples - Order


4-connected; pairwise MRF:
  E(x) = ∑_(i,j)∈N4 θij (xi,xj)                           Order 2, “pairwise energy”

Higher(8)-connected; pairwise MRF:
  E(x) = ∑_(i,j)∈N8 θij (xi,xj)                           Order 2, “pairwise energy”

Higher-order MRF:
  E(x) = ∑_(i,j)∈N4 θij (xi,xj) + θ(x1,…,xn)              Order n, “higher-order energy”
Example: Image segmentation
      P(x|z) ~ exp{-E(x)}
      E(x) = ∑_i θi (xi,zi) + ∑_(i,j)∈N4 θij (xi,xj)



                                       zi : observed variable

                                       xi, xj : unobserved (latent) variables




 Factor graph
Simplest inference technique:
     ICM (iterated conditional modes)
                        Goal: x* = argmin_x E(x)
        x2
                        E(x) = θ12 (x1,x2) + θ13 (x1,x3) +
                               θ14 (x1,x4) + θ15 (x1,x5) + …

x3      x1    x4



        x5
Simplest inference technique:
         ICM (iterated conditional modes)
                  (shaded nodes mean observed)
                                              Goal: x* = argmin_x E(x)
                x2
                                              E(x) = θ12 (x1,x2) + θ13 (x1,x3) +
                                                     θ14 (x1,x4) + θ15 (x1,x5) + …

    x3          x1          x4



                x5

Fix all variables except x1 and set x1 to the label that minimizes this local energy; repeat.


Simulated Annealing: accept a move even if the       (figures: ICM result vs. global minimum)
energy increases (with a certain probability).       ICM can get stuck in local minima!
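A minimal ICM sketch (illustrative, not the tutorial's code), assuming a grid model with precomputed unary costs and a Potts pairwise term: sweep over the pixels and, holding all other labels fixed, set each pixel to the label with the lowest local energy.

```python
import numpy as np

def icm(unary, w=1.0, n_sweeps=10):
    """unary: (H, W, L) label costs; Potts pairwise term over the 4-neighbourhood."""
    H, W, L = unary.shape
    x = unary.argmin(axis=2)                 # start from the unary-only (WTA) labelling
    for _ in range(n_sweeps):
        for i in range(H):
            for j in range(W):
                costs = unary[i, j].copy()
                for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < H and 0 <= nj < W:
                        costs += w * (np.arange(L) != x[ni, nj])  # Potts cost to neighbour
                x[i, j] = costs.argmin()     # conditional mode given all other labels
    return x

rng = np.random.default_rng(0)
print(icm(rng.random((8, 8, 2)), w=0.5))
```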
Overview
• Introduce factor graph notation
• Categorization of models in Computer Vision:

  – 4-connected MRFs

  – Highly-connected MRFs

  – Higher-order MRFs
Stereo matching

 Image – left (a)      Image – right (b)      Ground truth depth (disparities d=0 … d=4)


• Images rectified
• Ignore occlusion for now

Energy:
    E(d): {0,…,D-1}^n → R
    Labels: d (depth/shift), one disparity di per pixel
Stereo matching - Energy
Energy:
    E(d): {0,…,D-1}^n → R
    E(d) = ∑_i θi (di) + ∑_(i,j)∈N4 θij (di,dj)

Unary:
    θi (di) = |li - r(i-di)|
    “SAD; sum of absolute differences”
    (many other matching costs possible, e.g. NCC, …)

    Pixel i in the left image is matched to pixel i-di in the right image
    (e.g. i-2 for di = 2).

Pairwise:
    θij (di,dj) = g(|di-dj|)
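A small sketch (illustrative choices, not the tutorial's code) of the two terms: a per-pixel absolute-difference matching cost as the unary, and a truncated-linear g as one example of the discontinuity-preserving priors discussed on the next slides.

```python
import numpy as np

def stereo_unary(left, right, D):
    """left, right: (H, W) grayscale images; costs[y, x, d] = |l(y, x) - r(y, x - d)|."""
    H, W = left.shape
    costs = np.full((H, W, D), np.inf)       # disparities pointing outside the image stay inf
    for d in range(D):
        costs[:, d:, d] = np.abs(left[:, d:] - right[:, :W - d])
    return costs

def pairwise(di, dj, tau=2.0):
    """Truncated-linear g(|di - dj|): linear up to tau, then constant."""
    return min(abs(di - dj), tau)

rng = np.random.default_rng(0)
left, right = rng.random((10, 20)), rng.random((10, 20))
print(stereo_unary(left, right, D=5).shape, pairwise(1, 7))
```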
Stereo matching - prior

                            θij (di,dj) = g(|di-dj|)


                           (plot: cost as a function of |di-dj|)


No truncation
(global min.)


                                      [Olga Veksler PhD thesis,
                                       Daniel Cremers et al.]
Stereo matching - prior

                                            θij (di,dj) = g(|di-dj|)


                                           (plot: cost as a function of |di-dj|;
                                            discontinuity-preserving potentials
                                            [Blake & Zisserman ’83, ’87])


No truncation      with truncation
(global min.)   (NP-hard optimization)


                                                      [Olga Veksler PhD thesis,
                                                       Daniel Cremers et al.]
Stereo matching
    see http://vision.middlebury.edu/stereo/




         No MRF                    No horizontal links
Pixel independent (WTA)   Efficient since independent chains




     Pairwise MRF                   Ground truth
   [Boykov et al. ‘01]
Texture synthesis


    Input  →  Output

    Good case: neighbouring pixels i, j at the seam come from regions where
    patches a and b agree.   Bad case: they come from regions where a and b disagree.

    (labels: 0 = copy from patch a, 1 = copy from patch b)

    E: {0,1}^n → R
    E(x) = ∑_(i,j)∈N4 |xi-xj| [ |ai-bi| + |aj-bj| ]

                                         [Kwatra et al., SIGGRAPH ‘03]
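A sketch (illustrative) of the seam cost above: for two overlapping patches a and b, a label change between neighbouring pixels i and j costs |ai-bi| + |aj-bj|, so cheap seams run where the two sources already agree.

```python
import numpy as np

def seam_cost(x, a, b):
    """x: (H, W) labels (0 = copy pixel from a, 1 = from b); a, b: (H, W) overlap region."""
    mismatch = np.abs(a - b)                                          # |a_i - b_i| per pixel
    cost  = (np.abs(x[:, 1:] - x[:, :-1]) * (mismatch[:, 1:] + mismatch[:, :-1])).sum()
    cost += (np.abs(x[1:, :] - x[:-1, :]) * (mismatch[1:, :] + mismatch[:-1, :])).sum()
    return cost

rng = np.random.default_rng(0)
a, b = rng.random((6, 6)), rng.random((6, 6))
x = np.tile((np.arange(6) >= 3).astype(int), (6, 1))                  # a straight vertical seam
print(seam_cost(x, a, b))
```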
Video Synthesis




Input           Output




                Video (duplicated)


        Video
Panoramic stitching
Panoramic stitching
AutoCollage




http://research.microsoft.com/en-us/um/cambridge/projects/autocollage/   [Rother et al., SIGGRAPH ‘05]
Recap: 4-connected MRFs

• A lot of useful vision systems are based on
  4-connected pairwise MRFs.

• Possible Reason (see Inference part):
  a lot of fast and good (globally optimal)
  inference methods exist
Overview
• Introduce factor graph notation
• Categorization of models in Computer Vision:

  – 4-connected MRFs

  – Highly-connected MRFs

  – Higher-order MRFs
Why larger connectivity?
We have seen…
• “Knock-on” effect (each pixel influences each other pixel)
• Many good systems

What is missing:
1. Modelling real-world texture (images)
2. Reduce discretization artefacts
3. Encode complex prior knowledge
4. Use non-local parameters
Reason 1: Texture modelling



Training images   Test image     Test image (60% Noise)




  Result MRF      Result MRF         Result MRF
 4-connected      4-connected        9-connected
 (neighbours)                   (7 attractive; 2 repulsive)
Reason 1: Texture Modelling
 input         output




                         [Zalesny et al. ‘01]
Reason 2: Discretization artefacts

(figure: a curve on the pixel grid; edge lengths 1)

                         Length of the paths:
                                   Eucl.     4-con.     8-con.
                         Path 1     5.65      6.28       5.08
                         Path 2     8         6.28       6.75

                 Larger connectivity can model the true Euclidean
                 length (other metrics are possible too)




                                               [Boykov et al. ‘03, ‘05]
Reason2: Discretization artefacts




   4-connected             8-connected           8-connected
    Euclidean               Euclidean              geodesic

                 higher-connectivity can model
                     true Euclidean length       [Boykov et al. ‘03, ‘05]
3D reconstruction




               [Slide credits: Daniel Cremers]
Reason 3: Encode complex prior knowledge:
                      Stereo with occlusion




        E(d): {1,…,D}^2n → R
        Each pixel is connected to D pixels in the other image.

        θlr (dl,dr): pairwise term between the disparity dl of a pixel in the left view
        and the disparity dr of the corresponding pixel in the right view

        (figure: D x D cost table over dl, dr with a “match” region; example rays with
         d=1 (∞ cost), d=10 (match), d=20 (0 cost) in the left and right views)
Stereo with occlusion




Ground truth   Stereo with occlusion    Stereo without occlusion
               [Kolmogorov et al. ‘02]      [Boykov et al. ‘01]
Reason 4: Use Non-local parameters:
   Interactive Segmentation (GrabCut)




                    [Boykov and Jolly ’01]




                  GrabCut [Rother et al. ’04]
A meeting with the Queen
Reason 4: Use Non-local parameters:
         Interactive Segmentation (GrabCut)


                                                    w
 Model jointly segmentation and color model:

 E(x,w): {0,1}^n x {GMMs} → R
 E(x,w) = ∑_i θi (xi,w) + ∑_(i,j)∈N4 θij (xi,xj)


An object is a compact set of colors
(figure: foreground/background color distributions, modelled by GMMs)




                                               [Rother et al., SIGGRAPH ’04]
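A runnable sketch of the alternation behind this joint model, with loudly stated simplifications: sklearn's GaussianMixture stands in for the color models w, and the pairwise term / graph-cut step is replaced by a per-pixel decision, so this is not the GrabCut implementation, only the "segment, refit color models, repeat" idea.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def grabcut_like(pixels, x, n_components=3, n_iters=5):
    """pixels: (N, 3) RGB values; x: (N,) initial binary labels (1 = foreground)."""
    for _ in range(n_iters):
        # w-step: refit the two color models to the current segmentation
        gmms = [GaussianMixture(n_components=n_components, random_state=0).fit(pixels[x == k])
                for k in (0, 1)]
        # x-step: -log P(z_i | x_i = k) as unary costs (pairwise term / graph cut omitted)
        unaries = np.stack([-g.score_samples(pixels) for g in gmms], axis=1)
        x = unaries.argmin(axis=1)
    return x

rng = np.random.default_rng(0)
pixels = np.vstack([rng.normal(0.7, 0.05, (500, 3)), rng.normal(0.3, 0.05, (500, 3))])
x0 = (pixels[:, 0] > 0.5).astype(int)                    # crude initial segmentation
print(np.bincount(grabcut_like(pixels, x0)))
```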
Reason 4: Use Non-local parameters:
                 Segmentation and Recognition
Goal: segment the test image.


Large set of example segmentations:




              T(1)                 T(2)                  T(3)
                         Up to 2,000,000 exemplars

               E(x,w): {0,1}^n x {Exemplar} → R
               E(x,w) = ∑_i |T(w)i - xi| + ∑_(i,j)∈N4 θij (xi,xj)
                        “Hamming distance”

                                                               [Lempitsky et al., ECCV ’08]
Reason 4: Use Non-local parameters:
     Segmentation and Recognition




             UIUC dataset; 98.8% accuracy

                                           [Lempitsky et al., ECCV ’08]
Overview
• Introduce factor graph notation
• Categorization of models in Computer Vision:

  – 4-connected MRFs

  – Highly-connected MRFs

  – Higher-order MRFs
Why Higher-order Functions?
In general θ(x1,x2,x3) ≠ θ(x1,x2) + θ(x1,x3) + θ(x2,x3)

Reasons for higher-order MRFs:

1. Even better image (texture) models:
   –   Field of Experts [FoE, Roth et al. ‘05]
   –   Curvature [Woodford et al. ‘08]

2. Use global priors:
   –   Connectivity [Vicente et al. ‘08, Nowozin et al. ‘09]
   –   Encode better training statistics [Woodford et al. ‘09]
Reason 1: Better Texture Modelling


                            Higher Order Structure
                                 not Preserved

Training images




 Test Image      Test Image (60% Noise)      Result: pairwise MRF (9-connected)      Result: higher-order MRF

                                                               [Rother et al., CVPR ‘09]
Reason 2: Use global Prior
Foreground object must be connected:




          User input    Standard MRF:          with connectivity
                        Removes noise (+)
                        Shrinks boundary (-)


  E(x) = P(x) + h(x)   with h(x)= {
                                  ∞ if not 4-connected
                                  0 otherwise
                                                            [Vicente et al. ’08,
                                                             Nowozin et al. ‘09]
Reason 2: Use global Prior
    What is the prior of a MAP-MRF solution?

    Training image:                                              60% black, 40% white

    MAP:   prior(x) = 0.6^8 ≈ 0.017           Others less likely:   prior(x) = 0.6^5 · 0.4^3 ≈ 0.005

               The MRF is a bad prior since the input marginal statistics are ignored!

               Introduce a global term which controls the global statistics:




               Noisy input



Ground truth          Pairwise MRF (increased prior strength)          Global gradient prior

                                                                         [Woodford et al., ICCV ‘09]
                                                                         (see poster on Friday)
Summary
• Introduce factor graph notation
• Categorization of models in Computer Vision:

  – 4-connected MRFs

  – Highly-connected MRFs

  – Higher-order MRFs


  …. all useful models,
     but how do I optimize them?
Course Program
9.30-10.00 Introduction (Andrew Blake)
10.00-11.00 Discrete Models in Computer Vision (Carsten Rother)
15min Coffee break
11.15-12.30 Message Passing: DP, TRW, LP relaxation (Pawan Kumar)
12.30-13.00 Quadratic pseudo-boolean optimization (Pushmeet Kohli)
1 hour Lunch break
14:00-15.00 Transformation and move-making methods (Pushmeet Kohli)
15:00-15.30 Speed and Efficiency (Pushmeet Kohli)
15min Coffee break
15:45-16.15 Comparison of Methods (Carsten Rother)
16:15-17.30 Recent Advances: Dual-decomposition, higher-order, etc.
             (Carsten Rother + Pawan Kumar)

   All course material will be available online (after the conference):
   http://research.microsoft.com/en-us/um/cambridge/projects/tutorial/
END
unused slides …
Markov Property


                                  xi




• Markov Property: Each variable is only connected to a few others,
                     i.e. many pixels are conditionally independent
• This makes inference easier (possible at all)
• But still… every pixel can influence any other pixel (knock-on effect)
Recap: Factor Graphs


• Factor graphs: a very good representation since it
                 directly reflects the given energy

• MRF (Markov property) means many pixels are
                        conditionally independent

• Still … all pixels influence each other (knock-on effect)
Interactive Segmentation - Tutorial example


                              Goal



   z = (R,G,B)^n                                 x = {0,1}^n

     Given z and unknown (latent) variables x:

     P(x|z) = P(z|x) P(x) / P(z)  ~  P(z|x) P(x)
     Posterior probability = Likelihood (data-dependent) × Prior (data-independent)


    Maximum a Posteriori (MAP): x* = argmax_x P(x|z)
Likelihood   P(x|z) ~ P(z|x) P(x)



(figure: foreground and background color distributions plotted over the red and green channels)
Likelihood             P(x|z) ~ P(z|x) P(x)




   Log P(zi|xi=0)                Log P(zi|xi=1)


Maximum likelihood:
x* = argmax_x P(z|x) = argmax_x ∏_i P(zi|xi)
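Because the likelihood factorizes over pixels, the maximum-likelihood labelling is a per-pixel decision. A tiny sketch, assuming the two log-likelihood images are already available (here they are just random placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
log_p_bg = rng.standard_normal((5, 5))    # log P(z_i | x_i = 0), e.g. from a background color model
log_p_fg = rng.standard_normal((5, 5))    # log P(z_i | x_i = 1), e.g. from a foreground color model
x_ml = (log_p_fg > log_p_bg).astype(int)  # per-pixel maximum-likelihood label
print(x_ml)
```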
Prior   P(x|z) ~ P(z|x) P(x)




                                 xi      xj


P(x) = 1/f ∏_(i,j)∈N θij (xi,xj)

f = ∑_x ∏_(i,j)∈N θij (xi,xj)          “partition function”

θij (xi,xj) = exp{-|xi-xj|}            “Ising prior”

    (exp{-1}=0.36; exp{0}=1)
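A brute-force sketch of this Ising prior on a tiny 4-connected grid: the partition function f is obtained by enumerating all 2^(H·W) labellings, which is only feasible at toy sizes (which is exactly why f is usually left unevaluated).

```python
import itertools
import numpy as np

def ising_weight(x):
    """Unnormalized prior: product over 4-neighbour edges of exp{-|x_i - x_j|}."""
    return np.exp(-(np.abs(np.diff(x, axis=0)).sum() + np.abs(np.diff(x, axis=1)).sum()))

H, W = 3, 3
labelings = [np.array(s).reshape(H, W) for s in itertools.product((0, 1), repeat=H * W)]
f = sum(ising_weight(x) for x in labelings)                        # partition function
print("f =", f, "  P(all zeros) =", ising_weight(np.zeros((H, W), dtype=int)) / f)
```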
Posterior distribution
P(x|z) ~ P(z|x) P(x)

  Posterior “Gibbs” distribution:

  θi (xi,zi) = -log P(zi|xi=1) xi - log P(zi|xi=0) (1-xi)      Likelihood
  θij (xi,xj) = |xi-xj|                                        Prior


  P(x|z) = 1/f(z,w) exp{-E(x,z,w)}
  f(z,w) = ∑_x exp{-E(x,z,w)}
  E(x,z,w) = ∑_i θi (xi,zi) + w ∑_(i,j) θij (xi,xj)            Energy
             unary terms       pairwise terms
 Note: the likelihood can be an arbitrary function of the data.
Energy minimization
 P(x|z) = 1/f(z,w) exp{-E(x,z,w)}
-log P(x|z) = -log (1/f(z,w)) + E(x,z,w)
 f(z,w) = ∑_x exp{-E(x,z,w)}

 x* = argmin_x E(x,z,w)

   MAP is the same as the minimum-energy solution




                     (figures: MAP / global min E result vs. ML result)
Weight prior and likelihood



     w =0                        w =10




    w =40                       w =200

E(x,z,w) =   ∑_i θi (xi,zi)   + w ∑_(i,j) θij (xi,xj)
Moving away from a pure prior …


                      (plots: Ising cost vs. contrast-sensitive cost as a function of ||zi-zj||^2)

E(x,z,w) =   ∑_i θi (xi,zi)   + w ∑_(i,j) θij (xi,xj,zi,zj)

θij (xi,xj,zi,zj) = |xi-xj| exp{-ß||zi-zj||^2}


          ß = 2 (Mean(||zi-zj||^2))^-1

          “Going from a Markov Random Field to a
          Conditional Random Field”
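A small sketch of the contrast-sensitive weights above (illustrative code): each 4-neighbour edge gets the factor exp{-ß||zi-zj||^2} that multiplies the Ising cost |xi-xj|, with ß set from the mean squared color difference as on the slide.

```python
import numpy as np

def contrast_weights(image):
    """image: (H, W, 3) float RGB; returns the horizontal and vertical edge weights
    exp{-beta ||z_i - z_j||^2} that modulate the Ising cost |x_i - x_j|."""
    dh = ((image[:, 1:] - image[:, :-1]) ** 2).sum(axis=2)
    dv = ((image[1:, :] - image[:-1, :]) ** 2).sum(axis=2)
    beta = 2.0 / np.mean(np.concatenate([dh.ravel(), dv.ravel()]))   # beta = 2 (Mean ||z_i-z_j||^2)^-1
    return np.exp(-beta * dh), np.exp(-beta * dv)

rng = np.random.default_rng(0)
wh, wv = contrast_weights(rng.random((6, 8, 3)))
print(wh.shape, wv.shape)
```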
Tree vs Loopy graphs
                             Markov blanket of xi:
                             all variables which are
                             in the same factor as xi        [Felzenszwalb, Huttenlocher ‘01]


  (figures: loopy 4-connected grid around xi; tree-structured model with a root; chain)


Loopy graphs:                              Trees and chains:
 - MAP (in general) NP hard                 • MAP is tractable
   (see inference part)                     • Marginals, e.g. P(foot), tractable
 - Marginals P(xi) also NP hard
Stereo matching - prior
                                       θij (di,dj) = g(|di-dj|)



                                     (plot: Potts-model cost as a function of |di-dj|)
 Left image


             Result with the Potts model: smooth disparities


                                               [Olga Veksler PhD thesis]
Modelling texture [Zalesny et al. ‘01]
                                    “Unary only”




                                         “8 connected
                                         MRF”



input
                                           “13 connected
                                           MRF”
Reason 2: Discretization artefacts
θij (xi,xj) = ∆a / (2·dist(xi,xj)) · |xi-xj|,      ∆a = π/4
(figure: axis edges have length 1, diagonal edges length √2)

                                               Length of the path:
                                                         4-con.    8-con.    true Eucl.
                                               Path 1     6.28      5.08       5.65
                                               Path 2     6.28      6.75        8




       Larger connectivity can model true Euclidean length
       (also any Riemannian metric, e.g. geodesic length, can be
       modelled)
                                                                    [Boykov et al. ‘03, ‘05]
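A tiny sketch of these edge weights (assuming, as the figure suggests, an 8-neighbourhood with angular spacing ∆a = π/4 and edge lengths 1 and √2):

```python
import math

def edge_weight(offset, delta_a=math.pi / 4):
    """Weight delta_a / (2 * dist(i, j)) for the neighbour at the given grid offset."""
    return delta_a / (2.0 * math.hypot(*offset))

for off in [(0, 1), (1, 0), (1, 1), (1, -1)]:
    print(off, round(edge_weight(off), 4))
```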
References Higher-order Functions?
• In general θ(x1,x2,x3) ≠ θ(x1,x2) + θ(x1,x3) + θ(x2,x3)

   Field of Experts Model (2x2; 5x5)
   [Roth, Black, CVPR ‘05]
   [Potetz, CVPR ‘07]

   Minimize Curvature (3x1)
   [Woodford et al., CVPR ‘08]

   Large Neighbourhood (10x10 -> whole image)
   [Rother, Kolmogorov, Minka & Blake, CVPR ‘06]
   [Vicente, Kolmogorov, Rother, CVPR ‘08]
   [Komodakis, Paragios, CVPR ‘09]
   [Rother, Kohli, Feng, Jia, CVPR ‘09]
   [Woodford, Rother, Kolmogorov, ICCV ‘09]
   [Vicente, Kolmogorov, Rother, ICCV ‘09]
   [Ishikawa, CVPR ‘09]
   [Ishikawa, ICCV ‘09]
Conditional Random Field (CRF)
E(x) = ∑_i θi (xi,zi) + ∑_(i,j)∈N4 θij (xi,xj,zi,zj)

       with θij (xi,xj,zi,zj) = |xi-xj| exp{-ß||zi-zj||^2}


(factor graph with observed zi, zj and latent xi, xj;
 plots of the Ising cost and the contrast-sensitive cost as a function of ||zi-zj||^2)

 Factor graph
                      Definition CRF: all factors may depend on the data z.
                      No problem for inference (but it does matter for parameter learning).