SlideShare una empresa de Scribd logo
1 de 64
Computer vision:
models, learning and inference
                Chapter 8
                Regression



       Please send errata to s.prince@cs.ucl.ac.uk
Structure
•   Linear regression
•   Bayesian solution
•   Non-linear regression
•   Kernelization and Gaussian processes
•   Sparse linear regression
•   Dual linear regression
•   Relevance vector regression
•   Applications

          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   2
Models for machine vision




  Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   3
Body Pose Regression




Encode silhouette as 100x1 vector, encode body pose as 55 x1
                 vector. Learn relationship

  Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   4
Type 1: Model Pr(w|x) -
                Discriminative
How to model Pr(w|x)?
  – Choose an appropriate form for Pr(w)
  – Make parameters a function of x
  – Function takes parameters q that define its shape

Learning algorithm: learn parameters q from training data x,w
Inference algorithm: just evaluate Pr(w|x)




            Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   5
Linear Regression
•   For simplicity we will assume that each dimension of
    world is predicted separately.
•   Concentrate on predicting a univariate world state w.


Choose normal distribution over world w



Make
  • Mean a linear function of data x
  • Variance constant


           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   6
Linear Regression




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   7
Neater Notation


To make notation easier to handle, we
   • Attach a 1 to the start of every data vector


   • Attach the offset to the start of the gradient vector f



New model:


          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   8
Combining Equations
We have on equation for each x,w pair:




The likelihood of the whole dataset is be the product of these
individual distributions and can be written as




where


          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   9
Learning
Maximum likelihood




Substituting in



Take derivative, set result to zero and re-arrange:




          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   10
Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   11
Regression Models




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   12
Structure
•   Linear regression
•   Bayesian solution
•   Non-linear regression
•   Kernelization and Gaussian processes
•   Sparse linear regression
•   Dual linear regression
•   Relevance vector regression
•   Applications

          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   13
Bayesian Regression
(We concentrate on f – come back to s2 later!)

Likelihood



Prior



Bayes rule’



         Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   14
Posterior Dist. over Parameters

  where




     Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   15
Inference




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   16
Fitting Variance
• We’ll fit the variance with maximum likelihood
• Optimize the marginal likelihood (likelihood
  after gradients have been integrated out)




         Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   17
Practical Issue
Problem: In high dimensions, the matrix A may be too big to invert



Solution: Re-express using Matrix Inversion Lemma



Final expression: inverses are (I x I) , not (D x D)




             Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   18
Structure
•   Linear regression
•   Bayesian solution
•   Non-linear regression
•   Kernelization and Gaussian processes
•   Sparse linear regression
•   Dual linear regression
•   Relevance vector regression
•   Applications

          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   19
Regression Models




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   20
Non-Linear Regression
GOAL:

Keep the maths of linear regression, but extend to
more general functions

KEY IDEA:

You can make a non-linear function from a linear
weighted sum of non-linear basis functions




        Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   21
Non-linear regression
Linear regression:


Non-Linear regression:


where
In other words create z by evaluating x against basis
functions, then linearly regress against z.

          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   22
Example: polynomial regression


A special case of


Where




        Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   23
Radial basis functions




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   24
Arc Tan Functions




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   25
Non-linear regression
Linear regression:


Non-Linear regression:


where
In other words create z by evaluating x against basis
functions, then linearly regress against z.

          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   26
Non-linear regression
Linear regression:


Non-Linear regression:


where

In other words create z by evaluating against basis
functions,Computer vision: models, learning and inference. ©2011 Simon J.D. Prince
           then linearly regress against z.                                          27
Maximum Likelihood

Same as linear regression, but substitute in Z for X:




          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   28
Structure
•   Linear regression
•   Bayesian solution
•   Non-linear regression
•   Kernelization and Gaussian processes
•   Sparse linear regression
•   Dual linear regression
•   Relevance vector regression
•   Applications

          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   29
Regression Models




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   30
Bayesian Approach
Learn s2 from marginal likelihood as before

Final predictive distribution:




              Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   31
Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   32
The Kernel Trick
Notice that the final equation doesn’t need the
 data itself, but just dot products between data
 items of the form ziTzj




So, we take data xi and xj pass through non-linear function to create
zi and zj and then take dot products of different ziTzj
           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   33
The Kernel Trick
So, we take data xi and xj pass through non-linear function to
create zi and zj and then take dot products of different ziTzj

Key idea:

Define a “kernel” function that does all of this together.
   • Takes data xi and xj
   • Returns a value for dot product ziTzj

If we choose this function carefully, then it will corresponds to some
    underlying z=f[x].
Never compute z explicitly - can be very high or infinite dimension
            Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   34
Gaussian Process Regression
Before




After




            Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   35
Example Kernels




(Equivalent to having an infinite number of radial basis functions at
every position in space. Wow!)
            Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   36
RBF Kernel Fits




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   37
Fitting Variance
• We’ll fit the variance with maximum likelihood
• Optimize the marginal likelihood (likelihood after
  gradients have been integrated out)




• Have to use non-linear optimization


          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   38
Structure
•   Linear regression
•   Bayesian solution
•   Non-linear regression
•   Kernelization and Gaussian processes
•   Sparse linear regression
•   Dual linear regression
•   Relevance vector regression
•   Applications

          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   39
Regression Models




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   40
Sparse Linear Regression
Perhaps not every dimension of the data x is informative

A sparse solution forces some of the coefficients in f to be zero

Method:

– apply a different prior on f that
       encourages sparsity

– product of t-distributions



              Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   41
Sparse Linear Regression
Apply product of t-distributions to parameter vector




As before, we use




Now the prior is not conjugate to the normal likelihood. Cannot
compute posterior in closed from
             Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   42
Sparse Linear Regression
To make progress, write as marginal of joint distribution




                  Diagonal matrix with hidden variables {hd} on diagonal




             Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   43
Sparse Linear Regression
Substituting in the prior




Still cannot compute, but can approximate


             Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   44
Sparse Linear Regression


To fit the model, update variance s2 and hidden variables {hd}.
• To choose hidden variables


•   To choose variance



where


             Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   45
Sparse Linear Regression




After fitting, some of hidden variables become very big, implies prior tightly
fitted around zero, can be eliminated from model

            Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   46
Sparse Linear Regression




Doesn’t work for non-linear case as we need one hidden variable per
dimension – becomes intractable with high dimensional transformation. To
solve this problem, we move to the dual model.
            Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   47
Structure
•   Linear regression
•   Bayesian solution
•   Non-linear regression
•   Kernelization and Gaussian processes
•   Sparse linear regression
•   Dual linear regression
•   Relevance vector regression
•   Applications

          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   48
Dual Linear Regression
KEY IDEA:

Gradient F is just a vector in the
data space

Can represent as a weighted
sum of the data points



Now solve for Y. One
parameter per training example.

              Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   49
Dual Linear Regression
Original linear regression:



Dual variables:



Dual linear regression:




             Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   50
Maximum likelihood
Maximum likelihood solution:



Dual variables:




Same result as before:




            Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   51
Bayesian case


Compute distribution over parameters:



Gives result:



where


            Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   52
Bayesian case
Predictive distribution:




where:



 Notice that in both the maximum likelihood and Bayesian case
 depend on dot products XTX. Can be kernelized!

             Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   53
Structure
•   Linear regression
•   Bayesian solution
•   Non-linear regression
•   Kernelization and Gaussian processes
•   Sparse linear regression
•   Dual linear regression
•   Relevance vector regression
•   Applications

          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   54
Regression Models




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   55
Relevance Vector Machine

                                                       Combines ideas of
                                                              • Dual regression (1
                                                                parameter per training
                                                                example)
                                                              • Sparsity (most of the
                                                                parameters are zero)
                                                       i.e., model that only depends
                                                           sparsely on training data.




  Computer vision: models, learning and inference. ©2011 Simon J.D. Prince          56
Relevance Vector Machine



Using same approximations as for sparse model we get the
problem:



To solve, update variance s2 and hidden variables {hd} alternately.
Notice that this only depends on dot-products and so can be
kernelized
           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   57
Structure
•   Linear regression
•   Bayesian solution
•   Non-linear regression
•   Kernelization and Gaussian processes
•   Sparse linear regression
•   Dual linear regression
•   Relevance vector regression
•   Applications

          Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   58
Body Pose Regression
     (Agarwal and Triggs 2006)



                                                      Encode silhouette as
                                                      100x1 vector, encode
                                                      body pose as 55 x1
                                                      vector. Learn
                                                      relationship




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince     59
Shape Context




Returns 60 x 1 vector for each of 400 points around the silhouette
           Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   60
Dimensionality Reduction




Cluster 60D space (based on all training data) into 100 vectors
Assign each 60x1 vector to closest cluster (Voronoi partition)
Final data vector is 100x1 histogram over distribution of assignments
            Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   61
Results




• 2636 training examples, solution depends on only 6% of these
• 6 degree average errorlearning and inference. ©2011 Simon J.D. Prince
           Computer vision: models,                                     62
Displacement experts




Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   63
Regression
• Not actually used much in vision
• But main ideas all apply to classification:
  – Non-linear transformations
  – Kernelization
  – Dual parameters
  – Sparse priors




         Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   64

Más contenido relacionado

Destacado

Skiena algorithm 2007 lecture20 satisfiability
Skiena algorithm 2007 lecture20 satisfiabilitySkiena algorithm 2007 lecture20 satisfiability
Skiena algorithm 2007 lecture20 satisfiability
zukun
 
Lecture24
Lecture24Lecture24
Lecture24
zukun
 
Lecture14
Lecture14Lecture14
Lecture14
zukun
 
16 cv mil_multiple_cameras
16 cv mil_multiple_cameras16 cv mil_multiple_cameras
16 cv mil_multiple_cameras
zukun
 
18 cv mil_style_and_identity
18 cv mil_style_and_identity18 cv mil_style_and_identity
18 cv mil_style_and_identity
zukun
 
Mit6870 orsu lecture12
Mit6870 orsu lecture12Mit6870 orsu lecture12
Mit6870 orsu lecture12
zukun
 
12 cv mil_models_for_grids
12 cv mil_models_for_grids12 cv mil_models_for_grids
12 cv mil_models_for_grids
zukun
 
Lecture27
Lecture27Lecture27
Lecture27
zukun
 

Destacado (8)

Skiena algorithm 2007 lecture20 satisfiability
Skiena algorithm 2007 lecture20 satisfiabilitySkiena algorithm 2007 lecture20 satisfiability
Skiena algorithm 2007 lecture20 satisfiability
 
Lecture24
Lecture24Lecture24
Lecture24
 
Lecture14
Lecture14Lecture14
Lecture14
 
16 cv mil_multiple_cameras
16 cv mil_multiple_cameras16 cv mil_multiple_cameras
16 cv mil_multiple_cameras
 
18 cv mil_style_and_identity
18 cv mil_style_and_identity18 cv mil_style_and_identity
18 cv mil_style_and_identity
 
Mit6870 orsu lecture12
Mit6870 orsu lecture12Mit6870 orsu lecture12
Mit6870 orsu lecture12
 
12 cv mil_models_for_grids
12 cv mil_models_for_grids12 cv mil_models_for_grids
12 cv mil_models_for_grids
 
Lecture27
Lecture27Lecture27
Lecture27
 

Similar a 08 cv mil_regression

07 cv mil_modeling_complex_densities
07 cv mil_modeling_complex_densities07 cv mil_modeling_complex_densities
07 cv mil_modeling_complex_densities
zukun
 
04 cv mil_fitting_probability_models
04 cv mil_fitting_probability_models04 cv mil_fitting_probability_models
04 cv mil_fitting_probability_models
zukun
 
13 cv mil_preprocessing
13 cv mil_preprocessing13 cv mil_preprocessing
13 cv mil_preprocessing
zukun
 
06 cv mil_learning_and_inference
06 cv mil_learning_and_inference06 cv mil_learning_and_inference
06 cv mil_learning_and_inference
zukun
 
02 cv mil_intro_to_probability
02 cv mil_intro_to_probability02 cv mil_intro_to_probability
02 cv mil_intro_to_probability
zukun
 
05 cv mil_normal_distribution
05 cv mil_normal_distribution05 cv mil_normal_distribution
05 cv mil_normal_distribution
zukun
 
19 cv mil_temporal_models
19 cv mil_temporal_models19 cv mil_temporal_models
19 cv mil_temporal_models
zukun
 

Similar a 08 cv mil_regression (20)

07 cv mil_modeling_complex_densities
07 cv mil_modeling_complex_densities07 cv mil_modeling_complex_densities
07 cv mil_modeling_complex_densities
 
04 cv mil_fitting_probability_models
04 cv mil_fitting_probability_models04 cv mil_fitting_probability_models
04 cv mil_fitting_probability_models
 
13 cv mil_preprocessing
13 cv mil_preprocessing13 cv mil_preprocessing
13 cv mil_preprocessing
 
Graphical Models for chains, trees and grids
Graphical Models for chains, trees and gridsGraphical Models for chains, trees and grids
Graphical Models for chains, trees and grids
 
06 cv mil_learning_and_inference
06 cv mil_learning_and_inference06 cv mil_learning_and_inference
06 cv mil_learning_and_inference
 
02 cv mil_intro_to_probability
02 cv mil_intro_to_probability02 cv mil_intro_to_probability
02 cv mil_intro_to_probability
 
Introduction to Probability
Introduction to ProbabilityIntroduction to Probability
Introduction to Probability
 
Common Probability Distibution
Common Probability DistibutionCommon Probability Distibution
Common Probability Distibution
 
05 cv mil_normal_distribution
05 cv mil_normal_distribution05 cv mil_normal_distribution
05 cv mil_normal_distribution
 
Deep Learning Opening Workshop - Improving Generative Models - Junier Oliva, ...
Deep Learning Opening Workshop - Improving Generative Models - Junier Oliva, ...Deep Learning Opening Workshop - Improving Generative Models - Junier Oliva, ...
Deep Learning Opening Workshop - Improving Generative Models - Junier Oliva, ...
 
19 cv mil_temporal_models
19 cv mil_temporal_models19 cv mil_temporal_models
19 cv mil_temporal_models
 
Linear models for data science
Linear models for data scienceLinear models for data science
Linear models for data science
 
LP.ppt
LP.pptLP.ppt
LP.ppt
 
Methods of Manifold Learning for Dimension Reduction of Large Data Sets
Methods of Manifold Learning for Dimension Reduction of Large Data SetsMethods of Manifold Learning for Dimension Reduction of Large Data Sets
Methods of Manifold Learning for Dimension Reduction of Large Data Sets
 
Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)
Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)
Unsupervised Learning (D2L6 2017 UPC Deep Learning for Computer Vision)
 
EPFL_presentation
EPFL_presentationEPFL_presentation
EPFL_presentation
 
Large Scale Image Retrieval 2022.pdf
Large Scale Image Retrieval 2022.pdfLarge Scale Image Retrieval 2022.pdf
Large Scale Image Retrieval 2022.pdf
 
Derivative Free Optimization and Robust Optimization
Derivative Free Optimization and Robust OptimizationDerivative Free Optimization and Robust Optimization
Derivative Free Optimization and Robust Optimization
 
The John Henry lens design challenge
The John Henry lens design challengeThe John Henry lens design challenge
The John Henry lens design challenge
 
Variational Autoencoders For Image Generation
Variational Autoencoders For Image GenerationVariational Autoencoders For Image Generation
Variational Autoencoders For Image Generation
 

Más de zukun

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009
zukun
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCV
zukun
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Information
zukun
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statistics
zukun
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibration
zukun
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer vision
zukun
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluation
zukun
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-software
zukun
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptors
zukun
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectors
zukun
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-intro
zukun
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video search
zukun
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video search
zukun
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video search
zukun
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learning
zukun
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer vision
zukun
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick start
zukun
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysis
zukun
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structures
zukun
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities
zukun
 

Más de zukun (20)

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCV
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Information
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statistics
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibration
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer vision
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluation
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-software
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptors
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectors
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-intro
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video search
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video search
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video search
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learning
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer vision
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick start
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysis
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structures
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities
 

Último

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 

Último (20)

ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot ModelNavi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Navi Mumbai Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 

08 cv mil_regression

  • 1. Computer vision: models, learning and inference Chapter 8 Regression Please send errata to s.prince@cs.ucl.ac.uk
  • 2. Structure • Linear regression • Bayesian solution • Non-linear regression • Kernelization and Gaussian processes • Sparse linear regression • Dual linear regression • Relevance vector regression • Applications Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 2
  • 3. Models for machine vision Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 3
  • 4. Body Pose Regression Encode silhouette as 100x1 vector, encode body pose as 55 x1 vector. Learn relationship Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 4
  • 5. Type 1: Model Pr(w|x) - Discriminative How to model Pr(w|x)? – Choose an appropriate form for Pr(w) – Make parameters a function of x – Function takes parameters q that define its shape Learning algorithm: learn parameters q from training data x,w Inference algorithm: just evaluate Pr(w|x) Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 5
  • 6. Linear Regression • For simplicity we will assume that each dimension of world is predicted separately. • Concentrate on predicting a univariate world state w. Choose normal distribution over world w Make • Mean a linear function of data x • Variance constant Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 6
  • 7. Linear Regression Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 7
  • 8. Neater Notation To make notation easier to handle, we • Attach a 1 to the start of every data vector • Attach the offset to the start of the gradient vector f New model: Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 8
  • 9. Combining Equations We have on equation for each x,w pair: The likelihood of the whole dataset is be the product of these individual distributions and can be written as where Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 9
  • 10. Learning Maximum likelihood Substituting in Take derivative, set result to zero and re-arrange: Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 10
  • 11. Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 11
  • 12. Regression Models Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 12
  • 13. Structure • Linear regression • Bayesian solution • Non-linear regression • Kernelization and Gaussian processes • Sparse linear regression • Dual linear regression • Relevance vector regression • Applications Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 13
  • 14. Bayesian Regression (We concentrate on f – come back to s2 later!) Likelihood Prior Bayes rule’ Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 14
  • 15. Posterior Dist. over Parameters where Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 15
  • 16. Inference Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 16
  • 17. Fitting Variance • We’ll fit the variance with maximum likelihood • Optimize the marginal likelihood (likelihood after gradients have been integrated out) Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 17
  • 18. Practical Issue Problem: In high dimensions, the matrix A may be too big to invert Solution: Re-express using Matrix Inversion Lemma Final expression: inverses are (I x I) , not (D x D) Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 18
  • 19. Structure • Linear regression • Bayesian solution • Non-linear regression • Kernelization and Gaussian processes • Sparse linear regression • Dual linear regression • Relevance vector regression • Applications Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 19
  • 20. Regression Models Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 20
  • 21. Non-Linear Regression GOAL: Keep the maths of linear regression, but extend to more general functions KEY IDEA: You can make a non-linear function from a linear weighted sum of non-linear basis functions Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 21
  • 22. Non-linear regression Linear regression: Non-Linear regression: where In other words create z by evaluating x against basis functions, then linearly regress against z. Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 22
  • 23. Example: polynomial regression A special case of Where Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 23
  • 24. Radial basis functions Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 24
  • 25. Arc Tan Functions Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 25
  • 26. Non-linear regression Linear regression: Non-Linear regression: where In other words create z by evaluating x against basis functions, then linearly regress against z. Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 26
  • 27. Non-linear regression Linear regression: Non-Linear regression: where In other words create z by evaluating against basis functions,Computer vision: models, learning and inference. ©2011 Simon J.D. Prince then linearly regress against z. 27
  • 28. Maximum Likelihood Same as linear regression, but substitute in Z for X: Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 28
  • 29. Structure • Linear regression • Bayesian solution • Non-linear regression • Kernelization and Gaussian processes • Sparse linear regression • Dual linear regression • Relevance vector regression • Applications Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 29
  • 30. Regression Models Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 30
  • 31. Bayesian Approach Learn s2 from marginal likelihood as before Final predictive distribution: Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 31
  • 32. Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 32
  • 33. The Kernel Trick Notice that the final equation doesn’t need the data itself, but just dot products between data items of the form ziTzj So, we take data xi and xj pass through non-linear function to create zi and zj and then take dot products of different ziTzj Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 33
  • 34. The Kernel Trick So, we take data xi and xj pass through non-linear function to create zi and zj and then take dot products of different ziTzj Key idea: Define a “kernel” function that does all of this together. • Takes data xi and xj • Returns a value for dot product ziTzj If we choose this function carefully, then it will corresponds to some underlying z=f[x]. Never compute z explicitly - can be very high or infinite dimension Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 34
  • 35. Gaussian Process Regression Before After Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 35
  • 36. Example Kernels (Equivalent to having an infinite number of radial basis functions at every position in space. Wow!) Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 36
  • 37. RBF Kernel Fits Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 37
  • 38. Fitting Variance • We’ll fit the variance with maximum likelihood • Optimize the marginal likelihood (likelihood after gradients have been integrated out) • Have to use non-linear optimization Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 38
  • 39. Structure • Linear regression • Bayesian solution • Non-linear regression • Kernelization and Gaussian processes • Sparse linear regression • Dual linear regression • Relevance vector regression • Applications Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 39
  • 40. Regression Models Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 40
  • 41. Sparse Linear Regression Perhaps not every dimension of the data x is informative A sparse solution forces some of the coefficients in f to be zero Method: – apply a different prior on f that encourages sparsity – product of t-distributions Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 41
  • 42. Sparse Linear Regression Apply product of t-distributions to parameter vector As before, we use Now the prior is not conjugate to the normal likelihood. Cannot compute posterior in closed from Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 42
  • 43. Sparse Linear Regression To make progress, write as marginal of joint distribution Diagonal matrix with hidden variables {hd} on diagonal Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 43
  • 44. Sparse Linear Regression Substituting in the prior Still cannot compute, but can approximate Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 44
  • 45. Sparse Linear Regression To fit the model, update variance s2 and hidden variables {hd}. • To choose hidden variables • To choose variance where Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 45
  • 46. Sparse Linear Regression After fitting, some of hidden variables become very big, implies prior tightly fitted around zero, can be eliminated from model Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 46
  • 47. Sparse Linear Regression Doesn’t work for non-linear case as we need one hidden variable per dimension – becomes intractable with high dimensional transformation. To solve this problem, we move to the dual model. Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 47
  • 48. Structure • Linear regression • Bayesian solution • Non-linear regression • Kernelization and Gaussian processes • Sparse linear regression • Dual linear regression • Relevance vector regression • Applications Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 48
  • 49. Dual Linear Regression KEY IDEA: Gradient F is just a vector in the data space Can represent as a weighted sum of the data points Now solve for Y. One parameter per training example. Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 49
  • 50. Dual Linear Regression Original linear regression: Dual variables: Dual linear regression: Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 50
  • 51. Maximum likelihood Maximum likelihood solution: Dual variables: Same result as before: Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 51
  • 52. Bayesian case Compute distribution over parameters: Gives result: where Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 52
  • 53. Bayesian case Predictive distribution: where: Notice that in both the maximum likelihood and Bayesian case depend on dot products XTX. Can be kernelized! Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 53
  • 54. Structure • Linear regression • Bayesian solution • Non-linear regression • Kernelization and Gaussian processes • Sparse linear regression • Dual linear regression • Relevance vector regression • Applications Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 54
  • 55. Regression Models Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 55
  • 56. Relevance Vector Machine Combines ideas of • Dual regression (1 parameter per training example) • Sparsity (most of the parameters are zero) i.e., model that only depends sparsely on training data. Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 56
  • 57. Relevance Vector Machine Using same approximations as for sparse model we get the problem: To solve, update variance s2 and hidden variables {hd} alternately. Notice that this only depends on dot-products and so can be kernelized Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 57
  • 58. Structure • Linear regression • Bayesian solution • Non-linear regression • Kernelization and Gaussian processes • Sparse linear regression • Dual linear regression • Relevance vector regression • Applications Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 58
  • 59. Body Pose Regression (Agarwal and Triggs 2006) Encode silhouette as 100x1 vector, encode body pose as 55 x1 vector. Learn relationship Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 59
  • 60. Shape Context Returns 60 x 1 vector for each of 400 points around the silhouette Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 60
  • 61. Dimensionality Reduction Cluster 60D space (based on all training data) into 100 vectors Assign each 60x1 vector to closest cluster (Voronoi partition) Final data vector is 100x1 histogram over distribution of assignments Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 61
  • 62. Results • 2636 training examples, solution depends on only 6% of these • 6 degree average errorlearning and inference. ©2011 Simon J.D. Prince Computer vision: models, 62
  • 63. Displacement experts Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 63
  • 64. Regression • Not actually used much in vision • But main ideas all apply to classification: – Non-linear transformations – Kernelization – Dual parameters – Sparse priors Computer vision: models, learning and inference. ©2011 Simon J.D. Prince 64