3. Classifier-based methods. Object detection and recognition is formulated as a classification problem: the image is partitioned into a set of overlapping windows, and a decision is taken at each window about whether it contains a target object or not. [Figure: "Where are the screens?"; a bag of image patches, with a decision boundary separating computer screens from background in some feature space.]
14. Boosting. A sequential procedure: at each step we add a weak classifier, chosen to minimize the residual loss. [Figure: the weak classifier maps the input and its parameters to the desired output.] For more details: Friedman, Hastie, Tibshirani. "Additive Logistic Regression: a Statistical View of Boosting" (1998).
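The sequential procedure on this slide can be sketched as discrete AdaBoost with decision stumps, a minimal NumPy illustration only; the stump weak learner, the function names, and the toy data are assumptions, not the slides' actual implementation:

```python
import numpy as np

def train_stump(X, y, w):
    """Pick the (feature, threshold, polarity) stump minimizing weighted error."""
    best = (1.0, 0, 0.0, 1)  # (error, feature, threshold, polarity)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, j] - t) > 0, 1, -1)
                err = w[pred != y].sum()
                if err < best[0]:
                    best = (err, j, t, pol)
    return best

def adaboost(X, y, rounds=10):
    """y in {-1,+1}. Returns a list of weighted weak classifiers."""
    n = len(y)
    w = np.full(n, 1.0 / n)          # uniform initial weights
    H = []
    for _ in range(rounds):
        err, j, t, pol = train_stump(X, y, w)
        err = max(err, 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)   # weak classifier weight
        pred = np.where(pol * (X[:, j] - t) > 0, 1, -1)
        w *= np.exp(-alpha * y * pred)          # upweight the mistakes
        w /= w.sum()
        H.append((alpha, j, t, pol))
    return H

def predict(H, X):
    score = sum(a * np.where(p * (X[:, j] - t) > 0, 1, -1) for a, j, t, p in H)
    return np.sign(score)
```

Each round fits one stump to the reweighted data, so later stumps concentrate on the examples the current strong classifier still gets wrong.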
26. Weak detectors First we collect a set of part templates from a set of training objects. Vidal-Naquet, Ullman, Nature Neuroscience 2003 …
27. Weak detectors. We now define a family of "weak detectors" as: [Figure: a part template correlated with the image (template * image) gives a response map.] Better than chance.
28. Weak detectors. We can do a better job using filtered images. [Figure: the image is first convolved with a filter, and the part template is then correlated with the filtered image.] Still a weak detector, but better than before.
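A weak detector of this kind can be sketched as: filter the image, correlate the result with a small part template, and threshold the response map. This is a toy NumPy version; the particular filter, template, and threshold are assumptions for illustration.

```python
import numpy as np

def correlate2d_valid(img, kernel):
    """Plain 2-D cross-correlation, 'valid' region only."""
    kh, kw = kernel.shape
    out = np.zeros((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * kernel)
    return out

def weak_detector(img, template, edge_filter, threshold):
    """Filter the image, correlate with a part template, threshold the response."""
    filtered = np.abs(correlate2d_valid(img, edge_filter))
    response = correlate2d_valid(filtered, template)
    return response > threshold   # boolean detection map
```

For example, with a [-1, 1] edge filter and an all-ones template, the detector fires near intensity edges and stays silent on flat regions.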
29. Training. First we evaluate all the N features on all the training images. Then we sample the feature outputs at the object center and at random locations in the background:
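The sampling step can be sketched as follows, assuming a precomputed feature response map and annotated object centers; the function name and arguments are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_training_values(response_map, object_centers, n_background=100):
    """Positives: feature outputs at the object centers.
    Negatives: feature outputs at random background locations."""
    H, W = response_map.shape
    pos = [response_map[r, c] for (r, c) in object_centers]
    centers = set(object_centers)
    neg = []
    while len(neg) < n_background:
        r, c = rng.integers(H), rng.integers(W)
        if (r, c) not in centers:      # avoid sampling on an object center
            neg.append(response_map[r, c])
    return np.array(pos), np.array(neg)
```

The two resulting value distributions are what the boosting step thresholds to turn a feature into a weak detector.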
30. Representation and object model. [Figure: selected features for the screen detector, shown at iterations 1, 2, 3, 4, 10, …, 100; the reconstruction resembles a lousy painter's sketch of the object.]
39. Example: screen detection. [Figure: feature output, thresholded output, and strong classifier response; adding features improves the final classification (strong classifier at iteration 200).]
40. Cascade of classifiers. Fleuret and Geman 2001; Viola and Jones 2001. We want the complexity of the 3-feature classifier with the performance of the 100-feature classifier. [Figure: precision-recall curves (0% to 100%) for classifiers with 3, 30, and 100 features.] Select a threshold with high recall for each stage; we increase precision using the cascade.
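The cascade evaluation can be sketched as an early-exit loop: each stage is a small classifier whose threshold was tuned for high recall, and a window is rejected the moment any stage says no. The stage scoring functions below are hypothetical stand-ins.

```python
def cascade_classify(stage_score_fns, stage_thresholds, window):
    """Evaluate stages in order; reject as soon as one stage says no.
    Each stage threshold is chosen for high recall, so true positives
    survive to the end while most background is rejected cheaply."""
    for score, thresh in zip(stage_score_fns, stage_thresholds):
        if score(window) < thresh:
            return False     # early reject: later (costlier) stages never run
    return True              # passed every stage -> detection
```

Because background windows vastly outnumber object windows, almost all of the computation is spent in the first, cheapest stages.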
46. Sharing invariances. S. Thrun. "Is Learning the n-th Thing Any Easier Than Learning The First?" NIPS 1996. "Knowledge is transferred between tasks via a learned model of the invariances of the domain: object recognition is invariant to rotation, translation, scaling, lighting, … These invariances are common to all object recognition tasks." [Figure: toy-world results, without sharing vs. with sharing.]
48. Models of object recognition I. Biederman, “Recognition-by-components: A theory of human image understanding,” Psychological Review , 1987. M. Riesenhuber and T. Poggio, “Hierarchical models of object recognition in cortex,” Nature Neuroscience 1999. T. Serre, L. Wolf and T. Poggio. “Object recognition with features inspired by visual cortex”. CVPR 2005
50. Random initialization; variational EM (Attias, Hinton, Beal, etc.). [Figure: the E-step combines prior knowledge of p(θ) into a new estimate of p(θ|train); the M-step produces new θ's.] Slide from Fei-Fei Li. Fei-Fei, Fergus, & Perona, ICCV 2003.
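The E-step/M-step loop on this slide can be illustrated with plain (non-variational) EM on a toy two-component 1-D Gaussian mixture; the fixed unit variances, the initialization, and the data are all assumptions made to keep the sketch short.

```python
import numpy as np

def em_gmm_1d(x, iters=50):
    """EM for a 2-component 1-D Gaussian mixture with fixed unit variances.
    E-step: responsibilities given current means; M-step: re-estimate means."""
    mu = np.array([x.min(), x.max()])            # crude initialization
    for _ in range(iters):
        # E-step: posterior probability that each point belongs to component k
        logp = -(x[:, None] - mu[None, :]) ** 2 / 2.0
        r = np.exp(logp)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: means become responsibility-weighted averages of the data
        mu = (r * x[:, None]).sum(axis=0) / r.sum(axis=0)
    return mu
```

With random (rather than min/max) initialization, different runs can converge to different local optima, which is exactly why the slide pairs random initialization with the EM iterations.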
51. Reusable parts. Goal: look for a vocabulary of edges that reduces the number of features. Krempp, Geman, & Amit, "Sequential Learning of Reusable Parts for Object Detection", TR 2002. [Figure: number of features vs. number of classes; examples of reused parts.]
55. 50 training samples/class; 29 object classes; 2000 entries in the dictionary. Results averaged over 20 runs; error bars = 80% interval. Krempp, Geman, & Amit, 2002; Torralba, Murphy, Freeman, CVPR 2004. [Figure: shared features vs. class-specific features.]
56. Generalization as a function of object similarities. Torralba, Murphy, Freeman. CVPR 2004; PAMI 2007. [Figure: area under ROC vs. number of training samples per class, for 12 viewpoints and for 12 unrelated object classes (K = 2.1 and K = 4.8).]
67. Global and local representations. [Figure: urban street scene with labeled building, car, and sidewalk.]
68. Global and local representations. Image index: summary statistics, configuration of textures. [Figure: urban street scene and its feature histogram; labeled building, car, and sidewalk.]
69. Global scene representations. Spatial structure is important in order to provide context for object localization. Bag of words: Sivic, Russell, Freeman, Zisserman, ICCV 2005; Fei-Fei and Perona, CVPR 2005; Bosch, Zisserman, Munoz, ECCV 2006; S. Lazebnik et al., CVPR 2006. Non-localized textons: Walker, Malik, Vision Research 2004; … Spatially organized textures: M. Gorkani, R. Picard, ICPR 1994; A. Oliva, A. Torralba, IJCV 2001; …
70. Contextual object relationships. Carbonetto, de Freitas & Barnard (2004); Kumar & Hebert (2005); Torralba, Murphy & Freeman (2004); Fink & Perona (2003); E. Sudderth et al. (2005).
There are multiple flavors of multiclass boosting; here I am going to use the original AdaBoost.MH.
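AdaBoost.MH reduces a K-class problem to K binary problems that share the same weak learner: it keeps a weight for every (example, class) pair and lets each round's stump carry a per-class sign. This is a compact sketch under assumed details (one shared stump per round, training-set predictions only), not a faithful reimplementation of any particular paper's code.

```python
import numpy as np

def adaboost_mh(X, labels, n_classes, rounds=10):
    """AdaBoost.MH sketch: weights over (example, class) pairs.
    Weak learner: a threshold stump on one feature with a per-class sign.
    Returns the predicted class for each training example."""
    n, d = X.shape
    Y = -np.ones((n, n_classes))
    Y[np.arange(n), labels] = 1.0            # +1 for the true class, -1 otherwise
    w = np.full((n, n_classes), 1.0 / (n * n_classes))
    F = np.zeros((n, n_classes))             # accumulated scores f(x, k)
    for _ in range(rounds):
        best = None
        for j in range(d):
            for t in np.unique(X[:, j]):
                s = np.where(X[:, j] > t, 1.0, -1.0)         # stump output
                c = np.sign((w * Y * s[:, None]).sum(axis=0))  # per-class sign
                c[c == 0] = 1.0
                h = s[:, None] * c[None, :]                   # (n, K) prediction
                err = w[h != Y].sum()
                if best is None or err < best[0]:
                    best = (err, h)
        err, h = best
        err = min(max(err, 1e-12), 1 - 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)
        F += alpha * h
        w *= np.exp(-alpha * Y * h)          # upweight wrong (example, class) pairs
        w /= w.sum()
    return F.argmax(axis=1)
```

The shared-stump structure is what makes feature sharing across classes fall out naturally: one weak detector contributes (with different signs) to every class score.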
Stress: the hard part is not discriminating among classes. The hard part is detecting the objects when they are immersed in clutter.
These results have been reproduced by Opelt with a completely different set of features; they found the same sublinear scaling of complexity and the same improved generalization, although the effect is weak because they use few categories.
Despite being semantically wrong, some basic contextual principles are still preserved: the elephant is not sticking out of the wall.