SlideShare una empresa de Scribd logo
1 de 82
Part 3: classifier based methods Antonio Torralba
Overview of section ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Classifier based methods Object detection and recognition is formulated as a classification problem.  …  and a decision is taken at each window about if it contains a target object or not. Where are the screens? The image is partitioned into a set of overlapping windows Bag of image patches Decision boundary Computer screen Background In some feature space
Discriminative vs. generative x = data ,[object Object],(The lousy  painter) 0 10 20 30 40 50 60 70 0 0.05 0.1 0 10 20 30 40 50 60 70 0 0.5 1 x = data ,[object Object],0 10 20 30 40 50 60 70 80 -1 1 x = data ,[object Object],( The artist )
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Face detection ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Face detection
[object Object],Formulation +1 -1 x 1 x 2 x 3 x N … … x N+1 x N+2 x N+M -1 -1 ? ? ? … Training data: each image patch is labeled as containing the object or background Test data Features  x = Labels y = ,[object Object],[object Object],Where  belongs to some family of functions ,[object Object]
Discriminative methods 10 6  examples Nearest neighbor Shakhnarovich, Viola, Darrell 2003 Berg, Berg, Malik 2005 … Neural networks LeCun, Bottou, Bengio, Haffner 1998 Rowley, Baluja, Kanade 1998 … Support Vector Machines and Kernels Conditional Random Fields McCallum, Freitag, Pereira 2000 Kumar, Hebert 2003 … Guyon, Vapnik Heisele, Serre, Poggio, 2001 …
A simple object detector with Boosting  ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],http://people.csail.mit.edu/torralba/iccv2005/
[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Why boosting?
Boosting  ,[object Object],by minimizing the exponential loss Training samples The exponential loss is a differentiable upper bound to the misclassification error.
Boosting  Sequential procedure. At each step we add For more details: Friedman, Hastie, Tibshirani. “Additive Logistic Regression: a Statistical View of Boosting” (1998) to minimize the residual loss  input Desired output Parameters weak classifier
Weak classifiers  ,[object Object],[object Object],Four parameters: b=E w (y [x>   ]) a=E w (y [x<   ]) x f m (x) 
Flavors of boosting ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
From images to features: A myriad of weak detectors ,[object Object],Takes image as input and the output is binary response. The output is a weak detector.
A myriad of weak detectors ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Weak detectors ,[object Object],[object Object],Every combination of three filters generates a different feature This gives thousands of features. Boosting selects a sparse subset, so computations on test time are very efficient. Boosting also avoids overfitting to some extend.
Haar wavelets ,[object Object],[object Object],The average intensity in the block is computed with four sums independently of the block size.
Haar wavelets Papageorgiou & Poggio (2000) Polynomial SVM
Edges and chamfer distance Gavrila, Philomin, ICCV 1999
Edge fragments Weak detector = k edge fragments and threshold. Chamfer distance uses 8 orientation planes Opelt, Pinz, Zisserman, ECCV 2006
Histograms of oriented gradients ,[object Object],[object Object],[object Object],[object Object]
Weak detectors ,[object Object],Car model Screen model These features are used for the detector on the course web site.
Weak detectors First we collect a set of part templates from a set of training objects. Vidal-Naquet, Ullman, Nature Neuroscience 2003 …
Weak detectors We now define a family of “weak detectors” as: = = Better than chance *
Weak detectors We can do a better job using filtered images Still a weak detector but better than before * * = = =
Training First we evaluate all the N features on all the training images. Then, we sample the feature outputs on the object center and at random locations in the background:
Representation and object model Selected features for the screen detector Lousy painter  … 4 10 1 2 3 … 100
Representation and object model Selected features for the car detector 1 2 3 4 10 100 … …
Detection ,[object Object],[object Object],Here, invariance in translation and scale is achieved by the search strategy: the classifier is evaluated at all locations (by translating the image) and at all scales (by scaling the image in small steps). The search cost can be reduced using a cascade. f i , P i g i
Example: screen detection Feature  output
Example: screen detection Feature  output Thresholded  output Weak ‘detector’ Produces many false alarms.
Example: screen detection Feature  output Thresholded  output Strong classifier  at iteration 1
Example: screen detection Feature  output Thresholded  output Strong classifier Second weak ‘detector’ Produces a different set of false alarms.
Example: screen detection + Feature  output Thresholded  output Strong classifier Strong classifier  at iteration 2
Example: screen detection + … Feature  output Thresholded  output Strong classifier Strong classifier  at iteration 10
Example: screen detection + … Feature  output Thresholded  output Strong classifier Adding  features Final classification Strong classifier  at iteration 200
Cascade of classifiers Fleuret and Geman 2001, Viola and Jones 2001 We want the complexity of the 3 features classifier with the performance of the 100 features classifier: 3 features 30 features 100 features Select a threshold with high recall for each stage.  We increase precision using the cascade Recall Precision 0% 100% 100%
Some goals for object recognition  ,[object Object],[object Object],[object Object],[object Object],[object Object]
Multiclass object detection
Multiclass object detection Number of classes Amount of labeled data Number of classes Complexity
Shared features ,[object Object],[object Object],[object Object],…
Multitask learning ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],Primary task : detect door knobs Tasks used : R. Caruana. Multitask Learning. ML 1997
Sharing invariances S. Thrun.   Is Learning the n-th Thing Any Easier Than Learning The First? NIPS 1996 “ Knowledge is transferred between tasks via a learned model of the invariances of the domain: object recognition is invariant to rotation, translation, scaling, lighting, … These invariances are common to all object recognition tasks”.  Toy world Without sharing With sharing
Sharing transformations ,[object Object],Transformations are shared and can be learnt from other tasks.
Models of object recognition I. Biederman, “Recognition-by-components: A theory of human image understanding,”  Psychological Review , 1987. M. Riesenhuber and T. Poggio, “Hierarchical models of object recognition in cortex,”  Nature Neuroscience  1999. T. Serre, L. Wolf and T. Poggio. “Object recognition with features inspired  by visual cortex”. CVPR 2005
Sharing in constellation models Pictorial Structures Fischler & Elschlager, IEEE Trans. Comp. 1973 Constellation Model Burl, Liung,Perona, 1996; Weber, Welling, Perona, 2000 Fergus, Perona, & Zisserman, CVPR 2003  SVM Detectors Heisele, Poggio, et. al., NIPS 2001 Model-Guided Segmentation Mori, Ren, Efros, & Malik, CVPR 2004
Random  initialization Variational EM (Attias, Hinton, Beal, etc.) Slide from Fei Fei Li Fei-Fei ,  Fergus, & Perona, ICCV 2003 E-Step prior knowledge of p(  ) new estimate  of p(  |train) M-Step new   ’s
Reusable Parts Goal: Look for a vocabulary of edges that reduces the number of features. Krempp, Geman, & Amit “Sequential Learning of Reusable Parts for Object Detection”. TR 2002 Number of features Number of classes Examples of reused parts
Sharing patches ,[object Object],For a new class, use only features similar to features that where good for other classes: Proposed Dog  features
Multiclass boosting ,[object Object],[object Object],[object Object],[object Object]
Shared features Screen detector Car detector Face detector ,[object Object],Torralba, Murphy, Freeman. CVPR 2004. PAMI 2007 Screen detector Car detector Face detector ,[object Object]
50 training samples/class 29 object classes 2000 entries in the dictionary Results averaged on 20 runs Error bars = 80% interval Krempp, Geman, & Amit, 2002 Torralba, Murphy, Freeman. CVPR 2004  Shared features Class-specific features
Generalization as a function of object similarities Number of training samples per class Number of training samples per class Area under ROC Area under ROC K = 2.1 K = 4.8 Torralba, Murphy, Freeman. CVPR 2004. PAMI 2007 12 viewpoints 12 unrelated object classes
Opelt, Pinz, Zisserman, CVPR 2006  Efficiency Generalization
Some references on multiclass ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
Context based methods Antonio Torralba
Why is this hard?
What are the hidden objects? 2 1
What are the hidden objects? Chance ~ 1/30000
Context-based object recognition ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
 
 
 
Global and local representations building car sidewalk Urban street scene
Global and local representations Image index: Summary statistics,  configuration of textures Urban street scene features histogram building car sidewalk Urban street scene
Global scene representations Spatial structure is important in order to provide context for object localization Sivic, Russell, Freeman, Zisserman, ICCV 2005 Fei-Fei and Perona, CVPR 2005 Bosch, Zisserman, Munoz, ECCV 2006 Bag of words Spatially organized textures Non localized textons S. Lazebnik, et al, CVPR 2006 Walker, Malik. Vision Research 2004  … M. Gorkani, R. Picard, ICPR 1994 A. Oliva, A. Torralba, IJCV 2001 …
Contextual object relationships Carbonetto, de Freitas & Barnard (2004) Kumar, Hebert (2005) Torralba Murphy Freeman (2004) Fink & Perona (2003) E. Sudderth et al (2005)
Context ,[object Object],[object Object],Keyboards O p1,c1 v p1,c1 O pN,c1 v pN,c1 . . . O p1,c2 v p1,c2 O pN,c2 v pN,c2 . . . Class 1 Class 2 E1 E2 S c2 max V c1 max V X1 X2 v g
3d Scene Context Image World [Hoiem, Efros, Hebert ICCV 2005]
3d Scene Context Image Support Vertical Sky V-Left V-Center V-Right V-Porous V-Solid [Hoiem, Efros, Hebert ICCV 2005] Object Surface? Support?
Object-Object Relationships ,[object Object],Carbonetto, de Freitas & Barnard (04)
Object-Object Relationships ,[object Object],He, Zemel & Carreira-Perpinan (04)
[object Object],[object Object],Object-Object Relationships
CRFs Object-Object Relationships [Torralba Murphy Freeman 2004] [Kumar Hebert 2005]
Hierarchical Sharing and Context ,[object Object],[object Object],[object Object],E. Sudderth, A. Torralba, W. T. Freeman, and A. Wilsky.
Some references on context ,[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],With a mixture of generative and discriminative approaches
A car out of context …
Integrated models for  scene and object recognition Banksy
Summary ,[object Object],[object Object],[object Object],[object Object],[object Object]

Más contenido relacionado

La actualidad más candente

Scrdet++ analysis
Scrdet++ analysisScrdet++ analysis
Scrdet++ analysisNEHA Kapoor
 
Estimating Human Pose from Occluded Images (ACCV 2009)
Estimating Human Pose from Occluded Images (ACCV 2009)Estimating Human Pose from Occluded Images (ACCV 2009)
Estimating Human Pose from Occluded Images (ACCV 2009)Jia-Bin Huang
 
26.motion and feature based person tracking
26.motion and feature based person tracking26.motion and feature based person tracking
26.motion and feature based person trackingsajit1975
 

La actualidad más candente (6)

Scrdet++ analysis
Scrdet++ analysisScrdet++ analysis
Scrdet++ analysis
 
You only look once
You only look onceYou only look once
You only look once
 
Estimating Human Pose from Occluded Images (ACCV 2009)
Estimating Human Pose from Occluded Images (ACCV 2009)Estimating Human Pose from Occluded Images (ACCV 2009)
Estimating Human Pose from Occluded Images (ACCV 2009)
 
Blurclassification
BlurclassificationBlurclassification
Blurclassification
 
26.motion and feature based person tracking
26.motion and feature based person tracking26.motion and feature based person tracking
26.motion and feature based person tracking
 
Lath
LathLath
Lath
 

Similar a Cvpr2007 object category recognition p3 - discriminative models

Mit6870 orsu lecture2
Mit6870 orsu lecture2Mit6870 orsu lecture2
Mit6870 orsu lecture2zukun
 
Object Tracking with Instance Matching and Online Learning
Object Tracking with Instance Matching and Online LearningObject Tracking with Instance Matching and Online Learning
Object Tracking with Instance Matching and Online LearningJui-Hsin (Larry) Lai
 
Yolos you only look one sequence
Yolos you only look one sequenceYolos you only look one sequence
Yolos you only look one sequencetaeseon ryu
 
Talk 2011-buet-perception-event
Talk 2011-buet-perception-eventTalk 2011-buet-perception-event
Talk 2011-buet-perception-eventMahfuzul Haque
 
Moving object detection in video surveillance
Moving object detection in video surveillanceMoving object detection in video surveillance
Moving object detection in video surveillanceAshfaqul Haque John
 
Machine learning cyphort_malware_most_wanted
Machine learning cyphort_malware_most_wantedMachine learning cyphort_malware_most_wanted
Machine learning cyphort_malware_most_wantedCyphort
 
2017 07 03_meetup_d
2017 07 03_meetup_d2017 07 03_meetup_d
2017 07 03_meetup_dDana Brophy
 
2017 07 03_meetup_d
2017 07 03_meetup_d2017 07 03_meetup_d
2017 07 03_meetup_dDana Brophy
 
Lecture 08 larry zitnick - undestanding and describing scenes
Lecture 08   larry zitnick - undestanding and describing scenesLecture 08   larry zitnick - undestanding and describing scenes
Lecture 08 larry zitnick - undestanding and describing scenesmustafa sarac
 
Shai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingShai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingwolf
 
Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...
Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...
Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...Jedha Bootcamp
 
IRJET- Real-Time Object Detection using Deep Learning: A Survey
IRJET- Real-Time Object Detection using Deep Learning: A SurveyIRJET- Real-Time Object Detection using Deep Learning: A Survey
IRJET- Real-Time Object Detection using Deep Learning: A SurveyIRJET Journal
 
Deep Learning: concepts and use cases (October 2018)
Deep Learning: concepts and use cases (October 2018)Deep Learning: concepts and use cases (October 2018)
Deep Learning: concepts and use cases (October 2018)Julien SIMON
 
MPerceptron
MPerceptronMPerceptron
MPerceptronbutest
 
Machine learning in Healthcare - WeCloudData
Machine learning in Healthcare - WeCloudDataMachine learning in Healthcare - WeCloudData
Machine learning in Healthcare - WeCloudDataWeCloudData
 

Similar a Cvpr2007 object category recognition p3 - discriminative models (20)

Mit6870 orsu lecture2
Mit6870 orsu lecture2Mit6870 orsu lecture2
Mit6870 orsu lecture2
 
Object Tracking with Instance Matching and Online Learning
Object Tracking with Instance Matching and Online LearningObject Tracking with Instance Matching and Online Learning
Object Tracking with Instance Matching and Online Learning
 
Yolos you only look one sequence
Yolos you only look one sequenceYolos you only look one sequence
Yolos you only look one sequence
 
Talk 2011-buet-perception-event
Talk 2011-buet-perception-eventTalk 2011-buet-perception-event
Talk 2011-buet-perception-event
 
Moving object detection in video surveillance
Moving object detection in video surveillanceMoving object detection in video surveillance
Moving object detection in video surveillance
 
Machine learning cyphort_malware_most_wanted
Machine learning cyphort_malware_most_wantedMachine learning cyphort_malware_most_wanted
Machine learning cyphort_malware_most_wanted
 
2017 07 03_meetup_d
2017 07 03_meetup_d2017 07 03_meetup_d
2017 07 03_meetup_d
 
2017 07 03_meetup_d
2017 07 03_meetup_d2017 07 03_meetup_d
2017 07 03_meetup_d
 
Ai use cases
Ai use casesAi use cases
Ai use cases
 
Presentation of Visual Tracking
Presentation of Visual TrackingPresentation of Visual Tracking
Presentation of Visual Tracking
 
Lecture 08 larry zitnick - undestanding and describing scenes
Lecture 08   larry zitnick - undestanding and describing scenesLecture 08   larry zitnick - undestanding and describing scenes
Lecture 08 larry zitnick - undestanding and describing scenes
 
alexVAE_New.pdf
alexVAE_New.pdfalexVAE_New.pdf
alexVAE_New.pdf
 
Shai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingShai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble tracking
 
Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...
Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...
Faire de la reconnaissance d'images avec le Deep Learning - Cristina & Pierre...
 
IRJET- Real-Time Object Detection using Deep Learning: A Survey
IRJET- Real-Time Object Detection using Deep Learning: A SurveyIRJET- Real-Time Object Detection using Deep Learning: A Survey
IRJET- Real-Time Object Detection using Deep Learning: A Survey
 
HR3D: Content Adaptive Parallax Barriers
HR3D: Content Adaptive Parallax BarriersHR3D: Content Adaptive Parallax Barriers
HR3D: Content Adaptive Parallax Barriers
 
nnml.ppt
nnml.pptnnml.ppt
nnml.ppt
 
Deep Learning: concepts and use cases (October 2018)
Deep Learning: concepts and use cases (October 2018)Deep Learning: concepts and use cases (October 2018)
Deep Learning: concepts and use cases (October 2018)
 
MPerceptron
MPerceptronMPerceptron
MPerceptron
 
Machine learning in Healthcare - WeCloudData
Machine learning in Healthcare - WeCloudDataMachine learning in Healthcare - WeCloudData
Machine learning in Healthcare - WeCloudData
 

Más de zukun

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009zukun
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVzukun
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Informationzukun
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statisticszukun
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibrationzukun
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionzukun
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluationzukun
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-softwarezukun
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptorszukun
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectorszukun
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-introzukun
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video searchzukun
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video searchzukun
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video searchzukun
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learningzukun
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionzukun
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick startzukun
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysiszukun
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structureszukun
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities zukun
 

Más de zukun (20)

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCV
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Information
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statistics
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibration
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer vision
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluation
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-software
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptors
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectors
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-intro
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video search
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video search
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video search
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learning
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer vision
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick start
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysis
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structures
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities
 

Último

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoffsammart93
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Principled Technologies
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024SynarionITSolutions
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 

Último (20)

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
Deploy with confidence: VMware Cloud Foundation 5.1 on next gen Dell PowerEdg...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024Top 10 Most Downloaded Games on Play Store in 2024
Top 10 Most Downloaded Games on Play Store in 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 

Cvpr2007 object category recognition p3 - discriminative models

  • 1. Part 3: classifier based methods Antonio Torralba
  • 2.
  • 3. Classifier based methods Object detection and recognition is formulated as a classification problem. … and a decision is taken at each window about if it contains a target object or not. Where are the screens? The image is partitioned into a set of overlapping windows Bag of image patches Decision boundary Computer screen Background In some feature space
  • 4.
  • 5.
  • 6.
  • 7.
  • 9.
  • 10. Discriminative methods 10 6 examples Nearest neighbor Shakhnarovich, Viola, Darrell 2003 Berg, Berg, Malik 2005 … Neural networks LeCun, Bottou, Bengio, Haffner 1998 Rowley, Baluja, Kanade 1998 … Support Vector Machines and Kernels Conditional Random Fields McCallum, Freitag, Pereira 2000 Kumar, Hebert 2003 … Guyon, Vapnik Heisele, Serre, Poggio, 2001 …
  • 11.
  • 12.
  • 13.
  • 14. Boosting Sequential procedure. At each step we add For more details: Friedman, Hastie, Tibshirani. “Additive Logistic Regression: a Statistical View of Boosting” (1998) to minimize the residual loss input Desired output Parameters weak classifier
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20.
  • 21. Haar wavelets Papageorgiou & Poggio (2000) Polynomial SVM
  • 22. Edges and chamfer distance Gavrila, Philomin, ICCV 1999
  • 23. Edge fragments Weak detector = k edge fragments and threshold. Chamfer distance uses 8 orientation planes Opelt, Pinz, Zisserman, ECCV 2006
  • 24.
  • 25.
  • 26. Weak detectors First we collect a set of part templates from a set of training objects. Vidal-Naquet, Ullman, Nature Neuroscience 2003 …
  • 27. Weak detectors We now define a family of “weak detectors” as: = = Better than chance *
  • 28. Weak detectors We can do a better job using filtered images Still a weak detector but better than before * * = = =
  • 29. Training First we evaluate all the N features on all the training images. Then, we sample the feature outputs on the object center and at random locations in the background:
  • 30. Representation and object model Selected features for the screen detector Lousy painter … 4 10 1 2 3 … 100
  • 31. Representation and object model Selected features for the car detector 1 2 3 4 10 100 … …
  • 32.
  • 33. Example: screen detection Feature output
  • 34. Example: screen detection Feature output Thresholded output Weak ‘detector’ Produces many false alarms.
  • 35. Example: screen detection Feature output Thresholded output Strong classifier at iteration 1
  • 36. Example: screen detection Feature output Thresholded output Strong classifier Second weak ‘detector’ Produces a different set of false alarms.
  • 37. Example: screen detection + Feature output Thresholded output Strong classifier Strong classifier at iteration 2
  • 38. Example: screen detection + … Feature output Thresholded output Strong classifier Strong classifier at iteration 10
  • 39. Example: screen detection + … Feature output Thresholded output Strong classifier Adding features Final classification Strong classifier at iteration 200
  • 40. Cascade of classifiers Fleuret and Geman 2001, Viola and Jones 2001 We want the complexity of the 3 features classifier with the performance of the 100 features classifier: 3 features 30 features 100 features Select a threshold with high recall for each stage. We increase precision using the cascade Recall Precision 0% 100% 100%
  • 41.
  • 43. Multiclass object detection Number of classes Amount of labeled data Number of classes Complexity
  • 44.
  • 45.
  • 46. Sharing invariances S. Thrun. Is Learning the n-th Thing Any Easier Than Learning The First? NIPS 1996 “ Knowledge is transferred between tasks via a learned model of the invariances of the domain: object recognition is invariant to rotation, translation, scaling, lighting, … These invariances are common to all object recognition tasks”. Toy world Without sharing With sharing
  • 47.
  • 48. Models of object recognition I. Biederman, “Recognition-by-components: A theory of human image understanding,” Psychological Review , 1987. M. Riesenhuber and T. Poggio, “Hierarchical models of object recognition in cortex,” Nature Neuroscience 1999. T. Serre, L. Wolf and T. Poggio. “Object recognition with features inspired by visual cortex”. CVPR 2005
  • 49. Sharing in constellation models Pictorial Structures Fischler & Elschlager, IEEE Trans. Comp. 1973 Constellation Model Burl, Liung,Perona, 1996; Weber, Welling, Perona, 2000 Fergus, Perona, & Zisserman, CVPR 2003 SVM Detectors Heisele, Poggio, et. al., NIPS 2001 Model-Guided Segmentation Mori, Ren, Efros, & Malik, CVPR 2004
  • 50. Random initialization Variational EM (Attias, Hinton, Beal, etc.) Slide from Fei Fei Li Fei-Fei , Fergus, & Perona, ICCV 2003 E-Step prior knowledge of p(  ) new estimate of p(  |train) M-Step new  ’s
  • 51. Reusable Parts Goal: Look for a vocabulary of edges that reduces the number of features. Krempp, Geman, & Amit “Sequential Learning of Reusable Parts for Object Detection”. TR 2002 Number of features Number of classes Examples of reused parts
  • 52.
  • 53.
  • 54.
  • 55. 50 training samples/class 29 object classes 2000 entries in the dictionary Results averaged on 20 runs Error bars = 80% interval Krempp, Geman, & Amit, 2002 Torralba, Murphy, Freeman. CVPR 2004 Shared features Class-specific features
  • 56. Generalization as a function of object similarities Number of training samples per class Number of training samples per class Area under ROC Area under ROC K = 2.1 K = 4.8 Torralba, Murphy, Freeman. CVPR 2004. PAMI 2007 12 viewpoints 12 unrelated object classes
  • 57. Opelt, Pinz, Zisserman, CVPR 2006 Efficiency Generalization
  • 58.
  • 59. Context based methods Antonio Torralba
  • 60. Why is this hard?
  • 61. What are the hidden objects? 2 1
  • 62. What are the hidden objects? Chance ~ 1/30000
  • 63.
  • 64.  
  • 65.  
  • 66.  
  • 67. Global and local representations building car sidewalk Urban street scene
  • 68. Global and local representations Image index: Summary statistics, configuration of textures Urban street scene features histogram building car sidewalk Urban street scene
  • 69. Global scene representations Spatial structure is important in order to provide context for object localization Sivic, Russell, Freeman, Zisserman, ICCV 2005 Fei-Fei and Perona, CVPR 2005 Bosch, Zisserman, Munoz, ECCV 2006 Bag of words Spatially organized textures Non localized textons S. Lazebnik, et al, CVPR 2006 Walker, Malik. Vision Research 2004 … M. Gorkani, R. Picard, ICPR 1994 A. Oliva, A. Torralba, IJCV 2001 …
  • 70. Contextual object relationships Carbonetto, de Freitas & Barnard (2004) Kumar, Hebert (2005) Torralba Murphy Freeman (2004) Fink & Perona (2003) E. Sudderth et al (2005)
  • 71.
  • 72. 3d Scene Context Image World [Hoiem, Efros, Hebert ICCV 2005]
  • 73. 3d Scene Context Image Support Vertical Sky V-Left V-Center V-Right V-Porous V-Solid [Hoiem, Efros, Hebert ICCV 2005] Object Surface? Support?
  • 74.
  • 75.
  • 76.
  • 77. CRFs Object-Object Relationships [Torralba Murphy Freeman 2004] [Kumar Hebert 2005]
  • 78.
  • 79.
  • 80. A car out of context …
  • 81. Integrated models for scene and object recognition Banksy
  • 82.

Notas del editor

  1. There are multiple flavors of multiclass boosting, here I am going to use the original Adaboost.MH
  2. Strees: the main part is not to discriminate among classes. The hard part is to detect the objects when inmersed in clutter.
  3. These results have been reproduced by opelt with a completely different set of features, and they found the same sublinear scaling for the complexity and the same improved generalization. Although the increase is weak because they use few categories
  4. Despite of being semantically wrong, some basic contextual principles are still preserved: the elephant is not sticking out of the wall.