Computer vision: models,
 learning and inference
        Chapter 20
   Models for visual words



   Please send errata to s.prince@cs.ucl.ac.uk
Visual words

• Most models treat data as continuous
• Likelihood based on normal distribution
• Visual words = discrete representation of
  image
• Likelihood based on categorical distribution
• Useful for difficult tasks such as scene
  recognition and object recognition

         Computer vision: models, learning and inference. ©2011 Simon J.D. Prince   2
Motivation: scene recognition




Structure

•   Computing visual words
•   Bag of words model
•   Latent Dirichlet allocation
•   Single author-topic model
•   Constellation model
•   Scene model
•   Applications

Computing dictionary of visual words

1. For every one of the I training images, select a
   set of J_i spatial locations.
     •   Interest points
     •   Regular grid
2. Compute a descriptor at each spatial location in
   each image
3. Cluster all of these descriptor vectors into K
   groups using a method such as the K-Means
   algorithm
4. The means of the K clusters are used as the K
   prototype vectors in the dictionary.
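
The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not the author's implementation: the function name `build_dictionary`, the fixed iteration count, and the toy 2-D "descriptors" are assumptions, and plain K-means stands in for whatever clustering method is actually used.

```python
import numpy as np

def build_dictionary(descriptors, K, n_iters=20, seed=0):
    """Cluster descriptors (N x D array) into K groups with plain K-means.

    Returns the K cluster means, used as the prototype vectors (the dictionary).
    """
    rng = np.random.default_rng(seed)
    # Initialise the means with K distinct descriptors chosen at random.
    idx = rng.choice(len(descriptors), size=K, replace=False)
    means = descriptors[idx].astype(float)
    for _ in range(n_iters):
        # Squared Euclidean distance from every descriptor to every mean (N x K).
        dists = ((descriptors[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for k in range(K):
            members = descriptors[labels == k]
            if len(members) > 0:  # an empty cluster keeps its previous mean
                means[k] = members.mean(axis=0)
    return means

# Toy stand-in for descriptors pooled from all I training images.
rng = np.random.default_rng(1)
descriptors = np.vstack([rng.normal(c, 0.1, size=(50, 2)) for c in (0.0, 5.0, 10.0)])
dictionary = build_dictionary(descriptors, K=3)
print(dictionary.shape)  # (3, 2)
```

In practice the descriptors would be high-dimensional and K much larger; the structure of the computation stays the same.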
Encoding images as visual words
1. Select a set of J spatial locations in the image using the same
   method as for the dictionary
2. Compute the descriptor at each of the J spatial locations.
3. Compare each descriptor to the set of K prototype
   descriptors in the dictionary
4. Assign a discrete index to this location that corresponds to
   the index of the closest word in the dictionary.

End result: each of the J locations is represented by a discrete feature index together with its x and y position.
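
As a rough sketch of the encoding step, each descriptor can be assigned the index of its nearest prototype by Euclidean distance. The function name `encode_image`, the choice of Euclidean distance, and the toy arrays are assumptions made for illustration.

```python
import numpy as np

def encode_image(descriptors, positions, dictionary):
    """Return one (word_index, x, y) triple per spatial location."""
    # Squared distance from each of the J descriptors to each of the K prototypes.
    dists = ((descriptors[:, None, :] - dictionary[None, :, :]) ** 2).sum(axis=2)
    words = dists.argmin(axis=1)  # index of the closest word in the dictionary
    return [(int(f), float(x), float(y)) for f, (x, y) in zip(words, positions)]

# Hypothetical dictionary with K = 3 prototypes and an image with J = 3 locations.
dictionary = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]])
descriptors = np.array([[0.1, -0.1], [3.8, 0.2], [0.9, 1.2]])
positions = np.array([[10.0, 20.0], [30.0, 40.0], [50.0, 60.0]])
encoded = encode_image(descriptors, positions, dictionary)
print(encoded)  # [(0, 10.0, 20.0), (2, 30.0, 40.0), (1, 50.0, 60.0)]
```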
Bag of words model
Key idea:

• Abandon all spatial information
• Just represent image by relative frequency
  (histogram) of words from dictionary
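
A minimal sketch of this representation, assuming the image has already been encoded as an array of word indices (the function name `bag_of_words` is hypothetical):

```python
import numpy as np

def bag_of_words(word_indices, K):
    """Relative frequency of each of the K dictionary words; positions are ignored."""
    counts = np.bincount(word_indices, minlength=K)
    return counts / counts.sum()

# Five encoded locations in one image, dictionary of K = 4 words.
hist = bag_of_words(np.array([0, 2, 2, 1, 2]), K=4)
print(hist.tolist())  # [0.2, 0.2, 0.6, 0.0]
```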





Bag of words




Structure
Learning (MAP solution):




Inference:
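
The equations for this slide did not survive extraction, so the following is only a sketch of how learning and inference could look under stated assumptions: one categorical distribution over words per class, a symmetric Dirichlet prior with parameter `alpha` (which gives the MAP estimate below), and Bayes' rule for inference. The names `learn_map` and `infer_class` are hypothetical.

```python
import numpy as np

def learn_map(counts_per_class, alpha=2.0):
    """MAP categorical parameters for each class under a symmetric Dirichlet prior.

    counts_per_class: N x K array, total count of each word over the training
    images of each class. Assumes alpha > 1 so every probability is positive.
    """
    pseudo = counts_per_class + alpha - 1.0
    return pseudo / pseudo.sum(axis=1, keepdims=True)

def infer_class(word_counts, lam, class_prior):
    """Posterior over classes for one image given its word counts (length K)."""
    log_post = np.log(class_prior) + word_counts @ np.log(lam).T
    log_post -= log_post.max()  # subtract the max for numerical stability
    post = np.exp(log_post)
    return post / post.sum()

# Toy data: two classes, three words.
train_counts = np.array([[8.0, 1.0, 1.0], [1.0, 1.0, 8.0]])
lam = learn_map(train_counts)
post = infer_class(np.array([5.0, 0.0, 1.0]), lam, class_prior=np.array([0.5, 0.5]))
print(post.argmax())  # 0
```

Working in log space avoids underflow when an image contains many words.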




Bag of words for object recognition




Problems with bag of words




Latent Dirichlet allocation
• Describes relative frequency of visual words in a
  single image (no world term)
• Words not generated independently (connected by
  hidden variable)
• Analogy to text documents
   – Each image contains mixture of several topics (parts)
   – Each topic induces a distribution over words
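
The generative story in the bullets above can be illustrated by sampling the words of a single image. This is a sketch under assumptions: the function name `generate_image_words` is hypothetical, the per-topic word distributions `lam` are taken as given rather than drawn from a prior, and `alpha` is the Dirichlet parameter over topic proportions.

```python
import numpy as np

def generate_image_words(J, alpha, lam, rng):
    """Sample J visual words for one image from an LDA-style model.

    alpha: length-M Dirichlet parameter over topic (part) proportions.
    lam:   M x K array, row m is the word distribution induced by topic m.
    """
    pi = rng.dirichlet(alpha)                     # this image's mixture of topics
    parts = rng.choice(len(alpha), size=J, p=pi)  # hidden topic for each word
    words = np.array([rng.choice(lam.shape[1], p=lam[m]) for m in parts])
    return pi, parts, words

rng = np.random.default_rng(0)
alpha = np.array([1.0, 1.0])
lam = np.array([[0.7, 0.3, 0.0],   # topic 0 favours word 0
                [0.0, 0.3, 0.7]])  # topic 1 favours word 2
pi, parts, words = generate_image_words(J=10, alpha=alpha, lam=lam, rng=rng)
print(pi, parts, words)
```

The words are tied together only through the shared proportions `pi`, which is why they are not independent once `pi` is unknown.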




Latent Dirichlet allocation




Latent Dirichlet allocation
Generative equations




Marginal distribution over features




Conjugate priors over parameters



Latent Dirichlet allocation




Learning LDA model
• Part labels p are hidden variables
• If we knew them then it would be easy to estimate the
  parameters




• How about the EM algorithm? Unfortunately, the parts within an
  image are not independent

Latent Dirichlet allocation




Learning
Strategy:

1. Write an expression for posterior distribution
   over part labels
2. Draw samples from posterior using MCMC
3. Use samples to estimate parameters




1. Posterior over part labels

The two terms in the numerator can be computed in closed form, but the denominator is intractable.




2. Draw samples from posterior
Gibbs sampling: fix all part labels except one and sample
that label from its conditional distribution




This can be computed in closed form
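
The conditional on this slide was lost in extraction. As an illustration only, the sketch below uses the standard collapsed Gibbs conditional for LDA with symmetric Dirichlet parameters `alpha` and `beta`; that choice, the function name `gibbs_sweep`, and the flat array layout are assumptions rather than the slide's own notation.

```python
import numpy as np

def gibbs_sweep(words, images, parts, M, K, alpha, beta, rng):
    """One Gibbs sweep: resample every part label given all the others.

    words, images, parts are flat integer arrays with one entry per word:
    the word index, the image it belongs to, and its current part label.
    """
    I = images.max() + 1
    n_im = np.zeros((I, M))  # how often part m is used in image i
    n_mk = np.zeros((M, K))  # how often word k is assigned to part m
    np.add.at(n_im, (images, parts), 1)
    np.add.at(n_mk, (parts, words), 1)
    for t in range(len(words)):
        i, k, m = images[t], words[t], parts[t]
        # Remove the current label so the counts describe "all labels except one".
        n_im[i, m] -= 1
        n_mk[m, k] -= 1
        cond = (n_im[i] + alpha) * (n_mk[:, k] + beta) / (n_mk.sum(axis=1) + K * beta)
        cond /= cond.sum()
        m = rng.choice(M, p=cond)  # sample the label from its conditional
        parts[t] = m
        n_im[i, m] += 1
        n_mk[m, k] += 1
    return parts

rng = np.random.default_rng(0)
words = np.array([0, 0, 1, 1, 2, 2, 3, 3])
images = np.array([0, 0, 0, 0, 1, 1, 1, 1])
M, K = 2, 4
parts = rng.integers(0, M, size=len(words))  # random initial part labels
for _ in range(50):
    parts = gibbs_sweep(words, images, parts, M, K, alpha=1.0, beta=0.1, rng=rng)
print(parts)
```

Each conditional only needs count tables, which is what makes the per-label update cheap even though the full posterior is intractable.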




3. Use samples to estimate parameters

Samples substitute for the real part labels in the update
equations
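
Since the update equations themselves are missing here, the following is a hedged sketch of what substituting a sample could look like: the sampled part labels are counted and smoothed with Dirichlet pseudo-counts `alpha` and `beta`. The function name `estimate_parameters` and the symmetric priors are assumptions.

```python
import numpy as np

def estimate_parameters(words, images, parts, I, M, K, alpha, beta):
    """Estimate part proportions per image and word distributions per part
    by treating one sample of part labels as if it were observed."""
    n_im = np.zeros((I, M))
    n_mk = np.zeros((M, K))
    np.add.at(n_im, (images, parts), 1)  # part usage in each image
    np.add.at(n_mk, (parts, words), 1)   # word usage in each part
    pi = (n_im + alpha) / (n_im + alpha).sum(axis=1, keepdims=True)
    lam = (n_mk + beta) / (n_mk + beta).sum(axis=1, keepdims=True)
    return pi, lam

# One hypothetical sample of part labels for four words in two images.
words = np.array([0, 1, 0, 2])
images = np.array([0, 0, 1, 1])
parts = np.array([0, 1, 0, 1])
pi, lam = estimate_parameters(words, images, parts, I=2, M=2, K=3, alpha=1.0, beta=1.0)
print(pi.tolist())   # [[0.5, 0.5], [0.5, 0.5]]
print(lam.tolist())  # [[0.6, 0.2, 0.2], [0.2, 0.4, 0.4]]
```

With several samples, the same counts can simply be accumulated across samples before normalising.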




Single author-topic model




Single author-topic model




Learning
1. Posterior over part labels



Likelihood same as before, prior becomes




Learning
2. Draw samples from posterior




3. Use samples to estimate parameters




Inference
Likelihood that words in this image are due to
category n




Compute posterior over categories
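
A sketch of both inference steps, under the assumption that each category n has its own part probabilities (rows of `pi`) while the per-part word distributions `lam` are shared, and that the hidden part of each word is marginalised out. The function name `category_posterior` and the toy numbers are hypothetical.

```python
import numpy as np

def category_posterior(words, pi, lam, prior):
    """Posterior over categories for one image.

    words: word indices observed in the image.
    pi:    N x M part probabilities, one row per category.
    lam:   M x K word probabilities, one row per part.
    prior: length-N prior over categories.
    """
    word_probs = pi @ lam  # N x K: probability of each word under each category
    log_lik = np.log(word_probs[:, words]).sum(axis=1)  # log-likelihood per category
    log_post = np.log(prior) + log_lik
    log_post -= log_post.max()  # subtract the max for numerical stability
    post = np.exp(log_post)
    return post / post.sum()

pi = np.array([[0.9, 0.1], [0.1, 0.9]])
lam = np.array([[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]])
post = category_posterior(np.array([0, 0, 2, 0]), pi, lam, prior=np.array([0.5, 0.5]))
print(post.argmax())  # 0
```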




Problems with bag of words




Constellation model




Constellation model




Learning
1. Posterior over part labels



Prior same as before, likelihood becomes




Learning
2. Draw samples from posterior




3. Use samples to estimate parameters



 Part and word probabilities as before
Inference
Likelihood that words in this image are due to
category n




Compute posterior over categories




Learning




Problems with bag of words




Scene model




Scene model




Video Google




Action recognition




Spatio-temporal bag of words model: 91.8% classification


Action recognition





Developing An App To Navigate The Roads of BrazilV3cube
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 

Último (20)

Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Handwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed textsHandwritten Text Recognition for manuscripts and early printed texts
Handwritten Text Recognition for manuscripts and early printed texts
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Developing An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of BrazilDeveloping An App To Navigate The Roads of Brazil
Developing An App To Navigate The Roads of Brazil
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 

20 cv mil_models_for_words

  • 1. Computer vision: models, learning and inference. Chapter 20: Models for visual words. Please send errata to s.prince@cs.ucl.ac.uk
  • 2. Visual words
       • Most models treat data as continuous
       • Likelihood based on normal distribution
       • Visual words = discrete representation of image
       • Likelihood based on categorical distribution
       • Useful for difficult tasks such as scene recognition and object recognition
  • 3. Motivation: scene recognition
  • 4. Structure
       • Computing visual words
       • Bag of words model
       • Latent Dirichlet allocation
       • Single author-topic model
       • Constellation model
       • Scene model
       • Applications
  • 5. Computing dictionary of visual words
       1. For every one of the I training images, select a set of J_i spatial locations.
            • Interest points
            • Regular grid
       2. Compute a descriptor at each spatial location in each image.
       3. Cluster all of these descriptor vectors into K groups using a method such as the K-means algorithm.
       4. The means of the K clusters are used as the K prototype vectors in the dictionary.
  • 6. Encoding images as visual words
       1. Select a set of J spatial locations in the image using the same method as for the dictionary.
       2. Compute the descriptor at each of the J spatial locations.
       3. Compare each descriptor to the set of K prototype descriptors in the dictionary.
       4. Assign a discrete index to this location that corresponds to the index of the closest word in the dictionary.
       End result: discrete feature index, x and y position
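
The dictionary and encoding steps on slides 5 and 6 can be sketched in a few lines of standard-library Python. This is a minimal illustration rather than the author's implementation: the function names (`build_dictionary`, `encode_image`), the plain K-means loop and the 2-D toy "descriptors" are all assumptions made for the example, and a real system would use proper image descriptors computed at interest points or on a grid.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(descriptor, prototypes):
    """Index of the prototype (visual word) closest to a descriptor."""
    return min(range(len(prototypes)),
               key=lambda k: sq_dist(descriptor, prototypes[k]))

def build_dictionary(descriptors, K, iters=20, seed=0):
    """Plain K-means over all training descriptors; the K means are the dictionary."""
    rng = random.Random(seed)
    prototypes = [list(d) for d in rng.sample(descriptors, K)]
    for _ in range(iters):
        clusters = [[] for _ in range(K)]
        for d in descriptors:
            clusters[nearest(d, prototypes)].append(d)
        for k, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous prototype
                dim = len(members[0])
                prototypes[k] = [sum(m[i] for m in members) / len(members)
                                 for i in range(dim)]
    return prototypes

def encode_image(locations, descriptors, prototypes):
    """Turn each (x, y) location and its descriptor into (word index, x, y)."""
    return [(nearest(d, prototypes), x, y)
            for (x, y), d in zip(locations, descriptors)]

# Toy data (hypothetical): 2-D descriptors drawn around two separated centres.
rng = random.Random(1)
train = ([[rng.gauss(0, 0.1), rng.gauss(0, 0.1)] for _ in range(50)] +
         [[rng.gauss(5, 0.1), rng.gauss(5, 0.1)] for _ in range(50)])
dictionary = build_dictionary(train, K=2)
words = encode_image([(10, 20), (30, 40)], [[0.05, -0.02], [4.9, 5.1]], dictionary)
```

Each entry of `words` pairs a discrete word index with the location it came from, which is the "end result" representation the later models consume.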
  • 7. Structure
       • Computing visual words
       • Bag of words model
       • Latent Dirichlet allocation
       • Single author-topic model
       • Constellation model
       • Scene model
       • Applications
  • 8. Bag of words model
       Key idea:
       • Abandon all spatial information
       • Just represent image by relative frequency (histogram) of words from dictionary
       where
  • 9. Bag of words
  • 10. Structure
       Learning (MAP solution):
       Inference:
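
As a rough sketch of the learning and inference steps for the bag of words model, the code below assumes that each class n has its own categorical distribution over the K words, with a symmetric Dirichlet prior of parameter `alpha` on its parameters. Under that assumption the MAP estimate is (count_k + alpha - 1) / (total + K(alpha - 1)), and inference is Bayes' rule over classes. The names `learn_bow` and `classify_bow`, the class labels and the toy images are hypothetical.

```python
import math
from collections import Counter

def learn_bow(images_by_class, K, alpha=2.0):
    """MAP word probabilities per class under a symmetric Dirichlet(alpha) prior.

    images_by_class maps a class label to a list of images, where each image
    is a list of word indices in 0..K-1. alpha > 1 keeps every probability
    strictly positive, so the log in classify_bow is always defined.
    """
    params = {}
    for label, images in images_by_class.items():
        counts = Counter(w for img in images for w in img)
        total = sum(counts.values())
        params[label] = [(counts[k] + alpha - 1) / (total + K * (alpha - 1))
                         for k in range(K)]
    return params

def classify_bow(image, params, prior=None):
    """Posterior over classes for one image, computed in log space for stability."""
    labels = list(params)
    prior = prior or {n: 1.0 / len(labels) for n in labels}
    log_post = {n: math.log(prior[n]) + sum(math.log(params[n][w]) for w in image)
                for n in labels}
    top = max(log_post.values())
    z = sum(math.exp(v - top) for v in log_post.values())
    return {n: math.exp(v - top) / z for n, v in log_post.items()}

# Toy training set (hypothetical): two scene classes over a 4-word dictionary.
train = {"beach": [[0, 0, 1], [0, 1, 0]], "office": [[2, 2, 3], [3, 2, 2]]}
params = learn_bow(train, K=4)
posterior = classify_bow([0, 1, 0], params)
```

Because spatial positions are discarded, only the word counts of the test image influence `posterior`.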
  • 11. Bag of words for object recognition
  • 12. Problems with bag of words
  • 13. Structure
       • Computing visual words
       • Bag of words model
       • Latent Dirichlet allocation
       • Single author-topic model
       • Constellation model
       • Scene model
       • Applications
  • 14. Latent Dirichlet allocation
       • Describes relative frequency of visual words in a single image (no world term)
       • Words not generated independently (connected by hidden variable)
       • Analogy to text documents
            – Each image contains mixture of several topics (parts)
            – Each topic induces a distribution over words
  • 15. Latent Dirichlet allocation
  • 16. Latent Dirichlet allocation
       • Generative equations
       • Marginal distribution over features
       • Conjugate priors over parameters
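
To make the generative story concrete, here is a small ancestral-sampling sketch. It assumes the usual LDA structure: each image draws its own part (topic) probabilities from a Dirichlet prior, each part has word probabilities drawn from another Dirichlet prior, and every word is generated by first picking a part and then picking a word from that part. The symmetric priors `alpha` and `beta`, and the names `generate_lda`, `sample_dirichlet` and `sample_categorical`, are assumptions for illustration.

```python
import random

def sample_dirichlet(alphas, rng):
    """Draw from a Dirichlet by normalising independent gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def sample_categorical(probs, rng):
    """Draw one index from a categorical distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

def generate_lda(I, J, M, K, alpha=1.0, beta=1.0, seed=0):
    """Generate I images of J words each from M parts over a K-word dictionary."""
    rng = random.Random(seed)
    # Word probabilities for each part, shared by all images.
    lam = [sample_dirichlet([beta] * K, rng) for _ in range(M)]
    images = []
    for _ in range(I):
        # Part probabilities specific to this image.
        pi = sample_dirichlet([alpha] * M, rng)
        parts = [sample_categorical(pi, rng) for _ in range(J)]
        words = [sample_categorical(lam[p], rng) for p in parts]
        images.append((parts, words))
    return lam, images

lam, images = generate_lda(I=3, J=8, M=2, K=5)
```

The words in one image share the same `pi`, which is why they are not independent once `pi` is marginalised out.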
  • 17. Latent Dirichlet allocation
  • 18. Learning LDA model
       • Part labels p are hidden variables
       • If we knew them, it would be easy to estimate the parameters
       • How about the EM algorithm? Unfortunately, parts within an image are not independent
  • 19. Latent Dirichlet allocation
  • 20. Learning
       Strategy:
       1. Write an expression for the posterior distribution over part labels
       2. Draw samples from the posterior using MCMC
       3. Use samples to estimate parameters
  • 21. 1. Posterior over part labels
       • Denominator intractable
       • Can compute the two terms in the numerator in closed form
  • 22. 2. Draw samples from posterior
       • Gibbs sampling: fix all part labels except one and sample from the conditional distribution
       • This can be computed in closed form
  • 23. 3. Use samples to estimate parameters
       • Samples substitute in for real part labels in update equations
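
The three-step strategy can be sketched with a collapsed Gibbs sampler of the kind commonly used for LDA in the topic-model literature. This is an assumption-laden illustration, not the deck's exact update equations: it uses symmetric Dirichlet priors `alpha` and `beta`, the conditional (n_im + alpha)(n_mk + beta) / (n_m + K beta) with the current word's label removed from the counts, and a single final sample with smoothed count ratios for the parameter estimates. The function name `gibbs_lda` and the toy images are hypothetical.

```python
import random

def gibbs_lda(images, M, K, alpha=1.0, beta=1.0, sweeps=200, seed=0):
    """images: list of lists of word indices. Returns (pi, lam) estimates."""
    rng = random.Random(seed)
    # Random initial part label for every word, plus the counts they imply.
    labels = [[rng.randrange(M) for _ in img] for img in images]
    n_im = [[0] * M for _ in images]    # words in image i assigned to part m
    n_mk = [[0] * K for _ in range(M)]  # times word k is assigned to part m
    n_m = [0] * M                       # total words assigned to part m
    for i, img in enumerate(images):
        for j, k in enumerate(img):
            m = labels[i][j]
            n_im[i][m] += 1
            n_mk[m][k] += 1
            n_m[m] += 1
    for _ in range(sweeps):
        for i, img in enumerate(images):
            for j, k in enumerate(img):
                # Fix all labels except this one: remove it from the counts.
                m = labels[i][j]
                n_im[i][m] -= 1
                n_mk[m][k] -= 1
                n_m[m] -= 1
                # Closed-form conditional over this part label (unnormalised).
                weights = [(n_im[i][q] + alpha) * (n_mk[q][k] + beta)
                           / (n_m[q] + K * beta) for q in range(M)]
                m = rng.choices(range(M), weights=weights, k=1)[0]
                labels[i][j] = m
                n_im[i][m] += 1
                n_mk[m][k] += 1
                n_m[m] += 1
    # The final sample stands in for the true part labels in the estimates.
    pi = [[(n_im[i][m] + alpha) / (len(img) + M * alpha) for m in range(M)]
          for i, img in enumerate(images)]
    lam = [[(n_mk[m][k] + beta) / (n_m[m] + K * beta) for k in range(K)]
           for m in range(M)]
    return pi, lam

# Toy corpus (hypothetical): two images use words {0, 1}, two use words {2, 3}.
toy = [[0, 1, 0, 1, 0, 0], [2, 3, 3, 2, 3, 2], [0, 0, 1, 1, 0, 1], [3, 2, 2, 3, 3, 3]]
pi_hat, lam_hat = gibbs_lda(toy, M=2, K=4)
```

In practice one would usually discard early sweeps as burn-in and average over several samples rather than rely on the last one.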
  • 24. Structure
       • Computing visual words
       • Bag of words model
       • Latent Dirichlet allocation
       • Single author-topic model
       • Constellation model
       • Scene model
       • Applications
  • 25. Single author-topic model
  • 26. Single author-topic model
  • 27. Learning
       1. Posterior over part labels
       • Likelihood same as before, prior becomes
  • 28. Learning
       2. Draw samples from posterior
       3. Use samples to estimate parameters
  • 29. Inference
       • Likelihood that words in this image are due to category n
       • Compute posterior over categories
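
A minimal sketch of this inference step, assuming learned parameters are held fixed: if category n has part probabilities `pi[n]` and part m has word probabilities `lam[m]`, then each word's likelihood under category n is a sum over parts, the image likelihood is the product over words, and Bayes' rule gives the posterior over categories. The function name `category_posterior` and all numbers below are hypothetical.

```python
import math

def category_posterior(words, pi, lam, prior):
    """Posterior over categories for one image of visual words.

    pi[n][m]: probability of part m given category n.
    lam[m][k]: probability of word k given part m.
    prior[n]: prior probability of category n.
    """
    log_joint = []
    for n in range(len(pi)):
        # Marginalise the hidden part label separately for every word.
        log_like = sum(
            math.log(sum(pi[n][m] * lam[m][k] for m in range(len(lam))))
            for k in words)
        log_joint.append(log_like + math.log(prior[n]))
    top = max(log_joint)
    unnorm = [math.exp(v - top) for v in log_joint]
    z = sum(unnorm)
    return [u / z for u in unnorm]

# Toy parameters (hypothetical): 2 categories, 2 parts, 3 words.
pi = [[0.9, 0.1], [0.1, 0.9]]
lam = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
post = category_posterior([0, 0, 1, 0], pi, lam, prior=[0.5, 0.5])
```

With these numbers the image is dominated by word 0, which part 0 favours, so most of the posterior mass falls on category 0.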
  • 30. Structure
       • Computing visual words
       • Bag of words model
       • Latent Dirichlet allocation
       • Single author-topic model
       • Constellation model
       • Scene model
       • Applications
  • 31. Problems with bag of words
  • 32. Constellation model
  • 33. Constellation model
  • 34. Learning
       1. Posterior over part labels
       • Prior same as before, likelihood becomes
  • 35. Learning
       2. Draw samples from posterior
       3. Use samples to estimate parameters
       • Part and word probabilities as before
  • 36. Inference
       • Likelihood that words in this image are due to category n
       • Compute posterior over categories
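
The constellation model brings position back in, so the likelihood now covers both the word and where it was found. The sketch below assumes, for illustration only, that each part m has its own 2-D normal distribution over position with mean `means[m]` and covariance `covs[m]`, and that the part and word probabilities are as in the single author-topic model. The names `log_norm2d` and `constellation_posterior` and the toy parameters are hypothetical.

```python
import math

def log_norm2d(x, mean, cov):
    """Log density of a 2-D normal with covariance [[a, b], [b, c]]."""
    (a, b), (_, c) = cov
    det = a * c - b * b
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    maha = (c * dx * dx - 2 * b * dx * dy + a * dy * dy) / det
    return -0.5 * maha - math.log(2 * math.pi) - 0.5 * math.log(det)

def log_sum_exp(values):
    """Numerically stable log of a sum of exponentials."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))

def constellation_posterior(observations, pi, lam, means, covs, prior):
    """observations: list of (word index, (x, y)). Returns posterior over categories."""
    log_joint = []
    for n in range(len(pi)):
        total = math.log(prior[n])
        for k, xy in observations:
            # Sum over parts of part prob * word prob * position density.
            total += log_sum_exp([
                math.log(pi[n][m]) + math.log(lam[m][k])
                + log_norm2d(xy, means[m], covs[m])
                for m in range(len(lam))])
        log_joint.append(total)
    norm = log_sum_exp(log_joint)
    return [math.exp(v - norm) for v in log_joint]

# Toy parameters (hypothetical): part 0 sits near (0, 0), part 1 near (10, 10).
pi = [[0.8, 0.2], [0.2, 0.8]]
lam = [[0.9, 0.1], [0.1, 0.9]]
means = [(0.0, 0.0), (10.0, 10.0)]
covs = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
obs = [(0, (0.5, -0.2)), (0, (0.1, 0.3)), (1, (9.5, 10.2))]
post = constellation_posterior(obs, pi, lam, means, covs, prior=[0.5, 0.5])
```

Two of the three observations match part 0 in both word and position, so the posterior leans towards category 0, which favours that part.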
  • 37. Learning
  • 38. Structure
       • Computing visual words
       • Bag of words model
       • Latent Dirichlet allocation
       • Single author-topic model
       • Constellation model
       • Scene model
       • Applications
  • 39. Problems with bag of words
  • 40. Scene model
  • 41. Scene model
  • 42. Structure
       • Computing visual words
       • Bag of words model
       • Latent Dirichlet allocation
       • Single author-topic model
       • Constellation model
       • Scene model
       • Applications
  • 43. Video Google
  • 44. Action recognition
       • Spatio-temporal bag of words model
       • 91.8% classification
  • 45. Action recognition