Beyond the Euclidean distance: Creating effective visual codebooks using the histogram intersection kernel. Authors: Jianxin Wu and James Rehg (Georgia Institute of Technology). Presenter: Shao-Chuan Wang.
Beyond the Euclidean distance. Key ideas: use the histogram intersection kernel (HIK) to create the visual codebook, because most descriptors are histogram-based features; kernel k-means (using HIK); one-class SVM (using HIK). Conclusions: one-class SVM with HIK performs best; k-median is a compromise (comparable with HIK k-means).
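As a reminder of what the HIK computes, here is a minimal sketch in plain Python. The function name `hik` and the toy histograms are illustrative assumptions, not taken from the slides; the definition itself (sum of element-wise minima) is the standard one.

```python
def hik(x, y):
    """Histogram intersection kernel: sum of element-wise minima."""
    if len(x) != len(y):
        raise ValueError("histograms must have the same length")
    return sum(min(a, b) for a, b in zip(x, y))


# Two toy 4-bin histograms (made-up values).
h1 = [3, 0, 2, 5]
h2 = [1, 4, 2, 2]

print(hik(h1, h2))  # 1 + 0 + 2 + 2 = 5
print(hik(h1, h1))  # a histogram intersected with itself gives its total mass: 10
```

Unlike the Euclidean distance, the kernel only rewards mass that both histograms share in the same bin, which is why it suits histogram-based descriptors.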
Background: Bag of Visual Words. Codebook construction (find D): clustering-based, such as k-means. Assignment of descriptors to visual words (find \alpha). Pooling (sum pooling to construct histograms). ← focus of this paper. Voronoi diagram (figure). Subject to some constraints.
Kernel K-means (1/2). At iteration t: find the nearest centroid among the K centroids, then update the centroids by averaging the newly assigned atoms.
Kernel K-means (2/2): equation (1).
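In kernel k-means the centroids live in the feature space and are never formed explicitly, so the nearest-centroid test is written entirely in terms of kernel evaluations. The standard expansion is sketched below and is presumably what (1) refers to; the symbols \phi (feature map), \kappa (kernel), \pi_i (members of cluster i), and m_i (implicit centroid) are assumed notation, not copied from the slides.

```latex
m_i = \frac{1}{|\pi_i|} \sum_{x_j \in \pi_i} \phi(x_j),
\qquad
\|\phi(x) - m_i\|^2
  = \kappa(x, x)
  - \frac{2}{|\pi_i|} \sum_{x_j \in \pi_i} \kappa(x, x_j)
  + \frac{1}{|\pi_i|^2} \sum_{x_j, x_k \in \pi_i} \kappa(x_j, x_k)
```

The first term does not depend on the cluster and the last term does not depend on x, so the cost of each assignment is dominated by the middle sum over cluster members.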
Contribution 1: fast evaluation of HIK. Based on (Maji et al. 2008): after transforming R^d_+ into N^d, the evaluation of (1) can be reduced to O(d) by pre-computing a lookup table.
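The lookup-table idea can be sketched as follows, under the assumption that descriptors have already been quantized to non-negative integers bounded by some `max_value`. The helper names `build_table` and `kernel_sum` are hypothetical, and this is an illustration of the trick rather than the authors' implementation.

```python
def build_table(cluster, max_value):
    """table[j][v] = sum over cluster members h of min(h[j], v)."""
    d = len(cluster[0])
    table = []
    for j in range(d):
        column = [h[j] for h in cluster]
        table.append([sum(min(c, v) for c in column)
                      for v in range(max_value + 1)])
    return table


def kernel_sum(table, x):
    """Sum of HIK(x, h) over all cluster members, using d table lookups."""
    return sum(table[j][v] for j, v in enumerate(x))


# Toy cluster of three 3-bin integer histograms (made-up values).
cluster = [[3, 0, 2], [1, 4, 2], [0, 1, 5]]
table = build_table(cluster, max_value=5)

x = [2, 3, 1]
fast = kernel_sum(table, x)
# Brute-force reference: one HIK evaluation per cluster member.
slow = sum(sum(min(a, b) for a, b in zip(x, h)) for h in cluster)
print(fast, slow)  # 10 10
```

The table is built once per cluster and iteration; afterwards each query costs d lookups regardless of how many descriptors the cluster holds.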
Contribution 2: Encoding via one-class SVM. Example one-class SVMs in 2D using a Gaussian kernel (two panels): gamma = 0.01, C = 2000; gamma = 0.1, C = 2000.
Contribution 2: Encoding via one-class SVM. Use kernel k-means (with HIK) to create a codebook of size K. Train one one-class SVM per cluster (K in total). Assign each descriptor to the word whose SVM gives the maximum response among the K SVMs. (In the response formula, the coefficients are the Lagrangian multipliers.)
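The assignment rule can be illustrated with a small sketch. It assumes each one-class SVM is already trained and is summarized by its support vectors, their multipliers, and an offset `rho`, with the usual decision function (weighted sum of kernel values minus the offset). The model values below are invented for illustration, and `response` and `encode` are hypothetical helper names.

```python
def hik(x, y):
    """Histogram intersection kernel."""
    return sum(min(a, b) for a, b in zip(x, y))


def response(x, support_vectors, alphas, rho):
    """Decision value of one one-class SVM: sum_i alpha_i * K(x, s_i) - rho."""
    return sum(a * hik(x, s) for a, s in zip(alphas, support_vectors)) - rho


def encode(x, models):
    """Index of the visual word whose SVM responds most strongly to x."""
    scores = [response(x, *m) for m in models]
    return max(range(len(models)), key=scores.__getitem__)


# Two made-up "trained" models: (support vectors, multipliers, offset).
models = [
    ([[4, 0, 0], [3, 1, 0]], [0.5, 0.5], 1.0),  # word 0: mass in the first bin
    ([[0, 0, 4], [0, 1, 3]], [0.5, 0.5], 1.0),  # word 1: mass in the last bin
]

print(encode([3, 0, 1], models))  # 0
print(encode([0, 0, 4], models))  # 1
```

Compared with nearest-centroid assignment, each word here is described by the region its SVM encloses rather than by a single mean.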
Contribution 3: Comparison with a k-median codebook. K-median clustering: find the nearest centroid using the L1 distance; update each centroid by taking the median of its newly assigned atoms. The median is the minimizer of the following optimization problem: min over m of sum_i ||x_i - m||_1.
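A quick numerical check of why the median is the natural centroid under the L1 distance: the coordinate-wise median gives a total L1 cost no larger than that of the mean. The point set and the helper names are illustrative assumptions.

```python
from statistics import mean, median


def l1(x, y):
    """L1 (city-block) distance between two vectors."""
    return sum(abs(a - b) for a, b in zip(x, y))


def total_cost(points, centre):
    """Sum of L1 distances from every point to the centre."""
    return sum(l1(p, centre) for p in points)


def median_centroid(points):
    """Coordinate-wise median, the k-median update for one cluster."""
    return [median(col) for col in zip(*points)]


points = [[1, 10], [2, 0], [9, 2]]           # made-up cluster members
med = median_centroid(points)                # [2, 2]
avg = [mean(col) for col in zip(*points)]    # [4, 4]

print(med, total_cost(points, med))  # [2, 2] 18
print(avg, total_cost(points, avg))  # [4, 4] 22
```

The L1 cost separates across dimensions, which is why taking the median independently in each coordinate is enough.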
Some engineering details: pyramid overlapping pooling strategy; 31 subwindows => a 31K-dimensional vector (31 × K, where K is the codebook size).
Some engineering details: concatenation of the Sobel image (pictures from Wikipedia) => 31K × 2 = 62K-dimensional image representation.
Some engineering details: SIFT for Caltech 101, CENTRIST for the other datasets; codebook size K = 200; pyramid levels L = 0, 1, 2; one-vs-one SVM for the smaller datasets, BSVM for Caltech 101; random splitting is repeated 5 times.
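With K = 200, the "31K" and "62K" figures from the engineering details work out as below. The per-level split of the 31 subwindows (regular grid cells plus shifted, overlapping cells) is an assumption that matches the stated total; the slides only give the total.

```python
K = 200  # codebook size stated in the slides

# Assumed layout of the overlapping pyramid: (regular cells, shifted cells).
levels = {
    0: (1, 0),    # whole image
    1: (4, 1),    # 2x2 grid plus 1 shifted cell
    2: (16, 9),   # 4x4 grid plus 3x3 shifted cells
}
subwindows = sum(regular + shifted for regular, shifted in levels.values())

dims = subwindows * K       # one K-bin histogram per subwindow
dims_with_sobel = 2 * dims  # original image plus its Sobel image

print(subwindows, dims, dims_with_sobel)  # 31 6200 12400
```

So "31K" means 31 × K, not 31,000: each image is represented by 6,200 numbers, or 12,400 once the Sobel image is concatenated.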
Results: Caltech 101. Legend: B / not B: with or without concatenation of the Sobel image; s: grid step size of dense SIFT extraction; oc_{svm}: one-class SVM encoding; k_{HI}: using the histogram intersection kernel.
Results: Scene 15. Legend: B / not B: with or without concatenation of the Sobel image; s: grid step size of dense SIFT extraction; oc_{svm}: one-class SVM encoding; k_{HI}: using the histogram intersection kernel.
Conclusions: the HIK visual codebook improves classification accuracy; k-median is a compromise between k-means and HIK; one-class SVM encoding helps build a more compact representation; open question: is a smaller step size better?
