SlideShare una empresa de Scribd logo
1 de 49
Visual Attention: Detecting Saliency on Images Vicente Ordonez Department of Computer Science State University of New York Stony Brook, NY 11790
I will be working mainly on the following paper Learning to Detect a Salient Object. T. Liu, J. Sun, N. Zheng, X. Tang, H. Shum. (Xian Jiaotong University and Microsoft Research Asia) from CVPR 2007.  http://research.microsoft.com/en-us/um/people/jiansun/papers/SalientDetection_CVPR07.pdf
What is Saliency? What is Visual Attention? “Everyone knows what attention is...” —William James, 1890
This is a problem of… Arbitrary object detection? Background / Foreground segmentation? Modeling Visual Attention?
The Method Features:  Multiscale Contrast    (Done!) Center surround histogram   (Mostly Done!) (Done!) Color spatial distribution (Done!) Supervised learning using Conditional Random Fields to determine the parameters to combine the features obtained above.  (Done!) [I will use a labeled dataset of 5000 images provided by Microsoft Research Asia!]
Multiscale Contrast Function Generate the Gaussian Pyramid for the input image. For each level in the pyramid  Do gaussian blurring Do resampling I’m using a 6 levels Gaussian pyramid for each RGB channel.
How a Gaussian pyramid looks like Figure from David Forsyth
Generate contrast maps for each level of the Pyramid. Sum all of the results to produce the final multiscale contrast map. The two steps mentioned above are described in this formula: Multiscale Contrast Function
Input image
Contrast maps
Contrast maps Original image Contrast map at level 1 Contrast map at level 4 Contrast map at level 6
Multiscale Contrast Map Output
Center Surround Histogram Feature ,[object Object]
For each possible rectangle with a reasonable size and aspect ratio
Create a surrounding rectangle and calculate the histogram of the rectangle and the surrounding area.
Pick and record the rectangle that maximizes the Chi-Square distance between the two histograms calculated above and also record the Chi-Square distance.,[object Object]
Center Surround Histogram Feature The algorithm as described before is computationally expensive…  It is required to use a technique called Integral Histogram. It allows you fast calculation of the histogram of any given rectangular region of an image. The algorithm was introduced in: “Integral Histogram: A Fast Way to Extract Histograms in Cartesian Spaces” by FatihPorikli, Mitsubishi Electric Research Lab in CVPR 2005.
Center Surround Histogram Feature Use the Chi Square Distances Map and the Map of Most Salient Rectangle Regions per pixel to generate the Center Surround Histogram Feature using the next formula:
Center Surround Histogram Results Using my Implementation        (15.2 sec, size = 245x384) Results Reported in the Paper
Center Surround Histogram Results Using my Implementation        (13.6 sec, size = 247x346) Results Reported in the Paper
Center Surround Histogram Results Using my Implementation        (10.2 sec, size = 248x277)
More Results
More Results
More results
More Results
More Results
More Results
More Results
More Results
More Results
More Results
More Results
Color Spatial Distribution
Color Spatial Distribution Make an initial clustering of the colors in the image using k-means.  Further refine the clusters by using Gaussian Mixture Models. The Gaussian Mixture Model parameters are calculated using the EM algorithm. I am using 5 clusters (5 colors) per image. And the results look similar to those presented in the paper with an execution time of around 17 seconds per image.
Color Spatial Distribution Calculate the vertical variance of the horizontal positions of the pixels for each cluster. And then the same for the vertical positions.  Sum the variances and use this value to weight more those clusters with less spatial variance. Penalize the clusters that contain the majority of its pixels away from the center of the image.
Color Spatial Distribution
Color Spatial Distribution
Color Spatial Distribution
Color Spatial Distribution
Color Spatial Distribution
Color Spatial Distribution
Color Spatial Distribution
Color Spatial Distribution
Combine Features Together
Conditional Random Field Training and Inference Accelerated Training of Conditional Random Fields with Stochastic Meta-Descent S Vishwanathan, N. Schraudolph, M. Schmidt, K. Murphy. ICML'06 (Intl Conf on Machine Learning).  I did the training using this toolbox from the above paper: http://people.cs.ubc.ca/~murphyk/Software/CRF/crf.html
Mask outputs using CRF inference Input                  M-Contrast-map         Center Surr. Hist.       Color Spatial Var. Input                      Combined features                    Ground truth
Mask outputs using CRF inference Input                  M-Contrast-map         Center Surr. Hist.       Color Spatial Var. Input                      Combined features                    Ground truth
Mask outputs using CRF inference Input                  M-Contrast-map         Center Surr. Hist.       Color Spatial Var. Input                 Combined features        Ground truth
Mask outputs using CRF inference Input                  M-Contrast-map         Center Surr. Hist.       Color Spatial Var. Input                 Combined features        Ground truth

Más contenido relacionado

La actualidad más candente

End-to-End Object Detection with Transformers
End-to-End Object Detection with TransformersEnd-to-End Object Detection with Transformers
End-to-End Object Detection with TransformersSeunghyun Hwang
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep LearningOleg Mygryn
 
SVM Tutorial
SVM TutorialSVM Tutorial
SVM Tutorialbutest
 
Object detection and Instance Segmentation
Object detection and Instance SegmentationObject detection and Instance Segmentation
Object detection and Instance SegmentationHichem Felouat
 
Image classification using CNN
Image classification using CNNImage classification using CNN
Image classification using CNNNoura Hussein
 
Object Detection Classification, tracking and Counting
Object Detection Classification, tracking and CountingObject Detection Classification, tracking and Counting
Object Detection Classification, tracking and CountingShounak Mitra
 
Object Detection with Transformers
Object Detection with TransformersObject Detection with Transformers
Object Detection with TransformersDatabricks
 
Image processing
Image processingImage processing
Image processingPooja G N
 
Statistical classification: A review on some techniques
Statistical classification: A review on some techniquesStatistical classification: A review on some techniques
Statistical classification: A review on some techniquesGiorgos Bamparopoulos
 
Image classification using convolutional neural network
Image classification using convolutional neural networkImage classification using convolutional neural network
Image classification using convolutional neural networkKIRAN R
 
PPT on BRAIN TUMOR detection in MRI images based on IMAGE SEGMENTATION
PPT on BRAIN TUMOR detection in MRI images based on  IMAGE SEGMENTATION PPT on BRAIN TUMOR detection in MRI images based on  IMAGE SEGMENTATION
PPT on BRAIN TUMOR detection in MRI images based on IMAGE SEGMENTATION khanam22
 
3rd sem ppt for wavelet
3rd sem ppt for wavelet3rd sem ppt for wavelet
3rd sem ppt for waveletgandimare
 
Neural net and back propagation
Neural net and back propagationNeural net and back propagation
Neural net and back propagationMohit Shrivastava
 
Computer Graphics Programes
Computer Graphics ProgramesComputer Graphics Programes
Computer Graphics ProgramesAbhishek Sharma
 
Kalman filters
Kalman filtersKalman filters
Kalman filtersAJAL A J
 
Classification using back propagation algorithm
Classification using back propagation algorithmClassification using back propagation algorithm
Classification using back propagation algorithmKIRAN R
 
Kernel density estimation (kde)
Kernel density estimation (kde)Kernel density estimation (kde)
Kernel density estimation (kde)Padma Metta
 

La actualidad más candente (20)

End-to-End Object Detection with Transformers
End-to-End Object Detection with TransformersEnd-to-End Object Detection with Transformers
End-to-End Object Detection with Transformers
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
 
SVM Tutorial
SVM TutorialSVM Tutorial
SVM Tutorial
 
Object detection and Instance Segmentation
Object detection and Instance SegmentationObject detection and Instance Segmentation
Object detection and Instance Segmentation
 
Image classification using CNN
Image classification using CNNImage classification using CNN
Image classification using CNN
 
Object Detection Classification, tracking and Counting
Object Detection Classification, tracking and CountingObject Detection Classification, tracking and Counting
Object Detection Classification, tracking and Counting
 
Instance Segmentation - Míriam Bellver - UPC Barcelona 2018
Instance Segmentation - Míriam Bellver - UPC Barcelona 2018Instance Segmentation - Míriam Bellver - UPC Barcelona 2018
Instance Segmentation - Míriam Bellver - UPC Barcelona 2018
 
Object Detection with Transformers
Object Detection with TransformersObject Detection with Transformers
Object Detection with Transformers
 
Image processing
Image processingImage processing
Image processing
 
Statistical classification: A review on some techniques
Statistical classification: A review on some techniquesStatistical classification: A review on some techniques
Statistical classification: A review on some techniques
 
Image classification using convolutional neural network
Image classification using convolutional neural networkImage classification using convolutional neural network
Image classification using convolutional neural network
 
Object Detection and Ship Classification Using YOLOv5
Object Detection and Ship Classification Using YOLOv5Object Detection and Ship Classification Using YOLOv5
Object Detection and Ship Classification Using YOLOv5
 
PPT on BRAIN TUMOR detection in MRI images based on IMAGE SEGMENTATION
PPT on BRAIN TUMOR detection in MRI images based on  IMAGE SEGMENTATION PPT on BRAIN TUMOR detection in MRI images based on  IMAGE SEGMENTATION
PPT on BRAIN TUMOR detection in MRI images based on IMAGE SEGMENTATION
 
3rd sem ppt for wavelet
3rd sem ppt for wavelet3rd sem ppt for wavelet
3rd sem ppt for wavelet
 
Neural net and back propagation
Neural net and back propagationNeural net and back propagation
Neural net and back propagation
 
Computer Graphics Programes
Computer Graphics ProgramesComputer Graphics Programes
Computer Graphics Programes
 
Kalman filters
Kalman filtersKalman filters
Kalman filters
 
Object detection
Object detectionObject detection
Object detection
 
Classification using back propagation algorithm
Classification using back propagation algorithmClassification using back propagation algorithm
Classification using back propagation algorithm
 
Kernel density estimation (kde)
Kernel density estimation (kde)Kernel density estimation (kde)
Kernel density estimation (kde)
 

Destacado

Iccv11 salientobjectdetection
Iccv11 salientobjectdetectionIccv11 salientobjectdetection
Iccv11 salientobjectdetectionJie Feng
 
Salient Point Detection
Salient Point DetectionSalient Point Detection
Salient Point DetectionTylerTK
 
Visual attention
Visual attentionVisual attention
Visual attentionannakalme
 
Visual Attention & Processing with Visual-Only IM
Visual Attention & Processing with Visual-Only IMVisual Attention & Processing with Visual-Only IM
Visual Attention & Processing with Visual-Only IMInteractive Metronome
 
Visual attention: models and performance
Visual attention: models and performanceVisual attention: models and performance
Visual attention: models and performanceOlivier Le Meur
 

Destacado (6)

Iccv11 salientobjectdetection
Iccv11 salientobjectdetectionIccv11 salientobjectdetection
Iccv11 salientobjectdetection
 
Salient Point Detection
Salient Point DetectionSalient Point Detection
Salient Point Detection
 
Visual attention
Visual attentionVisual attention
Visual attention
 
Visual Attention & Processing with Visual-Only IM
Visual Attention & Processing with Visual-Only IMVisual Attention & Processing with Visual-Only IM
Visual Attention & Processing with Visual-Only IM
 
Chris Atherton at TCUK09
Chris Atherton at TCUK09Chris Atherton at TCUK09
Chris Atherton at TCUK09
 
Visual attention: models and performance
Visual attention: models and performanceVisual attention: models and performance
Visual attention: models and performance
 

Similar a Visual Saliency: Learning to Detect Salient Objects

Mirko Lucchese - Deep Image Processing
Mirko Lucchese - Deep Image ProcessingMirko Lucchese - Deep Image Processing
Mirko Lucchese - Deep Image ProcessingMeetupDataScienceRoma
 
Conception_et_realisation_dun_site_Web_d.pdf
Conception_et_realisation_dun_site_Web_d.pdfConception_et_realisation_dun_site_Web_d.pdf
Conception_et_realisation_dun_site_Web_d.pdfSofianeHassine2
 
Miniproject final group 14
Miniproject final group 14Miniproject final group 14
Miniproject final group 14Ashish Mundhra
 
Unsupervised Building Extraction from High Resolution Satellite Images Irresp...
Unsupervised Building Extraction from High Resolution Satellite Images Irresp...Unsupervised Building Extraction from High Resolution Satellite Images Irresp...
Unsupervised Building Extraction from High Resolution Satellite Images Irresp...CSCJournals
 
Currency recognition on mobile phones
Currency recognition on mobile phonesCurrency recognition on mobile phones
Currency recognition on mobile phoneshabeebsab
 
Fisheye Omnidirectional View in Autonomous Driving
Fisheye Omnidirectional View in Autonomous DrivingFisheye Omnidirectional View in Autonomous Driving
Fisheye Omnidirectional View in Autonomous DrivingYu Huang
 
Introduction to Binocular Stereo in Computer Vision
Introduction to Binocular Stereo in Computer VisionIntroduction to Binocular Stereo in Computer Vision
Introduction to Binocular Stereo in Computer Visionothersk46
 
Urban 3D Semantic Modelling Using Stereo Vision, ICRA 2013
Urban 3D Semantic Modelling Using Stereo Vision, ICRA 2013Urban 3D Semantic Modelling Using Stereo Vision, ICRA 2013
Urban 3D Semantic Modelling Using Stereo Vision, ICRA 2013Sunando Sengupta
 
Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNN
Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNNAutomatic Detection of Window Regions in Indoor Point Clouds Using R-CNN
Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNNZihao(Gerald) Zhang
 
A Survey on Exemplar-Based Image Inpainting Techniques
A Survey on Exemplar-Based Image Inpainting TechniquesA Survey on Exemplar-Based Image Inpainting Techniques
A Survey on Exemplar-Based Image Inpainting Techniquesijsrd.com
 
Video Stitching using Improved RANSAC and SIFT
Video Stitching using Improved RANSAC and SIFTVideo Stitching using Improved RANSAC and SIFT
Video Stitching using Improved RANSAC and SIFTIRJET Journal
 
Design and implementation of video tracking system based on camera field of view
Design and implementation of video tracking system based on camera field of viewDesign and implementation of video tracking system based on camera field of view
Design and implementation of video tracking system based on camera field of viewsipij
 
Super Resolution of Image
Super Resolution of ImageSuper Resolution of Image
Super Resolution of ImageSatheesh K
 
Remotely sensed image segmentation using multiphase level set acm
Remotely sensed image segmentation using multiphase level set acmRemotely sensed image segmentation using multiphase level set acm
Remotely sensed image segmentation using multiphase level set acmKriti Bajpai
 
A STUDY AND ANALYSIS OF DIFFERENT EDGE DETECTION TECHNIQUES
A STUDY AND ANALYSIS OF DIFFERENT EDGE DETECTION TECHNIQUESA STUDY AND ANALYSIS OF DIFFERENT EDGE DETECTION TECHNIQUES
A STUDY AND ANALYSIS OF DIFFERENT EDGE DETECTION TECHNIQUEScscpconf
 

Similar a Visual Saliency: Learning to Detect Salient Objects (20)

Praseed Pai
Praseed PaiPraseed Pai
Praseed Pai
 
Mirko Lucchese - Deep Image Processing
Mirko Lucchese - Deep Image ProcessingMirko Lucchese - Deep Image Processing
Mirko Lucchese - Deep Image Processing
 
Conception_et_realisation_dun_site_Web_d.pdf
Conception_et_realisation_dun_site_Web_d.pdfConception_et_realisation_dun_site_Web_d.pdf
Conception_et_realisation_dun_site_Web_d.pdf
 
Miniproject final group 14
Miniproject final group 14Miniproject final group 14
Miniproject final group 14
 
Unsupervised Building Extraction from High Resolution Satellite Images Irresp...
Unsupervised Building Extraction from High Resolution Satellite Images Irresp...Unsupervised Building Extraction from High Resolution Satellite Images Irresp...
Unsupervised Building Extraction from High Resolution Satellite Images Irresp...
 
Lw3620362041
Lw3620362041Lw3620362041
Lw3620362041
 
Currency recognition on mobile phones
Currency recognition on mobile phonesCurrency recognition on mobile phones
Currency recognition on mobile phones
 
Fisheye Omnidirectional View in Autonomous Driving
Fisheye Omnidirectional View in Autonomous DrivingFisheye Omnidirectional View in Autonomous Driving
Fisheye Omnidirectional View in Autonomous Driving
 
Introduction to Binocular Stereo in Computer Vision
Introduction to Binocular Stereo in Computer VisionIntroduction to Binocular Stereo in Computer Vision
Introduction to Binocular Stereo in Computer Vision
 
Normal Mapping / Computer Graphics - IK
Normal Mapping / Computer Graphics - IKNormal Mapping / Computer Graphics - IK
Normal Mapping / Computer Graphics - IK
 
Urban 3D Semantic Modelling Using Stereo Vision, ICRA 2013
Urban 3D Semantic Modelling Using Stereo Vision, ICRA 2013Urban 3D Semantic Modelling Using Stereo Vision, ICRA 2013
Urban 3D Semantic Modelling Using Stereo Vision, ICRA 2013
 
IEEE ICAPR 2009
IEEE ICAPR 2009IEEE ICAPR 2009
IEEE ICAPR 2009
 
Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNN
Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNNAutomatic Detection of Window Regions in Indoor Point Clouds Using R-CNN
Automatic Detection of Window Regions in Indoor Point Clouds Using R-CNN
 
A Survey on Exemplar-Based Image Inpainting Techniques
A Survey on Exemplar-Based Image Inpainting TechniquesA Survey on Exemplar-Based Image Inpainting Techniques
A Survey on Exemplar-Based Image Inpainting Techniques
 
Video Stitching using Improved RANSAC and SIFT
Video Stitching using Improved RANSAC and SIFTVideo Stitching using Improved RANSAC and SIFT
Video Stitching using Improved RANSAC and SIFT
 
Design and implementation of video tracking system based on camera field of view
Design and implementation of video tracking system based on camera field of viewDesign and implementation of video tracking system based on camera field of view
Design and implementation of video tracking system based on camera field of view
 
Super Resolution of Image
Super Resolution of ImageSuper Resolution of Image
Super Resolution of Image
 
Remotely sensed image segmentation using multiphase level set acm
Remotely sensed image segmentation using multiphase level set acmRemotely sensed image segmentation using multiphase level set acm
Remotely sensed image segmentation using multiphase level set acm
 
A STUDY AND ANALYSIS OF DIFFERENT EDGE DETECTION TECHNIQUES
A STUDY AND ANALYSIS OF DIFFERENT EDGE DETECTION TECHNIQUESA STUDY AND ANALYSIS OF DIFFERENT EDGE DETECTION TECHNIQUES
A STUDY AND ANALYSIS OF DIFFERENT EDGE DETECTION TECHNIQUES
 
Av4301248253
Av4301248253Av4301248253
Av4301248253
 

Más de Vicente Ordonez

From Large Scale Image Categorization to Entry-Level Categories
From Large Scale Image Categorization to Entry-Level CategoriesFrom Large Scale Image Categorization to Entry-Level Categories
From Large Scale Image Categorization to Entry-Level CategoriesVicente Ordonez
 
Data-driven Generation of Image Descriptions
Data-driven Generation of Image DescriptionsData-driven Generation of Image Descriptions
Data-driven Generation of Image DescriptionsVicente Ordonez
 
Im2Text: Describing Images Using 1 Million Captioned Photographs
Im2Text: Describing Images Using 1 Million Captioned PhotographsIm2Text: Describing Images Using 1 Million Captioned Photographs
Im2Text: Describing Images Using 1 Million Captioned PhotographsVicente Ordonez
 
Contenido Generado Por Los Usuarios
Contenido Generado Por Los UsuariosContenido Generado Por Los Usuarios
Contenido Generado Por Los UsuariosVicente Ordonez
 
Google Earth Maps Api Barcamp Quito 2009
Google Earth Maps Api Barcamp Quito 2009Google Earth Maps Api Barcamp Quito 2009
Google Earth Maps Api Barcamp Quito 2009Vicente Ordonez
 
Sistema de Recuperacion de Audio
Sistema de Recuperacion de AudioSistema de Recuperacion de Audio
Sistema de Recuperacion de AudioVicente Ordonez
 
Transmision de Vídeo por Red / Internet
Transmision de Vídeo por Red / InternetTransmision de Vídeo por Red / Internet
Transmision de Vídeo por Red / InternetVicente Ordonez
 
Buscadores de Podcast en Internet
Buscadores de Podcast en InternetBuscadores de Podcast en Internet
Buscadores de Podcast en InternetVicente Ordonez
 
Portal Concepts and .NET Webparts
Portal Concepts and .NET WebpartsPortal Concepts and .NET Webparts
Portal Concepts and .NET WebpartsVicente Ordonez
 

Más de Vicente Ordonez (16)

From Large Scale Image Categorization to Entry-Level Categories
From Large Scale Image Categorization to Entry-Level CategoriesFrom Large Scale Image Categorization to Entry-Level Categories
From Large Scale Image Categorization to Entry-Level Categories
 
Data-driven Generation of Image Descriptions
Data-driven Generation of Image DescriptionsData-driven Generation of Image Descriptions
Data-driven Generation of Image Descriptions
 
Im2Text: Describing Images Using 1 Million Captioned Photographs
Im2Text: Describing Images Using 1 Million Captioned PhotographsIm2Text: Describing Images Using 1 Million Captioned Photographs
Im2Text: Describing Images Using 1 Million Captioned Photographs
 
Texture Synthesis
Texture SynthesisTexture Synthesis
Texture Synthesis
 
Contenido Generado Por Los Usuarios
Contenido Generado Por Los UsuariosContenido Generado Por Los Usuarios
Contenido Generado Por Los Usuarios
 
Pantallas Plasma vs LCD
Pantallas Plasma vs LCDPantallas Plasma vs LCD
Pantallas Plasma vs LCD
 
Google Earth Maps Api Barcamp Quito 2009
Google Earth Maps Api Barcamp Quito 2009Google Earth Maps Api Barcamp Quito 2009
Google Earth Maps Api Barcamp Quito 2009
 
Sistema de Recuperacion de Audio
Sistema de Recuperacion de AudioSistema de Recuperacion de Audio
Sistema de Recuperacion de Audio
 
Suenaemprendevive
SuenaemprendeviveSuenaemprendevive
Suenaemprendevive
 
MapReduce
MapReduceMapReduce
MapReduce
 
Robotica
RoboticaRobotica
Robotica
 
Transmision de Vídeo por Red / Internet
Transmision de Vídeo por Red / InternetTransmision de Vídeo por Red / Internet
Transmision de Vídeo por Red / Internet
 
Buscadores de Podcast en Internet
Buscadores de Podcast en InternetBuscadores de Podcast en Internet
Buscadores de Podcast en Internet
 
Sistemas Operativos 3D
Sistemas Operativos 3DSistemas Operativos 3D
Sistemas Operativos 3D
 
Ajax Atlas
Ajax AtlasAjax Atlas
Ajax Atlas
 
Portal Concepts and .NET Webparts
Portal Concepts and .NET WebpartsPortal Concepts and .NET Webparts
Portal Concepts and .NET Webparts
 

Último

Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsMiki Katsuragi
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebUiPathCommunity
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 

Último (20)

Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 

Visual Saliency: Learning to Detect Salient Objects

  • 1. Visual Attention: Detecting Saliency on Images Vicente Ordonez Department of Computer Science State University of New York Stony Brook, NY 11790
  • 2. I will be working mainly on the following paper Learning to Detect a Salient Object. T. Liu, J. Sun, N. Zheng, X. Tang, H. Shum. (Xian Jiaotong University and Microsoft Research Asia) from CVPR 2007. http://research.microsoft.com/en-us/um/people/jiansun/papers/SalientDetection_CVPR07.pdf
  • 3. What is Saliency? What is Visual Attention? “Everyone knows what attention is...” —William James, 1890
  • 4. This is a problem of… Arbitrary object detection? Background / Foreground segmentation? Modeling Visual Attention?
  • 5. The Method Features: Multiscale Contrast (Done!) Center surround histogram (Mostly Done!) (Done!) Color spatial distribution (Done!) Supervised learning using Conditional Random Fields to determine the parameters to combine the features obtained above. (Done!) [I will use a labeled dataset of 5000 images provided by Microsoft Research Asia!]
  • 6. Multiscale Contrast Function Generate the Gaussian Pyramid for the input image. For each level in the pyramid Do gaussian blurring Do resampling I’m using a 6 levels Gaussian pyramid for each RGB channel.
  • 7. How a Gaussian pyramid looks like Figure from David Forsyth
  • 8. Generate contrast maps for each level of the Pyramid. Sum all of the results to produce the final multiscale contrast map. The two steps mentioned above are described in this formula: Multiscale Contrast Function
  • 11. Contrast maps Original image Contrast map at level 1 Contrast map at level 4 Contrast map at level 6
  • 13.
  • 14. For each possible rectangle with a reasonable size and aspect ratio
  • 15. Create a surrounding rectangle and calculate the histogram of the rectangle and the surrounding area.
  • 16.
  • 17. Center Surround Histogram Feature The algorithm as described before is computationally expensive… It is required to use a technique called Integral Histogram. It allows you fast calculation of the histogram of any given rectangular region of an image. The algorithm was introduced in: “Integral Histogram: A Fast Way to Extract Histograms in Cartesian Spaces” by FatihPorikli, Mitsubishi Electric Research Lab in CVPR 2005.
  • 18. Center Surround Histogram Feature Use the Chi Square Distances Map and the Map of Most Salient Rectangle Regions per pixel to generate the Center Surround Histogram Feature using the next formula:
  • 19. Center Surround Histogram Results Using my Implementation (15.2 sec, size = 245x384) Results Reported in the Paper
  • 20. Center Surround Histogram Results Using my Implementation (13.6 sec, size = 247x346) Results Reported in the Paper
  • 21. Center Surround Histogram Results Using my Implementation (10.2 sec, size = 248x277)
  • 34. Color Spatial Distribution Make an initial clustering of the colors in the image using k-means. Further refine the clusters by using Gaussian Mixture Models. The Gaussian Mixture Model parameters are calculated using the EM algorithm. I am using 5 clusters (5 colors) per image. And the results look similar to those presented in the paper with an execution time of around 17 seconds per image.
  • 35. Color Spatial Distribution Calculate the vertical variance of the horizontal positions of the pixels for each cluster. And then the same for the vertical positions. Sum the variances and use this value to weight more those clusters with less spatial variance. Penalize the clusters that contain the majority of its pixels away from the center of the image.
  • 45. Conditional Random Field Training and Inference Accelerated Training of Conditional Random Fields with Stochastic Meta-Descent S Vishwanathan, N. Schraudolph, M. Schmidt, K. Murphy. ICML'06 (Intl Conf on Machine Learning).  I did the training using this toolbox from the above paper: http://people.cs.ubc.ca/~murphyk/Software/CRF/crf.html
  • 46. Mask outputs using CRF inference Input M-Contrast-map Center Surr. Hist. Color Spatial Var. Input Combined features Ground truth
  • 47. Mask outputs using CRF inference Input M-Contrast-map Center Surr. Hist. Color Spatial Var. Input Combined features Ground truth
  • 48. Mask outputs using CRF inference Input M-Contrast-map Center Surr. Hist. Color Spatial Var. Input Combined features Ground truth
  • 49. Mask outputs using CRF inference Input M-Contrast-map Center Surr. Hist. Color Spatial Var. Input Combined features Ground truth
  • 50. Precision / Recall obtained
  • 51. Some Conclusions The results of the original research paper on computing the visual features have been successfully replicated in a considerable extent. The Conditional Random Field framework used in this project turned out to perform well for this task. The center-surround histogram map turned out to be the feature that gave the higher precision. The amount of time required for computing the individual features is in the order of several seconds.

Notas del editor

  1. Not so good result
  2. Good result
  3. Not so good result