IIIT HYDERABAD
Techniques for Organization and
Visualization of Community
Photo Collections
Kumar Srijan
Faculty Advisor : Dr. C.V. Jawahar
IIIT HYDERABAD
Community Photo Collections
• Capture
• Share
• Search
IIIT HYDERABAD
Community Photo Collections
• Golkonda Fort (Google Images + Flickr)
– > 50 K images
IIIT HYDERABAD
Applications of CPCs
• Snavely et al. – Siggraph `06, ICCV `09
– Virtual Tourism, Visualization
• Sattler et al. – ICCV `11 , ECCV `12
– Geolocalization
• Goesele et al. – ICCV `07, VMV `11
– Dense 3D Reconstruction
• Frahm et al. – ECCV `08, ECCV `10
– City Scale Reconstructions, Exploration
IIIT HYDERABAD
Processing CPCs
• Computing Correspondences
– Feature Extraction → Pairwise Feature Matching → Match Refinement → Track Creation
• Incremental SfM
– Seeding → Add new images and triangulate new points → Bundle adjust
• Full Scene Reconstruction
Snavely et al., Photo Tourism: Exploring image collections in 3D
IIIT HYDERABAD
Issues
• Quadratic Image Matching cost
• Global scene reconstruction
– O(N^4) in the worst case
– Sensitivity to the choice of the initial pair
– Cascading of errors
Image credits: Snavely et al., Photo Tourism: Exploring image collections in 3D
IIIT HYDERABAD
Timing Breakdown
Snavely et al., Photo Tourism: Exploring image collections in 3D
• Full Scene Reconstruction for Trafalgar Square with 8000 images took > 50 days
IIIT HYDERABAD
Motivation
• CPCs are unstructured
– Different resolutions, viewpoints, lighting conditions…
– Only a very limited number of image pairs match
• Contribution 1 : Matching
– Exhaustive pairwise matching without quadratic cost
• Contribution 2 : Visualization
– Framework for bypassing the issues faced with incremental SfM
IIIT HYDERABAD
Image Matching Problem
• Compute Image Match Graph
– Images Nodes
– Image Match Edges
• Queries:
– Connected components
– Shortest path
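The two queries named on this slide can be sketched on a toy match graph. This is a minimal illustration, assuming the graph is stored as an adjacency dictionary from image id to the set of matching image ids; the function names and the sample graph are hypothetical and not taken from the talk.

```python
from collections import deque


def connected_components(graph):
    """Return the components of the match graph, each as a sorted list of image ids."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in graph[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(sorted(component))
    return components


def shortest_path(graph, source, target):
    """Breadth-first search: fewest-hop chain of matching images, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical match graph: image id -> set of matching image ids.
match_graph = {
    0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2},
    4: {5}, 5: {4},
}
components = connected_components(match_graph)
path = shortest_path(match_graph, 0, 3)
```

Each connected component corresponds to a group of images that can be related through a chain of matches, and a shortest path gives one such chain between two images.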
IIIT HYDERABAD
Discovering Matching Images
• Object Retrieval with Large Vocabularies and
Fast Spatial Matching – Philbin et al.
• Image Retrieval
1. Indexing
– Quantization : Image Features → Visual Words (VW)
– Inverted Index : over VWs
2. Querying
– Filtering → Shortlist of top-scoring matches
– Verification of the shortlist
• O(N) time for a single query
IIIT HYDERABAD
Discovering Matching Images
• Large Scale Discovery of Spatially Related Images – Chum, O. and Matas, J.
IIIT HYDERABAD
Our Solution : Overview
• Exhaustive Pairwise Matching
– Query each image in turn
• Goal : O(1) per query
• Addressing Exhaustiveness
– Verify all potential matches : No shortlists
– Verification doable from Index retrievals
• Our Main Result : Indexing geometry allows both!
IIIT HYDERABAD
Indexing Geometry
• High Order Features
– Combine nearby features
• Primary with Secondary Features
• Encode Affine Invariants
– HOF is a Tuple
• <VWp,VWs,g1,g2,g3,g4>
• Huge Feature Space
IIIT HYDERABAD
Constant Time Queries using HOFs
• Regular Inverted Index
– Posting lists grow with Database size O(N)
• HOF => Huge Feature Space ( > 10^12 )
– Reproject with Hash Functions!
• Use Bloom Filters
– Range ∝ Database size
• Constant sized posting lists
• Result : Constant time queries
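A back-of-the-envelope reading of why posting lists stay constant-sized, assuming the hashed HOFs spread roughly uniformly over the index. That uniformity is an assumption, not something the slides state. The symbols h (average number of HOFs per image) and \bar{\ell} (average posting-list length) are introduced here for illustration; C and N are the constant and the database size used on the Solution : Summary slide.

```latex
\bar{\ell} \;\approx\; \frac{\#\mathrm{HOF}}{\text{index size}}
          \;=\; \frac{hN}{CN}
          \;=\; \frac{h}{C}
```

If "Index Size" in the Oxford results is read as the number of index slots, the reported figures give about 78/32 ≈ 2.4 entries per slot for Oxford 5K, and about 1480/500 ≈ 3.0 or 1480/1000 ≈ 1.5 for the two Oxford 105K configurations, so the ratio does not grow with the database.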
IIIT HYDERABAD
Spatial Verification
• Computable from index retrievals
– For a query primary feature
• Search all secondary features in database images
• Pass if R features are found.
IIIT HYDERABAD
Solution : Summary
• Extract HOFs from the N database images
• Select reprojection size as CN
• Initialize an Index of size CN
• Indexing
– Key : Hash value of HOF
– Value : Image Id
• Query : Each image in turn
– Record matches in adjacency list
• Result : Image Match Graph
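The summary above can be sketched end to end as follows. This is a simplified illustration under several assumptions: a plain hash table stands in for the Bloom-filter-based index mentioned in the talk, each HOF is an already quantized integer tuple, and the names `slot`, `build_index`, `match_graph`, `c`, and `r` are hypothetical. The threshold `r` plays the role of R on the Spatial Verification slide.

```python
from collections import defaultdict
import hashlib


def slot(hof, index_size):
    """Reproject a HOF tuple into [0, index_size) with a stable hash."""
    digest = hashlib.blake2b(repr(hof).encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big") % index_size


def build_index(images, c):
    """images: {image_id: {primary_id: [hof, ...]}}. The index has c * N slots."""
    index_size = c * len(images)
    index = defaultdict(set)
    for image_id, primaries in images.items():
        for hofs in primaries.values():
            for hof in hofs:
                # Key: hash value of the HOF. Value: image id.
                index[slot(hof, index_size)].add(image_id)
    return index, index_size


def match_graph(images, r, c):
    """Query each image in turn; record verified matches in an adjacency list."""
    index, index_size = build_index(images, c)
    graph = {image_id: set() for image_id in images}
    for query_id, primaries in images.items():
        for hofs in primaries.values():
            # Count, per database image, how many secondary features of
            # this query primary feature were retrieved from the index.
            votes = defaultdict(int)
            for hof in hofs:
                for db_id in index.get(slot(hof, index_size), ()):
                    if db_id != query_id:
                        votes[db_id] += 1
            for db_id, count in votes.items():
                if count >= r:  # pass if at least r features are found
                    graph[query_id].add(db_id)
                    graph[db_id].add(query_id)
    return graph


# Hypothetical HOF tuples <VWp, VWs, g1, g2, g3, g4> for three images.
images = {
    "a": {"p1": [(1, 2, 0, 1, 2, 3), (1, 5, 3, 2, 1, 0), (1, 9, 1, 1, 1, 1)]},
    "b": {"p7": [(1, 2, 0, 1, 2, 3), (1, 5, 3, 2, 1, 0), (4, 4, 0, 0, 0, 0)]},
    "c": {"p3": [(8, 8, 2, 2, 2, 2), (7, 6, 5, 4, 3, 2)]},
}
graph = match_graph(images, r=2, c=1 << 20)
```

Because the index size scales with the number of images, the number of entries retrieved per HOF stays roughly constant, which is what keeps each query cheap in this sketch.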
IIIT HYDERABAD
Results
• UK benchmark
– 2550 categories x 4 = 10200 images
– 73.2 % recall
– Large Scale Discovery of Spatially Related Images (Min Hash based solution)
• 49.6 % recall
IIIT HYDERABAD
Results
Dataset                   Oxford 5K    Oxford 105K   Oxford 105K
#HOF                      78 Mn        1480 Mn       1480 Mn
Index Size                32 Mn        500 Mn        1 Bn
Feature Extraction Time   27 min       8 hours       8 hours
Query Time per Image      0.024 sec    0.085 sec     0.061 sec
Query Time                2 min        2.5 hours     1.8 hours
Clusters Found            317          2147          2147
Images Registered         1375         7198          7198
IIIT HYDERABAD
Results
• Small Clusters
• Errors
IIIT HYDERABAD
Visualizing CPCs
IIIT HYDERABAD
Problem Statement
• Efficiently browse the collection while continuously incorporating an incoming stream of images
IIIT HYDERABAD
Our Solution : Overview
• Observation : In a walkthrough, users primarily
see nearby overlapping images.
• Advantages:
– Robustness to errors in incremental SfM module
– Worst case linear running time
– Scalable
– Incremental
Independent Partial Scene Reconstructions
instead of
Global Scene Reconstruction
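The "worst case linear running time" claim can be read as follows. This assumes each partial reconstruction involves at most k images for some fixed bound k, a symbol introduced here for illustration, that the O(N^4) worst case quoted earlier for global reconstruction also bounds each partial one, and that matching cost is counted separately.

```latex
T_{\text{global}}(N) = O(N^4)
\qquad \text{vs.} \qquad
T_{\text{partial}}(N) = \sum_{i=1}^{N} O(k^4) = O(N k^4) = O(N)
\quad \text{for fixed } k
```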
IIIT HYDERABAD
Partial Reconstructions
• Pipeline : Compute Matches (ImageMatch) → Refine Matches → Compute Partial Reconstructions (Standard SfM)
• Figure legend : Correct Match, Incorrect Match
IIIT HYDERABAD
Visualization Interface
• User interface and navigation
• Figure panels : Input images, Verified neighbors, Sample image, Partial reconstruction
IIIT HYDERABAD
Incremental insertion
• New Image → Match → Geometric Verification → Compute Partial Scene Reconstruction
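The insertion flow on this slide can be sketched as a single function. This is only an outline under stated assumptions: `candidates_for`, `verify`, and `reconstruct` are hypothetical callbacks standing in for the matching, geometric verification, and SfM stages, and storing symmetric edges in the graph is a design choice of this sketch rather than something the slides specify.

```python
def insert_image(new_id, graph, partials, candidates_for, verify, reconstruct):
    """Add one image: match, verify geometrically, build its partial reconstruction."""
    # Match: ask the index for candidates, keep those that pass verification.
    neighbors = {c for c in candidates_for(new_id) if verify(new_id, c)}

    # Record the new node and its edges in the image match graph.
    graph[new_id] = set(neighbors)
    for neighbor in neighbors:
        graph[neighbor].add(new_id)

    # Only the new image's partial reconstruction is computed here;
    # no global reconstruction is touched.
    partials[new_id] = reconstruct(new_id, sorted(neighbors))
    return neighbors


# Toy usage with stub stages (all hypothetical).
graph = {0: set(), 1: set(), 2: set()}
partials = {}
added = insert_image(
    3,
    graph,
    partials,
    candidates_for=lambda image_id: [0, 1, 2],
    verify=lambda a, b: b != 2,  # pretend image 2 fails verification
    reconstruct=lambda image_id, ns: ("partial", image_id, tuple(ns)),
)
```

The cost of one insertion depends on the number of verified neighbors rather than on the size of the whole collection, which is what makes the scheme incremental.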
IIIT HYDERABAD
Dataset
• Fort Dataset : 5989 images of Golconda Fort, Hyderabad
IIIT HYDERABAD
Results
• Courtyard Dataset with 687 images
• Initialized with 200 images
• Added 487 images one by one
• Largest connected component (CC) : 674 images
IIIT HYDERABAD
Conclusions
• Image Matching : HOFs give a larger feature space, which can be reprojected to obtain sparse posting lists, making Exhaustive Pairwise Matching feasible.
• CPC Visualization : Partial scene reconstructions can be used effectively to navigate through large collections of images.
IIIT HYDERABAD
Thank you!
• Questions?
• Take Home Message : 2 ideas
– For information retrieval using an inverted index, combining features gives a larger feature space, which can be reprojected to control the average length of the posting lists, and thus the query time.
– For an algorithm with high complexity, O(N^k) with k > 2, it may sometimes be meaningful to fragment the dataset into O(N) groups, each of bounded size, thereby reducing the overall complexity to O(N).
IIIT HYDERABAD
Backup Slides
IIIT HYDERABAD
Photo Tourism
• Annotation Transfer
IIIT HYDERABAD
Matching images
• Correspondence computation
• Match Verification
– RANSAC based epipolar geometry estimation
– Expensive
IIIT HYDERABAD
Establishing Correspondences
• SIFT features : D. Lowe
– Scale Invariant Feature Transform
– Key points
• Detection
• Description : 128D
• Correspondence
– Key points with Similar descriptors
• Alternatives : SURF, BRISK, …
IIIT HYDERABAD
Image Retrieval
• Feature Quantization
– Visual Words : A, B, C, D, E, F, G
IIIT HYDERABAD
Image Retrieval
• Feature Quantization
– Visual Words
• Inverted Indexing
Visual Word | Image Ids
A | 0, 1, 3, 4, 7
B | 0, 1, 2, 5, 8, 9
C | 1, 3, 6, 8
D | 1, 2, 4, 6, 8
E | 2, 4, 6, 9
F | 3, 4, 6, 8, 9
...
• Query visual word : E
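The inverted index on this slide can be reproduced with a short sketch. The per-image word sets below are derived from the posting lists shown on the slide. The function names are hypothetical, and plain vote counting is used as a stand-in for whatever scoring the retrieval system actually applies.

```python
from collections import Counter, defaultdict


def build_inverted_index(image_words):
    """Map each visual word to the sorted list of image ids containing it."""
    index = defaultdict(list)
    for image_id in sorted(image_words):
        for word in sorted(set(image_words[image_id])):
            index[word].append(image_id)
    return index


def shortlist(index, query_words, top_k):
    """Vote once per shared visual word and return the top_k image ids."""
    votes = Counter()
    for word in set(query_words):
        votes.update(index.get(word, []))
    ranked = sorted(votes.items(), key=lambda item: (-item[1], item[0]))
    return [image_id for image_id, _ in ranked[:top_k]]


# Visual words per image, consistent with the table on the slide.
image_words = {
    0: "AB", 1: "ABCD", 2: "BDE", 3: "ACF", 4: "ADEF",
    5: "B", 6: "CDEF", 7: "A", 8: "BCDF", 9: "BEF",
}
index = build_inverted_index(image_words)
posting_list_e = index["E"]            # images containing visual word E
top_three = shortlist(index, "EF", 3)  # images sharing the most words with {E, F}
```

Looking up a single word touches only its posting list, which is why query time follows the posting-list lengths rather than the number of images directly.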
IIIT HYDERABAD
Image Retrieval
• Feature Quantization
– Visual Words
• Inverted Indexing
• Geometric verification
– Epipolar Geometry
IIIT HYDERABAD
Bloom Filters
• Bloom Filter
– Set Membership
– Bit array(m)
– Hash Functions(k)
– Elements(n)
• Insert(A)
Bit array : 0 0 0 0 0 0 0 0 0 0 0 0
A → H1, H2, H3
IIIT HYDERABAD
Bloom Filters
• Bloom Filter
– Set Membership
– Bit array(m)
– Hash Functions(k)
– Elements(n)
• Insert(A)
Bit array : 0 1 0 0 0 0 0 0 1 1 0 0
A → H1, H2, H3
IIIT HYDERABAD
Bloom Filters
• Bloom Filter
– Set Membership
– Bit array(m)
– Hash Functions(k)
– Elements(n)
• Insert(A)
• Insert(B)
Bit array : 0 1 0 0 1 0 0 0 1 1 1 0
B → H1, H2, H3
IIIT HYDERABAD
Bloom Filters
• Bloom Filter
– Set Membership
– Bit array(m)
– Hash Functions(k)
– Elements(n)
• Insert(A)
• Insert(B)
• Query(C)
– Not present
Bit array : 0 1 0 0 1 0 0 0 1 1 1 0
C → H1, H2, H3
Set = {A,B}
IIIT HYDERABAD
Bloom Filters
• Bloom Filter
– Set Membership
– Bit array(m)
– Hash Functions(k)
– Elements(n)
• Insert(A)
• Insert(B)
• Query(C)
– Not present
• Query(D)
– False positive
Bit array : 0 1 0 0 1 0 0 0 1 1 1 0
D → H1, H2, H3
Set = {A,B}
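The Bloom filter walkthrough in these slides can be sketched as a small class. The parameters m (bits) and k (hash functions) follow the slide notation. Deriving the k hash functions from salted SHA-256 digests is an assumption made for this sketch, so the bit positions it sets will not match the ones drawn on the slides.

```python
import hashlib


class BloomFilter:
    """Set membership with no false negatives and possible false positives."""

    def __init__(self, m, k):
        self.m = m            # number of bits in the array
        self.k = k            # number of hash functions
        self.bits = [0] * m

    def _positions(self, item):
        # Assumed construction: k hash functions obtained by salting SHA-256.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def insert(self, item):
        for position in self._positions(item):
            self.bits[position] = 1

    def query(self, item):
        # True only if every hashed position is set.
        return all(self.bits[position] for position in self._positions(item))


bloom = BloomFilter(m=12, k=3)  # 12 bits and 3 hash functions, as in the slides
bloom.insert("A")
bloom.insert("B")
print(bloom.query("A"), bloom.query("B"))  # inserted items are always reported present
print(bloom.query("C"))  # False means definitely absent; True would be a false positive
```

A query for an element that was never inserted returns True only when all k of its positions happen to be set by other elements, which is the false-positive case illustrated by Query(D).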
IIIT HYDERABAD
Global vs. Partial
• Global : Allows transition to any image
• Partial : Allows transition to a limited number
of overlapping images
• A -> B implies B -> A


Visualizing Community Photo Collections Using Partial Scene Reconstructions

  • 1. IIIT HYDERABAD Techniques for Organization and Visualization of Community Photo Collections Kumar Srijan Faculty Advisor : Dr. C.V. Jawahar
  • 2. IIIT HYDERABAD Community Photo Collections • Capture • Share • Search
  • 3. IIIT HYDERABAD Community Photo Collections • Golkonda Fort (Google Images + Flickr) – > 50 K images
  • 4. IIIT HYDERABAD Applications of CPCs • Snavely et al. – Siggraph `06, ICCV `09 – Virtual Tourism, Visualization • Sattler et al. – ICCV `11 , ECCV `12 – Geolocalization • Goesele et al. – ICCV `07, VMV `11 – Dense 3D Reconstruction • Frahm et al. – ECCV `08, ECCV `10 – City Scale Reconstructions, Exploration
  • 5. IIIT HYDERABAD Processing CPCs Computing Correspondences Feature Extraction Pairwise Feature Matching Match Refinement Track Creation Incremental SfM Seeding Add new images and triangulate new points Bundle adjust Snavely et. al, Photo Tourism: Exploring image collections in 3D Full Scene Reconstruction
  • 6. IIIT HYDERABAD Issues • Quadratic Image Matching cost • Global scene reconstruction – O(N4) in the worst case – Sensitivity to the choice of the initial pair – Cascading of errors Image credits: Snavely et. al, Photo Tourism: Exploring image collections in 3D
  • 7. IIIT HYDERABAD Timing Breakdown Snavely et. al, Photo Tourism: Exploring image collections in 3D Full Scene Reconstruction for Trafalgar Square with 8000 images took > 50 days
  • 8. IIIT HYDERABAD Motivation • CPCs are unstructured – Different resolutions, viewpoint , lighting conditions… – Very limited number of images match • Contribution 1 : Matching – Exhaustive pairwise matching w/o quadratic cost • Contribution 2 : Visualization – Framework for bypassing the issues faced with Incremental Sfm.
  • 9. IIIT HYDERABAD Image Matching Problem • Compute Image Match Graph – Images Nodes – Image Match Edges • Queries: – Connected components – Shortest path
  • 10. IIIT HYDERABAD Discovering Matching Images • Object Retrieval with Large Vocabularies and Fast Spatial Matching – Philbin et al. • Image Retrieval 1. Indexing – Quantization : Image Features  Visual Words(VW) – Inverted Index : over VWs 2. Querying – Filtering  Shortlist of Top Scoring matches – Verification  of shortlist • O(N) time for a single querying
  • 11. IIIT HYDERABAD Discovering Matching Images • Large Scale Discovery of Spatially Related Images - Chum, O. and Matas. J
  • 12. IIIT HYDERABAD Our Solution : Overview • Exhaustive Pairwise Matching – Query each image in turn • Goal : O(1) per query • Addressing Exhaustiveness – Verify all potential matches : No shortlists – Verification doable from Index retrievals • Our Main Result : Indexing geometry allows both!
  • 13. IIIT HYDERABAD Indexing Geometry • High Order Features – Combine nearby features • Primary with Secondary Features • Encode Affine Invariants – HOF is a Tuple • <VWp,VWs,g1,g2,g3,g4> • Huge Feature Space
  • 14. IIIT HYDERABAD Constant Time Queries using HOFs • Regular Inverted Index – Posting lists grow with Database size O(N) • HOF => Huge Feature Space ( > 1012 ) – Reproject with Hash Functions! • Use Bloom Filters – Range α Database size • Constant sized posting lists • Result : Constant time queries
  • 15. IIIT HYDERABAD Spatial Verification • Computable from index retrievals – For a query primary feature • Search all secondary features in database images • Pass if R features are found.
  • 16. IIIT HYDERABAD Solution : Summary • Extract HoF in the N database images • Select Reprojection size as CN • Initialize an Index of size CN • Indexing – Key : Hash value of HoF – Value : Image Id • Query : Each image in turn – Record matches in adjacency list • Result : Image Match Graph
  • 17. IIIT HYDERABAD Results • UK benchmark – 2550 categories x 4 = 10400 images – 73.2 % recall –Large Scale Discovery of Spatially Related Images (Min Hash based solution) • 49.6 % recall
  • 18. IIIT HYDERABAD Results Oxford 5K Oxford 105K Oxford 105K #HOF 78 Mn 1480 Mn 1480 Mn Index Size 32 Mn 500 Mn 1 Bn Feature Extraction Time 27 min 8 hours 8 hours Query Time per Image 0.024 sec 0.085 sec 0.061 sec Query Time 2 min 2.5 hours 1.8 hours Clusters Found 317 2147 2147 Images Registered 1375 7198 7198
  • 19. IIIT HYDERABAD Results • Small Clusters • Errors
  • 21. IIIT HYDERABAD Problem Statement • Efficiently browse and keep Incorporating incoming stream of images
  • 22. IIIT HYDERABAD Our Solution : Overview • Observation : In a walkthrough, users primarily see nearby overlapping images. • Advantages: – Robustness to errors in incremental SfM module – Worst case linear running time – Scalable – Incremental Independent Partial Scene Reconstructions instead of Global Scene Reconstruction
  • 23. IIIT HYDERABAD Partial Reconstructions ImageMatch Compute Matches Refine Matches Compute partial Reconstructions Standard SfM Correct Match Incorrect Match
  • 24. IIIT HYDERABAD Visualization Interface User interface and navigation Input images Verified neighbors Sample image Partial reconstruction
  • 25. IIIT HYDERABAD Incremental insertion New Image Match Geometric Verification Compute Partial Scene Reconstruction
  • 26. IIIT HYDERABAD Dataset Fort Dataset 5989 images Golconda Fort, Hyderabad
  • 29. IIIT HYDERABAD Results • Courtyard Dataset with 687 images • Initialized with 200 images • Added 487 image one by one • Largest CC of 674 images.
  • 30. IIIT HYDERABAD Conclusions • Image Matching : HOFs gives a larger feature space which can be reprojected to obtain sparse posting lists making Exhaustive Pairwise Matching feasible. • CPCs Visualization : Partial scene reconstructions can effectively be used to navigate through large collections of images.
  • 31. IIIT HYDERABAD Thank you! • QUESTIONS ?! • Take Home Message : 2 ideas – For information retrieval using an inverted index, combining features gives a larger feature space which can be reprojected to control the average lengths of posting lists, and thus the query time. – For a very complex algorithm O(N > 2), it may sometimes be meaningful to fragment the dataset into O(N) groups, each of finite size, there by reducing the overall complexity to O(N).
  • 34. IIIT HYDERABAD Photo Tourism • Annotation Transfer
  • 35. IIIT HYDERABAD Matching images • Correspondence computation • Match Verification – RANSAC based epipolar geometry estimation – Expensive
  • 36. IIIT HYDERABAD Establishing Correspondences • SIFT features : D. Lowe – Scale Invariant Feature Transform – Key points • Detection • Description : 128D • Correspondence – Key points with Similar descriptors • Alternatives : SURF, Brisk..
  • 37. IIIT HYDERABAD Image Retrieval • Feature Quantization – Visual Words A B C D E F G A B C D E F G
  • 38. IIIT HYDERABAD Image Retrieval • Feature Quantization – Visual Words • Inverted Indexing Visual Word Image Ids A 0, 1, 3, 4, 7 B 0, 1, 2, 5, 8, 9 C 1, 3, 6, 8 D 1, 2, 4, 6, 8 E 2, 4, 6, 9 F 3, 4, 6, 8, 9 ... Query visual Word (E)
  • 39. IIIT HYDERABAD Image Retrieval • Feature Quantization – Visual Words • Inverted Indexing • Geometric verification – Epipolar Geometry
  • 40. IIIT HYDERABAD Bloom Filters • Bloom Filter – Set Membership – Bit array(m) – Hash Functions(k) – Elements(n) • Insert(A) 0 0 0 0 0 0 0 0 0 0 0 0 H1 H2 H3 A
  • 41. IIIT HYDERABAD Bloom Filters • Bloom Filter – Set Membership – Bit array(m) – Hash Functions(k) – Elements(n) • Insert(A) 0 1 0 0 0 0 0 0 1 1 0 0 H1 H2 H3 A
  • 42. IIIT HYDERABAD Bloom Filters • Bloom Filter – Set Membership – Bit array(m) – Hash Functions(k) – Elements(n) • Insert(A) • Insert(B) 0 1 0 0 1 0 0 0 1 1 1 0 H1 H2 H3 B
  • 43. IIIT HYDERABAD Bloom Filters • Bloom Filter – Set Membership – Bit array(m) – Hash Functions(k) – Elements(n) • Insert(A) • Insert(B) • Query(C) – Not present 0 1 0 0 1 0 0 0 1 1 1 0 H1 H2 H3 C Set = {A,B}
  • 44. IIIT HYDERABAD Bloom Filters • Bloom Filter – Set Membership – Bit array(m) – Hash Functions(k) – Elements(n) • Insert(A) • Insert(B) • Query(C) – Not present • Query(D) – False positive 0 1 0 0 1 0 0 0 1 1 1 0 H1 H2 H3 D Set = {A,B}
  • 45. IIIT HYDERABAD Global vs. Partial • Global : Allows transition to any image • Partial : Allows transition to a limited number of overlapping images • A -> B implies B -> A A B B A
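The Bloom filter operations on slides 40 to 44 (insert, query, false positives) can be sketched as follows. This is a minimal illustration, not the authors' code: the class name `BloomFilter`, the salted SHA-256 hashing and the parameter names `m` and `k` are assumptions chosen for the example.

```python
import hashlib


class BloomFilter:
    """Set-membership sketch with an m-bit array and k hash functions."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m
        self.k = k
        self.bits = [0] * m

    def _positions(self, item: str) -> list:
        # Assumption: k hash functions are simulated by salting SHA-256
        # with the hash index; any k independent hashes would do.
        positions = []
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            positions.append(int.from_bytes(digest[:8], "big") % self.m)
        return positions

    def insert(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p] = 1

    def query(self, item: str) -> bool:
        # True means "possibly present" (false positives can occur);
        # False means "definitely not present".
        return all(self.bits[p] for p in self._positions(item))
```

A query never misses an inserted element, but it may report an element that was never inserted when all of its bit positions were set by other elements, which is the false positive case shown for D on slide 44.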

Editor's notes

  1. This work is on techniques for the organization and visualization of community photo collections, also known as CPCs.
  2. CPCs are large image collections captured with a variety of cameras and mobile phones and shared on photo-sharing websites such as Flickr and Facebook. These images can later be found using Google Image Search.
  3. A search for Golkonda Fort on Google Images and Flickr returns more than 50k images, which capture this monument from various viewpoints and under different lighting conditions.
  4. There has been some prior work in this area by Noah Snavely et al. in a project called Photo Tourism. The aim of this project is to provide an interface where a user can virtually tour a popular landmark captured in a CPC. This has also given rise to a commercial product called Photosynth. (Demo?) Sattler et al. have used CPCs to find the geographic locations of query images. Goesele et al. used CPCs to do dense reconstructions of the scene captured in the photographs. Frahm et al. have used these photo collections to do city-scale reconstructions using millions of images, and provide an intuitive method for exploring CPCs by computing iconic images. // If the Photosynth demo is done: As you can see here, every image is shown in a geometric context with its neighbouring images, which gives a sense of virtually visiting this place. I can click on nearby images and a geometrically consistent transition is shown. Apart from this, I can also look at a 3D point cloud of the scene captured in the CPC, which gives a quick overall understanding of the scene.
  5. I will outline the major steps required for processing CPCs to compute scene reconstructions for virtual tourism and exploration. The first step is to establish feature correspondences across images. Robust features, e.g. SIFT, are extracted in all the images. In every pair of images, the matching features are found by comparing feature descriptors. Since this level of matching tends to be error-prone, a refinement step is employed which uses the epipolar geometry constraint to verify matches. The consistent matches spanning more than 2 images are saved as tracks. The purpose of the next step is to produce a 3D point corresponding to each of these tracks. For this, a procedure known as Incremental SfM is used. SfM is a classic computer vision problem which tries to estimate the 3D structure of the scene and the relative calibration of the cameras which took the images, using a small set of input images for which no prior information is available. For CPCs, an incremental version of this algorithm is used, which starts by seeding the scene reconstruction with a pair of images having a large number of matching features, using the standard SfM pipeline. This is followed by the incremental addition of new images in batches to the reconstruction, and triangulation of new points in them. A procedure known as bundle adjustment is performed every time a new batch of images is added to the reconstruction to minimize errors. This gives rise to a full scene reconstruction in which all the images and their corresponding 3D points are registered in a single frame of reference.
  6. The proposed technique has some issues, the most evident being the quadratic image matching cost, which is incurred because every image is matched with every other image. The incremental SfM procedure can be O(N^4) in time complexity in the worst case. The technique is sensitive to the choice of the seed pair used to initialize the reconstruction. Finally, errors can cascade, which can make large sections of the reconstruction inaccurate, as shown in the figure.
  7. Timing breakdowns for various datasets have shown that image matching and Incremental SfM dominate the overall time taken for processing CPCs. In fact, for the Trafalgar Square dataset with 8k images, the procedure did not finish even after 50 days.
  8. Our motivation comes from the unstructured nature of CPCs: the constituent images have different resolutions, viewpoints and lighting conditions. As a result, only a very small number of images match. Keeping this in mind, we make the following two contributions. The first is to do exhaustive pairwise matching without incurring a quadratic time complexity. The second is to build a visualization framework which can bypass the issues faced by Incremental SfM.
  9. The aim of the image matching problem is to compute an image match graph, where images are nodes and each pair of matching images is joined by an edge. This kind of structure easily supports queries such as connected components or shortest path, as shown in the figure.
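The two queries named in note 9 can be sketched on an adjacency-list match graph as follows. This is an illustrative sketch only: the function names and the toy graph layout (a dict mapping each image id to the set of its matching image ids, with every image present as a key) are assumptions, and breadth-first search is one standard way to answer both queries on an unweighted graph.

```python
from collections import deque


def connected_components(adj):
    """Group image ids into clusters of mutually reachable images."""
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    component.add(v)
                    queue.append(v)
        components.append(component)
    return components


def shortest_path(adj, src, dst):
    """Return the shortest chain of matching images from src to dst, or None."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None
```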
  10. Philbin et al. have described a way to solve this problem using large feature vocabularies to decrease the descriptor matching cost. This follows a standard image retrieval pipeline in which all the images are indexed into a database and later queried one by one to find their matches. Indexing is done by quantizing the image features using a large vocabulary of visual words, which are nothing but quantization levels in the image feature space. An inverted index is built upon these visual words to list all the images which contain a particular visual word. Querying takes place by looking up, in the inverted index, all the visual words present in the query image, and assigning scores to the images in the posting lists. The top-scoring matches are shortlisted and later verified using epipolar geometry constraints. Since the size of the posting lists grows with the size of the indexed database, this procedure has linear time complexity for each query image, and thus an overall quadratic time complexity.
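The indexing and querying steps in note 10 can be sketched as below. This is a simplified illustration under stated assumptions: visual words are plain strings, the score is a raw count of shared visual words (a stand-in for the weighting a real retrieval system would use), and the function names are hypothetical. Geometric verification of the shortlist is not modelled.

```python
from collections import Counter, defaultdict


def build_inverted_index(images):
    """Map each visual word to the posting list of image ids containing it.

    `images` maps image id -> set of visual words (assumed layout).
    """
    index = defaultdict(list)
    for image_id in sorted(images):
        for word in images[image_id]:
            index[word].append(image_id)
    return index


def query(index, query_words, top_k):
    """Score database images by shared visual words and shortlist the top_k."""
    scores = Counter()
    for word in query_words:
        # Cost of this loop grows with posting-list length, which in turn
        # grows with database size for a fixed vocabulary.
        for image_id in index.get(word, ()):
            scores[image_id] += 1
    return scores.most_common(top_k)
```

The inner loop is where the linear per-query cost comes from: with a fixed vocabulary, every posting list gets longer as more images are indexed.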
  11. Chum et al. describe a process for discovering matching images using the min-Hash procedure. Here, images are treated as sets of the visual words contained in them. A signature of every image is computed by employing K min-Hash functions. The signatures are sampled into small sketches of size 3 or 4, which are inserted into a hash table to facilitate matching. All the sketches falling into the same hash table bucket are deemed matching, and images with matching sketches are sent for geometric verification. These verified image matches act as seeds for discovering clusters of matching images, using standard image retrieval and query expansion techniques. However, due to its sampling nature, this technique has a high chance of missing small clusters.
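A rough sketch of the min-Hash candidate generation in note 11 is given below. It is not the published implementation: visual words are assumed to be integers, the hash family `(a * w + b) % prime` and its parameters are illustrative choices, and sketches are formed by cutting the signature into consecutive chunks rather than by any particular sampling scheme. Each image is assumed to contain at least one visual word.

```python
import random
from collections import defaultdict
from itertools import combinations

PRIME = 2_147_483_647  # assumed modulus for the illustrative hash family


def minhash_signature(words, hash_params):
    """One min-Hash value per hash function over a set of integer visual words."""
    return tuple(min((a * w + b) % PRIME for w in words) for a, b in hash_params)


def candidate_pairs(images, num_hashes=12, sketch_size=3, seed=0):
    """Return image pairs sharing at least one sketch (sent on to verification)."""
    rng = random.Random(seed)
    params = [(rng.randrange(1, PRIME), rng.randrange(PRIME))
              for _ in range(num_hashes)]
    buckets = defaultdict(set)
    for image_id, words in images.items():
        signature = minhash_signature(words, params)
        for start in range(0, num_hashes - sketch_size + 1, sketch_size):
            # The chunk offset is part of the key so that sketches from
            # different signature positions never share a bucket.
            buckets[(start, signature[start:start + sketch_size])].add(image_id)
    pairs = set()
    for ids in buckets.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs
```

Two images collide on a sketch only if all of its min-Hash values agree, so pairs with modest overlap are easily missed, which is the weakness for small clusters mentioned above.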
  12. Our solution to this problem is to do exhaustive pairwise matching by querying each image in turn. The goal we set is to have each query done in constant time, thereby ensuring an overall linear time complexity. We address exhaustiveness by verifying all potential matches, avoiding any sort of shortlisting. Since standard geometric verification is a time-consuming process, we have built a verification scheme which can be evaluated directly from inverted index retrievals. Overall, our main result is that indexing geometry allows us to meet all the above requirements!
  13. We capture the geometric information in images by using high order features (HoF), which in our case are a concatenation of 2 nearby image features and their relative geometry. Our procedure for extracting high order features involves selecting a few of the features satisfying certain scale-space criteria as primary features and then combining them with nearby secondary features by computing affine invariants between them, as shown in the figure. A high order feature is therefore represented as a tuple containing the visual words for the primary and the secondary features, followed by the quantized geometric parameters. The side effect is a huge feature space, which we will see is crucial to creating an index that supports fast queries.
  14. To see how this is done, I go back to image retrieval using an inverted index. In a regular inverted index, the number of visual words is limited, therefore the posting lists grow linearly with the size of the database. With HoF we have a huge feature space which can be reprojected to any desired size, and thus we control the density of the posting lists. We choose the reprojected size of the feature space to be in proportion to the size of the database. To realize this, we define equally sized Bloom filters, using only one hash function, for each of the images in the database and build an inverted index over the bit positions of the Bloom filters. As a result, the size of the posting lists of the inverted index does not increase with the size of the database. Therefore, building an inverted index over this space results in constant-time queries, so, as I mentioned before, we do not need to shortlist potential candidates. Next, we will see how these potential matching images are geometrically verified.
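The reprojection idea in note 14 can be sketched as follows. This is an assumption-laden illustration rather than the authors' implementation: a high order feature is modelled as a tuple of integers, the single hash function is SHA-256 reduced modulo the index size, and the names `reproject`, `build_index` and `lookup` are hypothetical. Hashing each feature to one bit position and indexing images by that position is equivalent to an inverted index over single-hash Bloom filters of size `index_size`.

```python
import hashlib
from collections import defaultdict


def reproject(hof, index_size):
    """Map a high order feature tuple to one of index_size bit positions."""
    digest = hashlib.sha256(repr(hof).encode()).digest()
    return int.from_bytes(digest[:8], "big") % index_size


def build_index(database, index_size):
    """Inverted index from bit position to the images that set that bit.

    `database` maps image id -> iterable of high order feature tuples.
    """
    index = defaultdict(set)
    for image_id, hofs in database.items():
        for hof in hofs:
            index[reproject(hof, index_size)].add(image_id)
    return index


def lookup(index, hof, index_size):
    """Images whose Bloom filter has the bit for this feature set."""
    return index.get(reproject(hof, index_size), set())
```

The average posting-list length is roughly the total number of indexed features divided by `index_size`, so growing `index_size` in proportion to the database keeps it roughly constant.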
  15. Fast spatial verification is crucial for fast querying. Our scheme achieves this by verifying directly from the index retrievals. Given a query, we look up all the HoFs related to a primary feature in succession to find matches in the database images. If we find 4 or more such matches in a database image, then we consider that primary feature to be reliably matching the database image. Finding 2 or more such reliable primary feature matches confirms an image match.
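The two thresholds in note 15 can be sketched as a pair of nested counts. The sketch assumes the query is given as one list of HoFs per primary feature and that `lookup` is any callable returning the ids of database images containing a given HoF; both the function name `verify` and this interface are hypothetical.

```python
from collections import Counter


def verify(query_primaries, lookup, min_hofs=4, min_primaries=2):
    """Return ids of database images that pass index-based verification.

    query_primaries: list with one entry per primary feature, each entry
    being the list of HoFs built around that primary feature.
    """
    reliable = Counter()
    for hofs in query_primaries:
        hits = Counter()
        for hof in hofs:
            for image_id in lookup(hof):
                hits[image_id] += 1
        for image_id, count in hits.items():
            # 4 or more matching HoFs make this primary feature reliable.
            if count >= min_hofs:
                reliable[image_id] += 1
    # 2 or more reliable primary features confirm the image match.
    return {i for i, count in reliable.items() if count >= min_primaries}
```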
  16. In summary, our algorithm for computing the image match graph works as follows: We compute high order features in all database images. We then select a reprojection size which is in proportion to the database size to initialize the inverted index. For indexing, the hash value of a high order feature acts as the key, and the corresponding value is the image id. We query each image in turn and record matches in an adjacency list, which is our image match graph.
  17. To test the effectiveness of our approach in finding small clusters of matching images, we used the University of Kentucky benchmark dataset, which has 4 images each of 2550 objects. We consider an object to have been discovered if at least one of the matching images is correctly discovered. We obtained 73.2% recall, compared to 49.6% for the min-Hash based solution discussed earlier.
  18. This table describes the time breakdown of our solution on the Oxford 5k and 105k image datasets. We see that by using a moderate index size, such as 32 million for 5k images, the querying time per image goes down to 0.024 s, bringing the total query time to 2 minutes. We were able to discover 317 clusters having 1375 images in them. For the larger Oxford 105k dataset, we increased the index size to 500 million to obtain a querying time of 85 ms, which was reduced to 61 ms on doubling the index size to 1 billion. This also brought the total querying time for 105k queries to under 2 hours. We discovered 2147 clusters having 7198 images in them.
  19. This image shows some of the small clusters discovered in the Oxford dataset. We also found some errors, which were mostly due to text and window-like structures.
  20. Our second contribution is for visualizing CPCs as walkthroughs, while also keeping in mind that community photo collections tend to increase in size over time.
  21. More formally, we want to efficiently browse an image collection and keep incorporating an incoming stream of images. This is what our solution for browsing such image collections looks like. You will notice that the experience is similar to that of Photosynth but, as we will see next, we are able to bypass the issues faced by it.
  22. Our approach is based on the observation that in a walkthrough, users primarily observe nearby overlapping images because they convey the most about the local geometry around the image. Therefore, we propose creating independent partial scene reconstructions, each containing an image and its overlapping images, instead of a global full scene reconstruction. This framework provides certain distinct advantages over the previous system. We avoid sensitivity to the choice of an initial pair, as a wrong choice of initial pair in one of the reconstructions will not affect the correctness of the other partial reconstructions. For a similar reason, we are also able to avoid the cascading and compounding of errors that leads to misestimation of large sections of the scene. Our framework puts a limit on the number of images involved in a partial reconstruction, thereby putting a bound on the time taken to create it. This makes our system linear in the number of images, allowing it to scale to large datasets. The use of partial reconstructions also makes it easy for our system to handle new images as they become available. One drawback of this approach is that it restricts the images to which a user can transition from a given image to only its immediate neighbors; as we shall see, this is compensated by the reduction in time complexity.
  23. To create the partial reconstructions, we start by finding for each image a set of top ‘n’ similar images using image retrieval with a bag of visual words representation. Next, we refine these matches using epipolar geometry constraints. Then, we do a partial scene reconstruction corresponding to each image and its verified neighbors. We use Bundler as the SfM pipeline for obtaining these partial reconstructions.
  24. Our visualization interface allows browsing through the partial reconstructions created from the given input images in an interactive 3D virtual environment. We start with the partial reconstruction corresponding to one of the images, say I. Next, we align the virtual camera with the camera corresponding to image I and project image I on a planar approximation of the points visible to the camera. Similarly, we display the other images in the partial reconstruction as wireframes of their respective projections. Upon clicking a wireframe, we make a smooth transition from the current image to the new image. At the end of the transition, we move to the partial reconstruction corresponding to the new image and align the virtual camera with the camera corresponding to the new image. Thus, the user explores the scene by moving from one partial reconstruction to another while getting cues about the relationships that exist between an image and its overlapping images.
  25. Our system makes it very easy to incorporate a new image into an existing reconstruction. For this, we determine the top n matches of the new image by comparing the histogram representation of the new image to that of the existing images. Next, we verify these matches by computing their epipolar geometry. Then the new image, along with its verified matches, is sent to Bundler for creating a partial scene reconstruction corresponding to the new image. Thus, we potentially add a new partial reconstruction for every image that is inserted into the system.
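The grouping step in note 23 can be sketched as below. The retrieval and verification stages are passed in as callables because their real implementations (bag of visual words retrieval, epipolar geometry estimation) are outside the scope of this sketch; `retrieve_top_n`, `verify_pair` and the function name are hypothetical, and the SfM call on each group is not modelled.

```python
def partial_reconstruction_groups(image_ids, retrieve_top_n, verify_pair, n=10):
    """For each image, collect the verified neighbors that form its group.

    retrieve_top_n(image_id, n) -> list of up to n candidate image ids
    verify_pair(a, b) -> True if the pair passes geometric verification
    """
    groups = {}
    for image_id in image_ids:
        candidates = retrieve_top_n(image_id, n)
        groups[image_id] = [
            c for c in candidates
            if c != image_id and verify_pair(image_id, c)
        ]
    return groups
```

Each group holds at most n neighbors plus the image itself, so the cost of reconstructing one group is bounded and the total work grows linearly with the number of images.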
  26. We have tested our system on a large image collection of nearly 6000 images of Golkonda Fort at Hyderabad. These images capture the fort in various illumination conditions and from several viewpoints.
  27. We created a few subsets of various sizes from this dataset. Our system scales to large datasets of up to 6k images in nearly 5 days of CPU computation time.
  28. This graph compares the time taken by our system to compute all the partial reconstructions with the time taken by Bundler to do SfM for a full scene reconstruction, for datasets of various sizes. We see that the speedup provided by our system increases with the number of images, as our system takes nearly linear time in the number of images.
  29. In an experiment with an image collection of 687 images of a courtyard of the fort, collected at different times of the day, we initialized with 200 images and incrementally added the remaining 487. The final image connectivity graph had 674 images. The following walkthrough was created from it.
  30. As is.
  31. Thank you very much for your kind attention.
  32. Thank you very much for your kind attention.
  33. Until now we have been treating all the match graphs as directed, even though each match is symmetric. This was done to ensure that only a bounded number of images is involved in every partial scene reconstruction. But we can use this symmetry of the matches during visualization to improve connectivity. For this, we note that an edge from A to B in the visualization graph essentially means that it is possible to make a transition from image A to image B in the partial reconstruction corresponding to image A, denoted by P(A). If an edge from B to A is not present, we can still show a transition from B to A by using P(A) as a proxy partial reconstruction in place of P(B). For this, we align the virtual camera at image B in P(A) and make a transition from B to A. Thus, our final connectivity graph used for navigation is undirected.
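The proxy rule in note 33 can be sketched as a small lookup. The directed graph is assumed to be a dict mapping each image to the set of images in its partial reconstruction; the function name and return convention are hypothetical.

```python
def reconstruction_for_transition(directed, src, dst):
    """Pick the partial reconstruction in which to show src -> dst.

    Returns the image X such that P(X) is used, or None if the two
    images are not connected in either direction.
    """
    if dst in directed.get(src, set()):
        return src  # P(src) already contains dst
    if src in directed.get(dst, set()):
        return dst  # use P(dst) as the proxy partial reconstruction
    return None
```

With this rule a transition is available whenever an edge exists in either direction, which is what makes the navigation graph effectively undirected.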