CERTH/CEA LIST at MediaEval Placing Task 2015
Giorgos Kordopatis-Zilos1, Adrian Popescu2, Symeon Papadopoulos1 and
Yiannis Kompatsiaris1
1 Information Technologies Institute (ITI), CERTH, Greece
2 CEA LIST, 91190 Gif-sur-Yvette, France
MediaEval 2015 Workshop, Sept. 14-15, 2015, Wurzen, Germany
Summary
Tag-based location estimation (2 runs)
• Based on a geographic Language Model
• Built upon the scheme of our 2014 participation [2] (Kordopatis-Zilos et
al., MediaEval 2014)
• Extensions from [3]: improved feature selection and weighting
(Kordopatis-Zilos et al., PAISI 2015)
Visual-based location estimation (1 run)
• Geospatial clustering scheme of the most visually similar images
Hybrid location estimation (2 runs)
• Combination of the textual and visual approaches
Training sets
• Training set released by the organisers (≈4.7M geotagged items)
• YFCC dataset, excl. images from users in test set (≈40M geotagged items)
Tag-based location estimation
• Processing steps of the approach
– Offline: language model construction
– Online: location estimation
Language Model (LM)
• LM generation scheme
– divide the Earth's surface into rectangular cells with a side length of 0.01°
– calculate tag-cell probabilities based on the users that used the tag inside the cell
• LM-based estimation
– the probability of each cell is calculated from the summation of the respective
tag-cell probabilities
– the Most Likely Cell (MLC) is the cell with the highest probability; it is used
to produce the estimation
Inspired by [4]: (Popescu, MediaEval 2013)
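The two LM steps can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, and the tag-cell probability is assumed here to be the share of a tag's users that used it inside the cell, which is one plausible reading of "based on the users that used the tag inside the cell".

```python
import math
from collections import defaultdict

CELL_SIDE = 0.01  # cell side length in degrees, as stated on the slide


def cell_of(lat, lon, side=CELL_SIDE):
    """Map a coordinate to the integer index of its grid cell."""
    return (math.floor(lat / side), math.floor(lon / side))


def build_language_model(items, side=CELL_SIDE):
    """items: iterable of (user_id, lat, lon, tags).

    Returns {tag: {cell: probability}}. Assumption: the probability is the
    number of distinct users of the tag in the cell divided by the number
    of (user, cell) pairs of the tag over all cells.
    """
    users = defaultdict(lambda: defaultdict(set))
    for user, lat, lon, tags in items:
        cell = cell_of(lat, lon, side)
        for tag in set(tags):
            users[tag][cell].add(user)
    model = {}
    for tag, cells in users.items():
        total = sum(len(u) for u in cells.values())
        model[tag] = {c: len(u) / total for c, u in cells.items()}
    return model


def most_likely_cell(model, tags):
    """Sum the tag-cell probabilities per cell and return the best cell."""
    scores = defaultdict(float)
    for tag in set(tags):
        for cell, p in model.get(tag, {}).items():
            scores[cell] += p
    if not scores:
        return None  # none of the tags is known to the model
    return max(scores, key=scores.get)
```

A query whose tags are all unknown to the model yields `None`, which corresponds to the "no estimate" case that the hybrid runs hand over to the visual approach.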
Feature Selection and Weighting
Feature Selection
• The final tag set 𝑇 is the intersection of the two tag sets
𝑇 = 𝑇𝑎 ∩ 𝑇𝑙
Feature Weighting
• Locality weight function: sort the tags in 𝑇 by their locality score; 𝑗 is the position of a tag in the sorted list
$w_l = \frac{|T| - (j - 1)}{|T|}$
• Normalize the weights from the Spatial Entropy (SE) function
$w_{se} = \frac{N(e(t), \mu, \sigma)}{\max_{t \in T} N(e(t), \mu, \sigma)}$
• Combine the two weighting functions
𝑤 = 𝜔 ∗ 𝑤𝑠𝑒 + (1 − 𝜔) ∗ 𝑤𝑙
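The weighting scheme above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, tags are assumed to be sorted by descending locality (so the most local tag gets weight 1), and 𝜇, 𝜎 and 𝜔 are passed in as plain parameters because the slide does not say how they are chosen.

```python
import math


def locality_weights(locality):
    """locality: {tag: locality score}. Rank-based weight (|T| - (j - 1)) / |T|."""
    ranked = sorted(locality, key=locality.get, reverse=True)  # assumed descending
    n = len(ranked)
    return {tag: (n - (j - 1)) / n for j, tag in enumerate(ranked, start=1)}


def gaussian(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma) at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def spatial_entropy_weights(entropy, mu, sigma):
    """entropy: {tag: spatial entropy e(t)}. Gaussian value divided by its maximum over T."""
    raw = {tag: gaussian(e, mu, sigma) for tag, e in entropy.items()}
    peak = max(raw.values())
    return {tag: value / peak for tag, value in raw.items()}


def combined_weights(w_se, w_l, omega):
    """w = omega * w_se + (1 - omega) * w_l for every tag in T."""
    return {tag: omega * w_se[tag] + (1 - omega) * w_l[tag] for tag in w_l}
```

Both component weights lie in (0, 1], so the combined weight stays in the same range for any 𝜔 between 0 and 1.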
Accuracy
• Partition training set into p folds (p = 10)
• Keep one partition at a time, and build LM with
the rest p − 1
• Estimate the location of every item of the
withheld partition
• Accuracy score of every tag
$\mathrm{tgeo}(t) = \frac{N_r}{N_t}$
𝑁𝑟: correctly geotagged items
𝑁𝑡: total items tagged with 𝑡
• Tags with non-zero accuracy score form the tag
set 𝑇𝑎
From [3]: Kordopatis-Zilos et al., PAISI 2015
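The cross-validation loop behind the accuracy score might look like the sketch below. The model builder, the estimator and the correctness test are passed in as callables because the slide does not fix them (in particular, the distance threshold that makes an item "correctly geotagged" is not stated); all names are hypothetical.

```python
from collections import Counter


def tag_accuracy(items, build_model, estimate, is_correct, folds=10):
    """Per-tag accuracy tgeo(t) = N_r / N_t from p-fold cross-validation.

    items: list of dicts with keys 'tags' and 'location'.
    build_model(training_items) -> model
    estimate(model, tags) -> estimated location
    is_correct(estimated, true_location) -> bool
    """
    correct, total = Counter(), Counter()
    for fold in range(folds):
        held_out = items[fold::folds]
        training = [it for i, it in enumerate(items) if i % folds != fold]
        model = build_model(training)
        for item in held_out:
            ok = is_correct(estimate(model, item["tags"]), item["location"])
            for tag in set(item["tags"]):
                total[tag] += 1    # N_t: items tagged with t
                correct[tag] += ok  # N_r: correctly geotagged items (bool counts as 0/1)
    return {tag: correct[tag] / total[tag] for tag in total}
```

The tag set 𝑇𝑎 is then `{t for t, a in scores.items() if a > 0}`, where `scores` is the returned dictionary.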
Locality
• Captures the spatial awareness of tags
• When a user uses a tag, he/she is assigned to the respective location cell
• Each cell has a set of users assigned to it
• All users assigned to the same cell are considered neighbours
• Locality score of every tag
$\mathrm{loc}(t) = N_t \cdot \frac{\sum_{c \in C} \sum_{u \in U_{t,c}} |\{u' \mid u' \in U_{t,c},\ u' \neq u\}|}{N_t^2}$
𝑁𝑡: total occurrences of 𝑡
𝐶 : set of all cells
𝑈𝑡,𝑐: set of users that used tag 𝑡 inside cell c
• Tags with non-zero locality score form the tag set 𝑇𝑙
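As a small worked sketch of the locality score (function and argument names are hypothetical): for a user 𝑢 in a cell, the number of neighbours is simply the number of other users of the tag in that cell, so the inner sum over a cell equals |𝑈𝑡,𝑐| · (|𝑈𝑡,𝑐| − 1).

```python
def locality_score(users_per_cell, n_t):
    """Locality of one tag.

    users_per_cell: {cell: set of users that used the tag inside the cell}
    n_t: total occurrences of the tag
    """
    if n_t == 0:
        return 0.0
    # Each user in a cell has len(users) - 1 neighbours in that cell.
    neighbours = sum(len(users) * (len(users) - 1) for users in users_per_cell.values())
    return n_t * neighbours / n_t ** 2
```

A tag that is never used by two different users in the same cell has no neighbour pairs, gets a score of 0, and is therefore left out of 𝑇𝑙.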
Locality – value distribution
Example tags with high locality scores: london (6975), paris (5452), nyc (3917)
Example tags with low locality scores: luminancehdr (0.0035), dsc6362 (0.003), air photo (0.002)
Extensions
• Spatial Entropy (SE) function
– calculate entropy values by applying the Shannon entropy formula to the tag-cell
probabilities
– build a Gaussian weight function based on the values of the tag SE
• Internal Grid
– build an additional LM using a finer grid, with a cell side length of 0.001°
– combine the MLC of the individual language models
• Similarity search [6] (Van Laere et al., ICMR 2011)
– determine 𝑘 most similar training images in the MLC
– their center-of-gravity is the final location estimation
From [2]: (Kordopatis-Zilos et al., MediaEval 2014)
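For the Spatial Entropy extension listed first on this slide, the entropy of a single tag can be sketched as below. The natural logarithm is an assumption (the slide does not state the base), and the function name is hypothetical.

```python
import math


def spatial_entropy(cell_probs):
    """Shannon entropy of one tag's distribution over cells.

    cell_probs: {cell: tag-cell probability}, assumed to sum to 1 for the tag.
    """
    return -sum(p * math.log(p) for p in cell_probs.values() if p > 0)
```

A tag concentrated in one cell has entropy 0, while a tag spread evenly over many cells has high entropy; the Gaussian weight function is then built on these per-tag values.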
Visual-based location estimation
Model building
• CNN features adapted by fine-tuning the VGG model [5] (Simonyan & Zisserman,
ICLR 2015)
• Training: ~1K Points Of Interest (POIs), ~1200 images/POI
• Caffe [1] (Jia et al., arxiv 2014) is fed directly with the CNN features
• Compressed outputs of fc7 layer (4096d) to 128d using PCA
• CNN features used to compute image similarities 𝑠 𝑣𝑖𝑠,𝑖𝑗
Location Estimation
• Geospatial clustering of 𝑘 = 20 visually most similar images
• If 𝑗-th image is within 1km from the closest one of the previous j − 1 images, it is
assigned to its cluster, otherwise it forms its own cluster
• The largest cluster (or the first in case of equal size) is selected and its centroid is
used as the location estimate
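The clustering step can be sketched as follows. This is an illustration rather than the authors' code: the great-circle (haversine) distance and the arithmetic mean of latitudes and longitudes as the centroid are assumptions, and the function names are hypothetical. The input is the list of coordinates of the 𝑘 most similar images, most similar first.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def cluster_by_distance(points, radius_km=1.0):
    """Assign each point to the cluster of its closest predecessor if that
    predecessor is within radius_km, otherwise start a new cluster."""
    clusters = []  # clusters in creation order
    owner = []     # owner[i] is the cluster index of points[i]
    for j, point in enumerate(points):
        target = None
        if j > 0:
            nearest = min(range(j), key=lambda i: haversine_km(points[i], point))
            if haversine_km(points[nearest], point) <= radius_km:
                target = owner[nearest]
        if target is None:
            clusters.append([])
            target = len(clusters) - 1
        clusters[target].append(point)
        owner.append(target)
    return clusters


def estimate_location(points, radius_km=1.0):
    """Centroid of the largest cluster; the earliest cluster wins ties."""
    if not points:
        return None
    largest = max(cluster_by_distance(points, radius_km), key=len)
    lat = sum(p[0] for p in largest) / len(largest)
    lon = sum(p[1] for p in largest) / len(largest)
    return (lat, lon)
```

Python's `max` returns the first maximal element, so with clusters kept in creation order the tie-breaking rule "the first in case of equal size" comes for free.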
Hybrid-based location estimation
Model building
• Combination of the textual and visual approaches
• Build LM model using the tag-based approach above and use it for MLC selection
Similarity Calculation
• Combination of the visual and textual similarities.
• Normalize the visual similarities to the range [0, 1]
• Similarity between two images
$s_{ij} = \frac{s_{tex,ij} + s_{vis,ij}}{2}$
• The final estimation is the center-of-gravity of the 𝑘 = 5 most similar images
Low Confidence Estimations
• For test images with no estimate or with confidence lower than 0.02 (≈10% of
the test set), the visual approach is used to produce the estimated locations
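The hybrid steps can be sketched as below. Several details are assumptions made for illustration: min-max scaling is used to bring the visual similarities into [0, 1], the center-of-gravity is taken as the unweighted mean of the coordinates, and all function names are hypothetical.

```python
def hybrid_estimate(candidates, k=5):
    """candidates: list of (lat, lon, s_tex, s_vis_raw) for training images in the MLC.

    Returns the mean location of the k candidates with the highest
    combined similarity s = (s_tex + s_vis) / 2.
    """
    if not candidates:
        return None
    raw = [c[3] for c in candidates]
    low, span = min(raw), max(raw) - min(raw)
    scored = []
    for lat, lon, s_tex, s_vis in candidates:
        s_vis_norm = (s_vis - low) / span if span else 0.0  # assumed min-max scaling
        scored.append(((s_tex + s_vis_norm) / 2, lat, lon))
    top = sorted(scored, key=lambda s: s[0], reverse=True)[:k]
    lat = sum(s[1] for s in top) / len(top)
    lon = sum(s[2] for s in top) / len(top)
    return (lat, lon)


def choose_estimate(text_estimate, confidence, visual_estimate, threshold=0.02):
    """Fall back to the visual estimate when the text-based one is missing
    or its confidence is below the threshold."""
    if text_estimate is None or confidence < threshold:
        return visual_estimate
    return text_estimate
```

Averaging the two similarities only makes sense once both are on a comparable scale, which is why the visual scores are normalized first.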
Confidence
• Evaluate the confidence of the LM estimation of each query image
• Measures how localized the language model cell estimations are, based on
cell probabilities
• Confidence measure
$\mathrm{conf}(i) = \frac{\sum_{c \in C} \{\, p(c|i) \mid \mathrm{dist}(c, \mathrm{mlc}) < l \,\}}{\sum_{c \in C} p(c|i)}$
𝑝(𝑐|𝑖): cell probability of cell c for image 𝑖
𝑑𝑖𝑠𝑡(𝑐1, 𝑐2): distance between 𝑐1 and 𝑐2
mlc: Most Likely Cell
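A minimal sketch of the confidence measure, assuming the cell probabilities of a query image are available as a dictionary and the distance function is supplied by the caller (both the function name and its signature are hypothetical):

```python
def confidence(cell_probs, distance, limit):
    """Share of the probability mass lying within `limit` of the MLC.

    cell_probs: {cell: p(c|i)} for one query image
    distance: callable(cell_a, cell_b) -> distance in the unit of `limit`
    """
    total = sum(cell_probs.values())
    if total == 0:
        return 0.0
    mlc = max(cell_probs, key=cell_probs.get)  # Most Likely Cell
    near = sum(p for cell, p in cell_probs.items() if distance(cell, mlc) < limit)
    return near / total
```

A value close to 1 means almost all probability mass sits around the MLC; a value close to 0 means the mass is scattered over distant cells.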
Runs and Results
measure             RUN-1   RUN-2   RUN-3   RUN-4   RUN-5
acc(1m)              0.15    0.01    0.15    0.16    0.16
acc(10m)             0.61    0.08    0.62    0.75    0.76
acc(100m)            6.40    1.76    6.52    7.73    7.83
acc(1km)            24.33    5.19   24.61   27.30   27.54
acc(10km)           43.07    7.43   43.41   46.48   46.77
median error (km)      69    5663      61      24      22
RUN-1: Tag-based location estimation + released training set
RUN-2: Visual-based location estimation + released training set
RUN-3: Hybrid location estimation + released training set
RUN-4: Tag-based location estimation + YFCC dataset
RUN-5: Hybrid location estimation + YFCC dataset
Thank you!
• Code:
https://github.com/MKLab-ITI/multimedia-geotagging
• Get in touch:
@sympapadopoulos / papadop@iti.gr
@georgekordopatis / georgekordopatis@iti.gr
References
[1] Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long, R. Girshick, S. Guadarrama, and T. Darrell. Caffe: Convolutional architecture for fast feature embedding. arXiv preprint arXiv:1408.5093, 2014.
[2] G. Kordopatis-Zilos, G. Orfanidis, S. Papadopoulos, and Y. Kompatsiaris. SocialSensor at MediaEval Placing Task 2014. In MediaEval 2014 Placing Task, 2014.
[3] G. Kordopatis-Zilos, S. Papadopoulos, and Y. Kompatsiaris. Geotagging social media content with a refined language modelling approach. In Intelligence and Security Informatics, pages 21–40, 2015.
[4] A. Popescu. CEA LIST's participation at MediaEval 2013 Placing Task. In MediaEval 2013 Placing Task, 2013.
[5] K. Simonyan and A. Zisserman. Very deep convolutional networks for large-scale image recognition. In International Conference on Learning Representations, 2015.
[6] O. Van Laere, S. Schockaert, and B. Dhoedt. Finding locations of Flickr resources using language models and similarity search. ICMR ’11, pages 48:1–48:8, New York, NY, USA, 2011. ACM.


Editor's notes

  1. Different kinds of user classification: topic-oriented (e.g., interest/expertise), role-based/behavioral (e.g., bot/spammer), geographical location. Useful for advertising, user recommendation, expert search, etc. For personal accounts, user classification raises privacy concerns. Challenges: multi-linguality, brevity, informal language.