SlideShare una empresa de Scribd logo
1 de 36
Descargar para leer sin conexión
Rich Internet Application
for Semi-Automatic Annotation
of Semantic Shots on Keyframes
Elisabet Carcel, Manuel Martos,
Xavier Giró-i-Nieto, Ferran Marqués
1
1Wednesday, December 14, 11
Outline
Graphical User Interface
Motivation
Requirements
Conclusions
Semantic Shot Classifier
Design
Future work
Evaluation
Design
Video-demo
2
2Wednesday, December 14, 11
Motivation
Digital content:
Multimedia content must be annotated
to allow its future retrieval.
RetrievalIngestAV Production
Video recording
in digital format
Storage and video
indexing in a database
system
Search through the
metadata to retrieve
video content
3
3Wednesday, December 14, 11
Motivation
DIGITION
Media Asset Management
Asset
02_14_20
4
4Wednesday, December 14, 11
0.056%
99.944%
Daily retrieval in 2010
Repository Ingest
Retrieval
Motivation
0
80,000
160,000
240,000
320,000
400,000
2010 2011 2012
25,000
25,000
25,000
185,000
160,000
135,000
Stored
Ingest
Video hours per year
Ingest and Retrieval
Non-retrieved content
75h retrieved
5
5Wednesday, December 14, 11
Motivation
Current solution Proposed new approach
Metadata by strata
• Start time code
• End time code
• Content description
- Action
-Semantic shots
- People
Non-
annotated assets
Annotated assets
GUI
for semi-automatic
annotation
6
6Wednesday, December 14, 11
Outline
Graphical User Interface
Motivation
Requirements
Conclusions
Semantic Shot Classifier
Design
Future work
Evaluation
Design
Video-demo
7
7Wednesday, December 14, 11
Semantic Shot Classifier
• Annotation of the extracted keyframes
Semantic Shot
Medium Shot
+
Player
Medium Shot
+
Box
Requirements
Semantic ConceptShot Type
8
8Wednesday, December 14, 11
• President close-up
• Close-up
• Medium
• General bureau
• General chamber
• Player close-up
• Player medium
• Stadium overview
• Audience
• Banner
• Box
Requirements
Football Parlament
9
9Wednesday, December 14, 11
Graphical User Interface
• Embeddable in existing system (Digition)
• User friendly
• Domain-generic interface
Requirements
10
10Wednesday, December 14, 11
Outline
Graphical User InterfaceRequirements
Conclusions
Semantic Shot Classifier
Design
Future work
Evaluation
Design
Video-demo
Motivation
11
11Wednesday, December 14, 11
Supervised Machine Learning
Design
Training
Detection
=
????
0.23
0.98
>Thr
Decision
Yes
No
?
?
?
?
12
12Wednesday, December 14, 11
System Architecture
Training:
Test:
>thr
Decision
Yes
No
MAX
confidence
?
Confidence
1
Confidence
2
Confidence
N
Classifier 2
Classifier N
Classifier 1
Domain
model
MAX class
Unknown
SVM Trainer
Model
class 1
Model
class 2
Model
class N
Design
SVM Trainer
SVM Trainer
13
13Wednesday, December 14, 11
Annotation
Football datasetFootball datasetFootball dataset
Class 1 Medium 116
Class 2 Closer Up 87
Class 3 Stadium 4
Class 4 Audience 19
Class 5 Banner 4
Class 6 Box 6
Class 7 Unknown 278
514
Parliament datasetParliament datasetParliament dataset
Class 1 President CU 16
Class 2 Close-up 63
Class 3 Medium 68
Class 4 Bureau 13
Class 5 Chamber 23
Class 6 Unknown 84
276
Design
GAT annotation tool: http://upseek.upc.edu/gat
14
14Wednesday, December 14, 11
Trainer parameters
Unsupervised clustering to support multi-view:
-MIN # elements / cluster
-MAX cluster radius
Design
Detector parameters
Confidence threshold
>Thr.
Decision
Yes
No
Confidence Unknown
15
15Wednesday, December 14, 11
Outline
Graphical User InterfaceRequirements
Conclusions
Semantic Shot Classifier
Design
Future work
Evaluation
Design
Video-demo
Motivation
16
16Wednesday, December 14, 11
Quality measure: F1-score
Evaluation
Class 1 Class 2 Class 3
Class1
Class 2
Class 3
4 1 0
0 3 0
2 0 3
Automatic
M
a
n
u
a
l
17
17Wednesday, December 14, 11
Quality measure: F1-score
Evaluation
Class 1 Class 2 Class 3
Class 1
Class 2
Class 3
4 1 0
0 3 0
2 0 3
Automatic
M
a
n
u
a
l
18
18Wednesday, December 14, 11
Cross Validation with Random Partition
• Training
- 80% of positive instances (+)
- Positive instances from other classes are
considered negative (-)
Evaluation
• Detection
- 20% of the positive instances (+)
19
19Wednesday, December 14, 11
3 parameters to tune
the classifier:
• MAX radius for a multi-view cluster
• MIN elements for a multi-view cluster
• Confidence threshold
Threshold (0.0 - 1.0)
Evaluation
Cross-validation
iterations (3)
Max radius
(0.1 - 1.0)
Min elements
(0 - 5)
F1 measure
20
20Wednesday, December 14, 11
Results for Parliament domain
• Optimal parameters
– Threshold: 0.9
– MIN # elements: 3
– MAX radius: 0.8
Min Max radius
elements 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
1 0.718 0.962 0.952 0.949 0.958 0.970 0.957 0.957 0.949 0.946
2 0.715 0.962 0.955 0.952 0.952 0.962 0.951 0.951 0.966 0.965
3 0.769 0.948 0.968 0.960 0.960 0.957 0.959 0.972 0.944 0.958
4 0.763 0.955 0.965 0.962 0.956 0.963 0.968 0.966 0.954 0.964
5 0.719 0.970 0.966 0.947 0.958 0.959 0.958 0.961 0.953 0.967
Figure 7.4.: Catalan Parliament F1 measures for 0.9 minimum score
Evaluation
21
21Wednesday, December 14, 11
• “Unknown” labelled as “Close up”: 7.73% error
• Semantic shots labelled as “Unknown”: 1.27% error
F1 Measure
• Close Up president: 0.98
• Close Up: 0.95
• Medium: 0.99
• General bureau: 0.98
• General chamber: 0.99
• Unknown: 0.94
Crom
o
Crom
o PP
Classe
3
CU Presi.
CU Presi.
Medium
Bureau
Chamber
Unknown
93 0 0 0 0 3
0 376 0 0 0 2
0 0 405 0 0 3
0 0 0 75 0 3
0 0 0 0 135 3
0 39 0 0 0 465
Automatic
M
a
n
u
a
l
CU
Presi
CU Med Bur Cha
Un
know
Evaluation
22
22Wednesday, December 14, 11
Results for Football domain
• Optimal parameters
– Threshold: 0.7
– MIN # elements: 1
– MAX radius: 0.8
Evaluation
Min Max radius
elements 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
1 0.385 0.616 0.691 0.703 0.822 0.734 0.771 0.860 0.829 0.714
2 0 0 0.838 0.546 0.858 0.827 0.828 0.816 0.832 0.659
3 0 0 0.853 0.784 0.764 0.785 0.770 0.818 0.833 0.727
4 0 0 0 0 0 0 0 0 0 0
5 0 0 0 0 0 0 0 0 0 0
Figure 7.1.: Soccer match F1 measures for 0.7 minimum score
23
23Wednesday, December 14, 11
Cromo Cromo
PP
Classe
3
3
Cromo
Cromo PP
C.Beauty
Public
Pancarta
Llotja
Unknown
629 1 0 0 0 0 33
4 482 0 0 0 0 36
1 0 20 0 0 0 3
0 0 0 94 0 0 20
0 0 0 0 18 0 6
0 1 0 0 0 21 14
461 3 0 0 0 0 1204
Automatic
M
a
n
u
a
l
Cromo
Cromo
PP
Cromo
Beauty
Public Panc. Llotja
Un
know
Automatic
Class1 Class2 Class3 Class4 Class5 Class6 Class7
Class1 652 11 0 0 0 0 33
Class2 4 482 0 0 0 0 36
Class3 1 0 20 0 0 0 3
Class4 0 0 0 94 0 0 20
Class5 0 0 0 0 18 0 6
Class6 0 1 0 0 0 21 14
Class7 461 3 0 0 0 0 1204
Table 7.3.: Best soccer match confusion matrix instances
Class1 Class2 Class3 Class4 Class5 Class6 Class7
n 0.58 0.97 1.0 1.0 1.0 1.0 0.91
F1 Measure
• Close-UP: 0.72
• Medium: 0.95
• C.Beauty: 0.91
• Public: 0.90
• Banner: 0.86
• Box: 0.74
• Unknown: 0.81
Evaluation
class that causes that problem and the percentage error of the confusion.
Manual Automatic % Error
Cromo PP Cromo 0.76%
Cromo Cromo PP 1.58%
Cromo Beauty Cromo 4.16%
Llotja Cromo PP 2.77%
Cap pla Cromo 27.63%
Cap pla Cromo PP 0.17%
Totes les classes Cap pla 7.9%
Figure 7.3.: Soccer match detection errors
24
24Wednesday, December 14, 11
Outline
Graphical User InterfaceRequirements
Conclusions
Semantic Shot Classifier
Design
Future work
Evaluation
Design
Video-demo
Motivation
25
25Wednesday, December 14, 11
Design
26
26Wednesday, December 14, 11
Web services
• List of keyframes given an asset ID
• Keyframe classifier
System architecture
Data exchange
JSON Format
00_00_09_24 00_00_39_25 00_00_54_48 00_00_59_41
Asset 1257
Class 3
Score: 0.83
Class 3
Score 0.75
Class 5
Score 0.46
Class 2
Score 0.98
00_00_39_25
Football
MIN elems
MAX radius
00_00_54_48
Football
MIN elems
MAX radius
00_00_59_41
Football
MIN elems
MAX radius
00_00_09_24
Football
MIN elems
MAX radius
Design
27
27Wednesday, December 14, 11
Outline
Graphical User InterfaceRequirements
Conclusions
Semantic Shot Classifier
Design
Future work
Evaluation
Design
Video-demo
Motivation
28
28Wednesday, December 14, 11
Video-demo
29
29Wednesday, December 14, 11
Video-demo
Parliament domain: http://youtu.be/R7o0sCoe-Gs
30
30Wednesday, December 14, 11
Video-demo
Football domain: http://youtu.be/uURx9GoRJBg
31
31Wednesday, December 14, 11
Outline
Graphical User InterfaceRequirements
Conclusions
Semantic Shot Classifier
Design
Future work
Evaluation
Design
Video-demo
Motivation
32
32Wednesday, December 14, 11
• Web service to use new annotations to re-train
• Use semantic shot information to launch additional
analysis
- Synthetic and natural signs -- Text recognition
- Close ups -- Face recognition
- Medium Shot Parliament -- Speech recognition
Future work
+,



(
9	
,
		!
01
	
5
IMT

Más contenido relacionado

Similar a Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

Zipline—Airbnb’s Declarative Feature Engineering Framework
Zipline—Airbnb’s Declarative Feature Engineering FrameworkZipline—Airbnb’s Declarative Feature Engineering Framework
Zipline—Airbnb’s Declarative Feature Engineering Framework
Databricks
 
Performance evaluation of GANs in a semisupervised OCR use case
Performance evaluation of GANs in a semisupervised OCR use casePerformance evaluation of GANs in a semisupervised OCR use case
Performance evaluation of GANs in a semisupervised OCR use case
inovex GmbH
 
Louis hughes y1 gd engine_terminology college work
Louis hughes y1 gd engine_terminology college workLouis hughes y1 gd engine_terminology college work
Louis hughes y1 gd engine_terminology college work
LouisHughes666
 
Game Engine Terminology Glossary
Game Engine Terminology GlossaryGame Engine Terminology Glossary
Game Engine Terminology Glossary
JoshuaRidett
 
The CEO Just Called Your Boss. His MS Teams calls keep dropping! What do you do?
The CEO Just Called Your Boss. His MS Teams calls keep dropping! What do you do?The CEO Just Called Your Boss. His MS Teams calls keep dropping! What do you do?
The CEO Just Called Your Boss. His MS Teams calls keep dropping! What do you do?
panagenda
 

Similar a Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes (20)

Lean six sigma executive overview (case study) templates
Lean six sigma executive overview (case study) templatesLean six sigma executive overview (case study) templates
Lean six sigma executive overview (case study) templates
 
Lecture 3: Convolutional Neural Networks
Lecture 3: Convolutional Neural NetworksLecture 3: Convolutional Neural Networks
Lecture 3: Convolutional Neural Networks
 
Brand Analytics Management: Measuring CLV Across Platforms, Devices and Apps
Brand Analytics Management: Measuring CLV Across Platforms, Devices and AppsBrand Analytics Management: Measuring CLV Across Platforms, Devices and Apps
Brand Analytics Management: Measuring CLV Across Platforms, Devices and Apps
 
Movebot ENGR245 Lean LaunchPad Stanford 2018
Movebot ENGR245 Lean LaunchPad Stanford 2018Movebot ENGR245 Lean LaunchPad Stanford 2018
Movebot ENGR245 Lean LaunchPad Stanford 2018
 
SIG-NOC Tools survey results
SIG-NOC Tools survey resultsSIG-NOC Tools survey results
SIG-NOC Tools survey results
 
Zipline—Airbnb’s Declarative Feature Engineering Framework
Zipline—Airbnb’s Declarative Feature Engineering FrameworkZipline—Airbnb’s Declarative Feature Engineering Framework
Zipline—Airbnb’s Declarative Feature Engineering Framework
 
PASS Summit 2010 Keynote David DeWitt
PASS Summit 2010 Keynote David DeWittPASS Summit 2010 Keynote David DeWitt
PASS Summit 2010 Keynote David DeWitt
 
Graphical password minor report
Graphical password minor reportGraphical password minor report
Graphical password minor report
 
Game Analytics & Machine Learning
Game Analytics & Machine LearningGame Analytics & Machine Learning
Game Analytics & Machine Learning
 
Machine Learning AND Deep Learning for OpenPOWER
Machine Learning AND Deep Learning for OpenPOWERMachine Learning AND Deep Learning for OpenPOWER
Machine Learning AND Deep Learning for OpenPOWER
 
[Tutorial] building machine learning models for predictive maintenance applic...
[Tutorial] building machine learning models for predictive maintenance applic...[Tutorial] building machine learning models for predictive maintenance applic...
[Tutorial] building machine learning models for predictive maintenance applic...
 
Scalawox deeplearning
Scalawox deeplearningScalawox deeplearning
Scalawox deeplearning
 
Performance evaluation of GANs in a semisupervised OCR use case
Performance evaluation of GANs in a semisupervised OCR use casePerformance evaluation of GANs in a semisupervised OCR use case
Performance evaluation of GANs in a semisupervised OCR use case
 
Performance evaluation of GANs in a semisupervised OCR use case
Performance evaluation of GANs in a semisupervised OCR use casePerformance evaluation of GANs in a semisupervised OCR use case
Performance evaluation of GANs in a semisupervised OCR use case
 
Louis hughes y1 gd engine_terminology college work
Louis hughes y1 gd engine_terminology college workLouis hughes y1 gd engine_terminology college work
Louis hughes y1 gd engine_terminology college work
 
Game Engine Terminology Glossary
Game Engine Terminology GlossaryGame Engine Terminology Glossary
Game Engine Terminology Glossary
 
B4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearningB4UConference_machine learning_deeplearning
B4UConference_machine learning_deeplearning
 
The CEO Just Called Your Boss. His MS Teams calls keep dropping! What do you do?
The CEO Just Called Your Boss. His MS Teams calls keep dropping! What do you do?The CEO Just Called Your Boss. His MS Teams calls keep dropping! What do you do?
The CEO Just Called Your Boss. His MS Teams calls keep dropping! What do you do?
 
A Video Processing System for Detection and Classification of Cricket Events
A Video Processing System for Detection and Classification of Cricket EventsA Video Processing System for Detection and Classification of Cricket Events
A Video Processing System for Detection and Classification of Cricket Events
 
Football League Management System Final Year Report
Football League Management System Final Year ReportFootball League Management System Final Year Report
Football League Management System Final Year Report
 

Más de Universitat Politècnica de Catalunya

Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Universitat Politècnica de Catalunya
 

Más de Universitat Politècnica de Catalunya (20)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
 
Intepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural NetworksIntepretability / Explainable AI for Deep Neural Networks
Intepretability / Explainable AI for Deep Neural Networks
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
 

Último

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Victor Rentea
 

Último (20)

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
AI+A11Y 11MAY2024 HYDERBAD GAAD 2024 - HelloA11Y (11 May 2024)
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 

Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes

  • 1. Rich Internet Application for Semi-Automatic Annotation of Semantic Shots on Keyframes Elisabet Carcel, Manuel Martos, Xavier Giró-i-Nieto, Ferran Marqués 1 1Wednesday, December 14, 11
  • 2. Outline Graphical User Interface Motivation Requirements Conclusions Semantic Shot Classifier Design Future work Evaluation Design Video-demo 2 2Wednesday, December 14, 11
  • 3. Motivation Digital content: Multimedia content must be annotated to allow its future retrieval. RetrievalIngestAV Production Video recording in digital format Storage and video indexing in a database system Search through the metadata to retrieve video content 3 3Wednesday, December 14, 11
  • 5. 0.056% 99.944% Daily retrieval in 2010 Repository Ingest Retrieval Motivation 0 80,000 160,000 240,000 320,000 400,000 2010 2011 2012 25,000 25,000 25,000 185,000 160,000 135,000 Stored Ingest Video hours per year Ingest and Retrieval Non-retrieved content 75h retrieved 5 5Wednesday, December 14, 11
  • 6. Motivation Current solution Proposed new approach Metadata by strata • Start time code • End time code • Content description - Action -Semantic shots - People Non- annotated assets Annotated assets GUI for semi-automatic annotation 6 6Wednesday, December 14, 11
  • 7. Outline Graphical User Interface Motivation Requirements Conclusions Semantic Shot Classifier Design Future work Evaluation Design Video-demo 7 7Wednesday, December 14, 11
  • 8. Semantic Shot Classifier • Annotation of the extracted keyframes Semantic Shot Medium Shot + Player Medium Shot + Box Requirements Semantic ConceptShot Type 8 8Wednesday, December 14, 11
  • 9. • President close-up • Close-up • Medium • General bureau • General chamber • Player close-up • Player medium • Stadium overview • Audience • Banner • Box Requirements Football Parlament 9 9Wednesday, December 14, 11
  • 10. Graphical User Interface • Embeddable in existing system (Digition) • User friendly • Domain-generic interface Requirements 10 10Wednesday, December 14, 11
  • 11. Outline Graphical User InterfaceRequirements Conclusions Semantic Shot Classifier Design Future work Evaluation Design Video-demo Motivation 11 11Wednesday, December 14, 11
  • 13. System Architecture Training: Test: >thr Decision Yes No MAX confidence ? Confidence 1 Confidence 2 Confidence N Classifier 2 Classifier N Classifier 1 Domain model MAX class Unknown SVM Trainer Model class 1 Model class 2 Model class N Design SVM Trainer SVM Trainer 13 13Wednesday, December 14, 11
  • 14. Annotation Football datasetFootball datasetFootball dataset Class 1 Medium 116 Class 2 Closer Up 87 Class 3 Stadium 4 Class 4 Audience 19 Class 5 Banner 4 Class 6 Box 6 Class 7 Unknown 278 514 Parliament datasetParliament datasetParliament dataset Class 1 President CU 16 Class 2 Close-up 63 Class 3 Medium 68 Class 4 Bureau 13 Class 5 Chamber 23 Class 6 Unknown 84 276 Design GAT annotation tool: http://upseek.upc.edu/gat 14 14Wednesday, December 14, 11
  • 15. Trainer parameters Unsupervised clustering to support multi-view: -MIN # elements / cluster -MAX cluster radius Design Detector parameters Confidence threshold >Thr. Decision Yes No Confidence Unknown 15 15Wednesday, December 14, 11
  • 16. Outline Graphical User InterfaceRequirements Conclusions Semantic Shot Classifier Design Future work Evaluation Design Video-demo Motivation 16 16Wednesday, December 14, 11
  • 17. Quality measure: F1-score Evaluation Class 1 Class 2 Class 3 Class1 Class 2 Class 3 4 1 0 0 3 0 2 0 3 Automatic M a n u a l 17 17Wednesday, December 14, 11
  • 18. Quality measure: F1-score Evaluation Class 1 Class 2 Class 3 Class 1 Class 2 Class 3 4 1 0 0 3 0 2 0 3 Automatic M a n u a l 18 18Wednesday, December 14, 11
  • 19. Cross Validation with Random Partition • Training - 80% of positive instances (+) - Positive instances from other classes are considered negative (-) Evaluation • Detection - 20% of the positive instances (+) 19 19Wednesday, December 14, 11
  • 20. 3 parameters to tune the classifier: • MAX radius for a multi-view cluster • MIN elements for a multi-view cluster • Confidence threshold Threshold (0.0 - 1.0) Evaluation Cross-validation iterations (3) Max radius (0.1 - 1.0) Min elements (0 - 5) F1 measure 20 20Wednesday, December 14, 11
  • 21. Results for Parliament domain • Optimal parameters – Threshold: 0.9 – MIN # elements: 3 – MAX radius: 0.8 Min Max radius elements 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1 0.718 0.962 0.952 0.949 0.958 0.970 0.957 0.957 0.949 0.946 2 0.715 0.962 0.955 0.952 0.952 0.962 0.951 0.951 0.966 0.965 3 0.769 0.948 0.968 0.960 0.960 0.957 0.959 0.972 0.944 0.958 4 0.763 0.955 0.965 0.962 0.956 0.963 0.968 0.966 0.954 0.964 5 0.719 0.970 0.966 0.947 0.958 0.959 0.958 0.961 0.953 0.967 Figure 7.4.: Catalan Parliament F1 measures for 0.9 minimum score Evaluation 21 21Wednesday, December 14, 11
  • 22. • “Unknown” labelled as “Close up”: 7.73% error • Semantic shots labelled as “Unknown”: 1.27% error F1 Measure • Close Up president: 0.98 • Close Up: 0.95 • Medium: 0.99 • General bureau: 0.98 • General chamber: 0.99 • Unknown: 0.94 Crom o Crom o PP Classe 3 CU Presi. CU Presi. Medium Bureau Chamber Unknown 93 0 0 0 0 3 0 376 0 0 0 2 0 0 405 0 0 3 0 0 0 75 0 3 0 0 0 0 135 3 0 39 0 0 0 465 Automatic M a n u a l CU Presi CU Med Bur Cha Un know Evaluation 22 22Wednesday, December 14, 11
  • 23. Results for Football domain • Optimal parameters – Threshold: 0.7 – MIN # elements: 1 – MAX radius: 0.8 Evaluation Min Max radius elements 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0 1 0.385 0.616 0.691 0.703 0.822 0.734 0.771 0.860 0.829 0.714 2 0 0 0.838 0.546 0.858 0.827 0.828 0.816 0.832 0.659 3 0 0 0.853 0.784 0.764 0.785 0.770 0.818 0.833 0.727 4 0 0 0 0 0 0 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0 Figure 7.1.: Soccer match F1 measures for 0.7 minimum score 23 23Wednesday, December 14, 11
  • 24. Cromo Cromo PP Classe 3 3 Cromo Cromo PP C.Beauty Public Pancarta Llotja Unknown 629 1 0 0 0 0 33 4 482 0 0 0 0 36 1 0 20 0 0 0 3 0 0 0 94 0 0 20 0 0 0 0 18 0 6 0 1 0 0 0 21 14 461 3 0 0 0 0 1204 Automatic M a n u a l Cromo Cromo PP Cromo Beauty Public Panc. Llotja Un know Automatic Class1 Class2 Class3 Class4 Class5 Class6 Class7 Class1 652 11 0 0 0 0 33 Class2 4 482 0 0 0 0 36 Class3 1 0 20 0 0 0 3 Class4 0 0 0 94 0 0 20 Class5 0 0 0 0 18 0 6 Class6 0 1 0 0 0 21 14 Class7 461 3 0 0 0 0 1204 Table 7.3.: Best soccer match confusion matrix instances Class1 Class2 Class3 Class4 Class5 Class6 Class7 n 0.58 0.97 1.0 1.0 1.0 1.0 0.91 F1 Measure • Close-UP: 0.72 • Medium: 0.95 • C.Beauty: 0.91 • Public: 0.90 • Banner: 0.86 • Box: 0.74 • Unknown: 0.81 Evaluation class that causes that problem and the percentage error of the confusion. Manual Automatic % Error Cromo PP Cromo 0.76% Cromo Cromo PP 1.58% Cromo Beauty Cromo 4.16% Llotja Cromo PP 2.77% Cap pla Cromo 27.63% Cap pla Cromo PP 0.17% Totes les classes Cap pla 7.9% Figure 7.3.: Soccer match detection errors 24 24Wednesday, December 14, 11
  • 25. Outline Graphical User InterfaceRequirements Conclusions Semantic Shot Classifier Design Future work Evaluation Design Video-demo Motivation 25 25Wednesday, December 14, 11
  • 27. Web services • List of keyframes given an asset ID • Keyframe classifier System architecture Data exchange JSON Format 00_00_09_24 00_00_39_25 00_00_54_48 00_00_59_41 Asset 1257 Class 3 Score: 0.83 Class 3 Score 0.75 Class 5 Score 0.46 Class 2 Score 0.98 00_00_39_25 Football MIN elems MAX radius 00_00_54_48 Football MIN elems MAX radius 00_00_59_41 Football MIN elems MAX radius 00_00_09_24 Football MIN elems MAX radius Design 27 27Wednesday, December 14, 11
  • 28. Outline Graphical User InterfaceRequirements Conclusions Semantic Shot Classifier Design Future work Evaluation Design Video-demo Motivation 28 28Wednesday, December 14, 11
  • 32. Outline Graphical User InterfaceRequirements Conclusions Semantic Shot Classifier Design Future work Evaluation Design Video-demo Motivation 32 32Wednesday, December 14, 11
  • 33. • Web service to use new annotations to re-train • Use semantic shot information to launch additional analysis - Synthetic and natural signs -- Text recognition - Close ups -- Face recognition - Medium Shot Parliament -- Speech recognition Future work +, (
  • 34. 9 , !
  • 35. 01 5
  • 36. IMT
  • 37. +
  • 38. * 1
  • 39. )
  • 41. Outline Graphical User InterfaceRequirements Conclusions Semantic Shot Classifier Design Future work Evaluation Design Video-demo Motivation 34 34Wednesday, December 14, 11
  • 42. Conclusions Semantic shot classifier • Multiclass classification combining binary classifiers - Positive labels in a class become negative in others - Grid search for classifier parameters • Generic framework for multiple domains • Results good enough for exploitation Graphical User Interface • Uses classifier through web services • Flexible and user friend: - Sorting by score - Threshold can be manually changed - Drag and drop -Validation by page or shot type - Keyframe highlight • Annotated keyframes are valid for: - Training - Retrieval 35 35Wednesday, December 14, 11
  • 43. Outline Graphical User InterfaceRequirements Conclusions Semantic Shot Classifier Design Future work Evaluation Design Video-demo Motivation Thank you for you attention Follow our research blog: http://bitsearch.blogspot.com 36 36Wednesday, December 14, 11