YouTube-8M: A Large-Scale Video Classification
Benchmark (and Google Cloud ML Engine)
Slides by Dídac Surís
ReadAI Reading Group, UPC
13th March, 2017
Sami Abu-El-Haija, Nisarg Kothari, Joonseok Lee, Paul
Natsev, George Toderici, Balakrishnan Varadarajan,
Sudheendra Vijayanarasimhan
[arxiv] (27 Sep 2016) [web]
Index
1. YouTube-8M
a. Dataset
b. Baseline approaches
c. Results
2. Google Cloud ML Engine
YouTube-8M: Dataset
Main features
● Multi-label (average 1.8)
● 4800 entities (24 top-level categories)
● 8,264,650 videos
● 500K hours of video
● Only visual entities
● Remove computational barriers
YouTube-8M: Dataset
Collection
● YouTube video annotation system (metadata, context, …)
● First step: define entities
○ Human ratings to define entities (only visual ones)
○ At least 200 videos per entity
● Second step: collect videos
○ 10 M randomly sampled videos
○ Discard videos according to several criteria
○ Split into train/validate/test
YouTube-8M: Dataset
Feature Extraction
● 50 years of video: processing it in real time is impractical
● Sampling at 1 frame per second
● Frame-level feature extraction: take the ReLU activation of the last hidden layer of the Inception network trained on ImageNet
● 2048 dimensions; PCA + quantization reduce the size 8x
● Audio features also extracted later:
https://www.kaggle.com/c/youtube8m/discussion/29475
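A rough sketch of the arithmetic behind this slide, in plain Python. The slide only states "PCA + quantization" and "8x". The split used here (2048 float32 values reduced to 1024 dimensions stored at one byte each) and the quantization range [-2, 2] are assumptions made for illustration. The PCA step itself is not implemented.

```python
def years_of_video(hours):
    """Convert hours of video into years of continuous playback."""
    return hours / (24 * 365)


def quantize(values, lo=-2.0, hi=2.0):
    """Uniformly map floats to integer codes 0..255 (one byte per value).

    The [lo, hi] range is an assumption; values outside it are clipped.
    """
    scale = 255.0 / (hi - lo)
    return [min(255, max(0, round((v - lo) * scale))) for v in values]


def dequantize(codes, lo=-2.0, hi=2.0):
    """Approximate inverse of quantize()."""
    step = (hi - lo) / 255.0
    return [c * step + lo for c in codes]


def compression_ratio(dims_before, bytes_before, dims_after, bytes_after):
    """Size of the original vector divided by the size of the reduced one."""
    return (dims_before * bytes_before) / (dims_after * bytes_after)


if __name__ == "__main__":
    print(years_of_video(500_000))             # a bit over 57 years
    print(compression_ratio(2048, 4, 1024, 1)) # 8.0 under the assumed split
    print(dequantize(quantize([0.3, -1.5, 5.0])))
```

With an 8-bit code over a range of width 4, the round-trip error for in-range values is at most half a quantization step, about 0.008.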
YouTube-8M: Dataset
Not perfect ground truth
● 78.8 % precision
● 14.5 % recall
YouTube-8M: Baseline approaches
Frame-level
Training of 4800 independent one-vs-all classifiers
1. Average pooling + logistic
○ The frame-level probabilities are aggregated to the video level using a simple average
2. Deep Bag of Frame (DBoF) Pooling
○ k frames projected to an M-dimensional space with ReLU activations
○ Batch normalization
○ Aggregation of frames with max-pooling
3. LSTM
○ 2 LSTM layers with 1024 hidden units
○ Linearly increasing per-frame weights, from 1/N for the first frame to 1 for the last frame
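The three aggregation ideas on this slide can be sketched in a few lines of plain Python. This is an illustrative sketch only: the function names are hypothetical, the learned parts (logistic classifier, DBoF projection, LSTM) are omitted, and the slide does not say how the per-frame weights enter the LSTM training, so only the weight schedule is shown.

```python
def average_pool(frame_vectors):
    """Average per-frame vectors (e.g. class probabilities) over all frames."""
    n = len(frame_vectors)
    return [sum(column) / n for column in zip(*frame_vectors)]


def max_pool(frame_vectors):
    """Element-wise maximum over frames, as used to aggregate DBoF frames."""
    return [max(column) for column in zip(*frame_vectors)]


def linear_frame_weights(num_frames):
    """Weights rising linearly from 1/N (first frame) to 1 (last frame)."""
    return [t / num_frames for t in range(1, num_frames + 1)]


if __name__ == "__main__":
    # Two frames, two classes (made-up numbers).
    frames = [[0.25, 1.0], [0.75, 0.0]]
    print(average_pool(frames))     # [0.5, 0.5]
    print(max_pool(frames))         # [0.75, 1.0]
    print(linear_frame_weights(4))  # [0.25, 0.5, 0.75, 1.0]
```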
YouTube-8M: Baseline approaches
Video-level
The only difference: frame features are now combined before the neural network into fixed-length video features
● Mean, standard deviation, top-5 ordinal statistics
● Subsequent normalization (subtract mean, PCA)
Online learning algorithms instead of batch optimization (?)
1. Logistic regression
2. SVM (online) + hinge loss
3. Mixture of Experts
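A minimal sketch of how a variable number of frame vectors could be turned into one fixed-length video vector from the statistics listed above. Assumptions: the function name is hypothetical, the population standard deviation is used, the statistics are simply concatenated, and the later normalization step (mean subtraction, PCA) is left out.

```python
import math


def video_level_features(frames, k=5):
    """Concatenate per-dimension mean, standard deviation and top-k values.

    frames: list of equal-length frame feature vectors.
    If a video has fewer than k frames, fewer top values are returned.
    """
    n = len(frames)
    dims = len(frames[0])
    means, stds, tops = [], [], []
    for j in range(dims):
        column = [frame[j] for frame in frames]
        mean = sum(column) / n
        variance = sum((x - mean) ** 2 for x in column) / n
        means.append(mean)
        stds.append(math.sqrt(variance))
        # Ordinal statistics: the k largest values of this dimension.
        tops.extend(sorted(column, reverse=True)[:k])
    return means + stds + tops


if __name__ == "__main__":
    frames = [[1.0, 0.0], [3.0, 4.0], [2.0, 2.0]]
    print(video_level_features(frames, k=2))
```

For a D-dimensional frame feature and at least k frames, the output has D * (2 + k) entries regardless of video length.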
YouTube-8M: Results
Evaluation metrics and comparison
● Mean Average Precision (Precision, Recall)
● Hit@k
● Precision at equal recall rate (PERR)
These are results on the validation set. Results on the human-rated test set are consistent.
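The three metrics can be sketched for a single example as follows. This reflects one common reading of the definitions, stated here as assumptions: Hit@k checks whether any true label is among the k highest-scored labels of a video, PERR is the precision when a video retrieves as many labels as it has true labels, and average precision is computed over one ranked list (mAP would then average it over classes). All function names are hypothetical.

```python
def ranked(scores):
    """Indices sorted by decreasing score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def hit_at_k(scores, true_labels, k):
    """True if at least one true label is among the top-k predictions."""
    return any(i in true_labels for i in ranked(scores)[:k])


def perr(scores, true_labels):
    """Precision when retrieving exactly len(true_labels) items."""
    n = len(true_labels)
    top = ranked(scores)[:n]
    return sum(i in true_labels for i in top) / n


def average_precision(scores, relevant):
    """Mean of precision values at the ranks where a relevant item appears."""
    hits, total = 0, 0.0
    for rank, i in enumerate(ranked(scores), start=1):
        if i in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


if __name__ == "__main__":
    scores = [0.9, 0.1, 0.8, 0.3]   # made-up scores for four labels
    truth = {0, 3}
    print(hit_at_k(scores, truth, k=1))
    print(perr(scores, truth))
    print(average_precision(scores, truth))
```

Dataset-level numbers would be obtained by averaging these per-example values, over videos for Hit@k and PERR and over classes for mAP.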
YouTube-8M: Results
Results on other datasets (transfer learning)
● Sports-1M
● ActivityNet
Google Cloud Machine Learning Engine
Basics
● Google Cloud Platform: $300 trial
● Google Cloud Shell
● Pricing
○ Training: ML units (depending on scale tier) × hours
○ Prediction: per hour + number of predictions
● Google Cloud Storage for the results
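The pricing rule above can be written as a small cost estimate. This is only a sketch: the slide gives the structure (ML units times hours for training, an hourly part plus a per-prediction part for prediction) but no prices, so every rate below is a hypothetical placeholder rather than an actual Google Cloud price.

```python
def training_cost(ml_units, hours, price_per_unit_hour):
    """Training cost: ML units of the scale tier x hours x unit price."""
    return ml_units * hours * price_per_unit_hour


def prediction_cost(node_hours, predictions,
                    price_per_node_hour, price_per_1000_predictions):
    """Prediction cost: hourly part plus a part per number of predictions."""
    hourly = node_hours * price_per_node_hour
    per_request = predictions / 1000 * price_per_1000_predictions
    return hourly + per_request


if __name__ == "__main__":
    # Placeholder rates chosen only to make the arithmetic easy to follow.
    print(training_cost(ml_units=2, hours=3, price_per_unit_hour=0.5))
    print(prediction_cost(2, 2000, 0.5, 0.25))
```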
Google Cloud Machine Learning Engine
Task submission
Google Cloud Machine Learning Engine
TensorBoard

Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
 
Curriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object SegmentationCurriculum Learning for Recurrent Video Object Segmentation
Curriculum Learning for Recurrent Video Object Segmentation
 
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020
 

YouTube-8M: A Large-Scale Video Classification Benchmark (UPC Reading Group)

  • 1. YouTube-8M: A Large-Scale Video Classification Benchmark (and Google Cloud ML Engine) Slides by Dídac Surís ReadAI Reading Group, UPC 13th March, 2017 Sami Abu-El-Haija, Nisarg Kothari, Joonseok Lee, Paul Natsev, George Toderici, Balakrishnan Varadarajan, Sudheendra Vijayanarasimhan [arxiv] (27 Sep 2016) [web]
  • 2. Index 1. YouTube-8M a. Dataset b. Baseline approaches c. Results 2. Google Cloud ML Engine
  • 3. Index 1. YouTube-8M a. Dataset b. Baseline approaches c. Results 2. Google Cloud ML Engine
  • 4. YouTube-8M: Dataset Main features ● Multi-label (average 1.8 labels per video) ● 4800 entities (24 top-level categories) ● 8,264,650 videos ● 500K hours of video ● Only visual entities ● Remove computational barriers
  • 5. YouTube-8M: Dataset Collection ● YouTube video annotation system (metadata, context, …) ● First step: define entities ○ Human ratings to define entities (only visual ones) ○ At least 200 videos per entity ● Second step: collect videos ○ 10M randomly sampled videos ○ Discard according to several criteria ○ Split into train/validate/test
  • 6. YouTube-8M: Dataset Feature Extraction ● Processing 50 years of video in real time: impractical ● Sampling at 1 frame per second ● Frame-level feature extraction: fetch the ReLU activation of the last hidden layer from the Inception network trained on ImageNet ● 2048 dimensions; PCA + quantization reduce the size 8x ● Audio features also extracted later: https://www.kaggle.com/c/youtube8m/discussion/29475
  • 7. YouTube-8M: Dataset Not perfect ground truth ● 78.8 % precision ● 14.5 % recall
  • 8. Index 1. YouTube-8M a. Dataset b. Baseline approaches c. Results 2. Google Cloud ML Engine
  • 9. YouTube-8M: Baseline approaches Frame-level Training of 4800 independent one-vs-all classifiers 1. Average pooling + logistic ○ The frame-level probabilities are aggregated to the video level using a simple average 2. Deep Bag of Frames (DBoF) Pooling ○ k frames projected to an M-dimensional space with ReLU activations ○ Batch normalization ○ Aggregation of frames with max-pooling 3. LSTM ○ 2 LSTM layers with 1024 hidden units ○ Linearly increasing per-frame weights, going from 1/N for the first frame to 1 for the last frame.
  • 10. YouTube-8M: Baseline approaches Video-level The only difference is that the frame features are now combined before the neural network into fixed-length video features ● Mean, standard deviation, top 5 ordinal statistics ● Posterior normalization (subtract mean, PCA) Online learning algorithms instead of batch optimization (?) 1. Logistic regression 2. SVM (online) + hinge loss 3. Mixture of Experts
  • 11. Index 1. YouTube-8M a. Dataset b. Baseline approaches c. Results 2. Google Cloud ML Engine
  • 12. YouTube-8M: Results Evaluation metrics and comparison ● Mean Average Precision (Precision, Recall) ● Hit @k ● Precision at equal recall rate (PERR) These are results on the validation set. On the human rated test set the results are consistent.
  • 13. YouTube-8M: Results Results on other datasets (transfer learning) ● Sports-1M ● ActivityNet
  • 14. Index 1. YouTube-8M a. Dataset b. Baseline approaches c. Results 2. Google Cloud ML Engine
  • 15. Google Cloud Machine Learning Engine Basics ● Google Cloud Platform: $300 trial ● Google Cloud Shell ● Pricing ○ Training: ML units (depending on scale tier) × hours ○ Prediction: per hour + number of predictions ● Google Cloud Storage for the results
  • 16. Google Cloud Machine Learning Engine Task submission
  • 17. Google Cloud Machine Learning Engine TensorBoard
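
Two of the evaluation metrics listed on slide 12, Hit@k and PERR, can be illustrated with a minimal sketch. It assumes the usual reading of these metrics: Hit@k is the fraction of videos with at least one ground-truth label among the k highest-scored classes, and PERR is the precision of the top G predictions per video, where G is that video's number of ground-truth labels, averaged over videos. The function names, the toy scores and the toy labels below are hypothetical and do not come from the slides.

```python
def rank_classes(scores):
    """Return class indices sorted from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda c: -scores[c])


def hit_at_k(all_scores, all_labels, k):
    """Fraction of videos with at least one true label in the top-k classes."""
    hits = 0
    for scores, labels in zip(all_scores, all_labels):
        top_k = rank_classes(scores)[:k]
        if any(c in labels for c in top_k):
            hits += 1
    return hits / len(all_scores)


def perr(all_scores, all_labels):
    """Precision at equal recall rate, averaged over videos.

    For a video with G ground-truth labels, take the G highest-scored
    classes and measure the fraction of them that are correct.
    """
    precisions = []
    for scores, labels in zip(all_scores, all_labels):
        g = len(labels)
        if g == 0:
            continue  # assumption: videos without labels are skipped
        top_g = rank_classes(scores)[:g]
        precisions.append(sum(c in labels for c in top_g) / g)
    return sum(precisions) / len(precisions)


# Toy example (made-up numbers): 3 videos, 4 classes.
SCORES = [
    [0.9, 0.1, 0.8, 0.2],
    [0.3, 0.7, 0.2, 0.6],
    [0.1, 0.2, 0.3, 0.4],
]
LABELS = [{0, 3}, {1, 3}, {0}]  # ground-truth class indices per video

if __name__ == "__main__":
    print("Hit@1:", hit_at_k(SCORES, LABELS, 1))
    print("PERR: ", perr(SCORES, LABELS))
```

Because the dataset averages fewer than two labels per video, PERR mostly inspects only the top one or two predictions, which is why it tends to track Hit@1 closely.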