Deep Within-Class Covariance
Analysis for Robust Deep Audio
Representation Learning
Hamid Eghbal-zadeh¹,², Matthias Dorfer¹, Gerhard Widmer¹,²
Motivation Covariance Analysis WCCN DWCCA Results Summary
Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning
Motivation
● Convolutional Neural Networks (CNNs) learn useful features and build good representations
● CNNs are also known to generalize to unseen data
● Many of the benchmark datasets have similar train/test distributions
● What happens when the training and test distributions are mismatched?
Distribution mismatch:
The distribution of the data in the training and validation sets differs from that of the test set
● Speaker Recognition: training on English, testing on Chinese
● Acoustic Scene Classification: training on scenes in one country, testing on scenes from another country, recorded in another period of time
Performance of end-to-end CNNs (no mismatch vs. mismatched):
● We use the DCASE2016 (no mismatch) and DCASE2017 (mismatched) datasets¹
● Same training and validation sets, different test set
● We look at several end-to-end CNNs
1) Detection and Classification of Acoustic Scenes and Events, http://dcase.community
Covariance Analysis of the Representation
Covariance Eigenvalue Analysis:
● We train a VGG network on the no-mismatch and mismatched datasets, using spectrograms as input
● We analyse the internal representation of the VGG
● We use covariance analysis
○ Eigenvalues of the covariance matrix
○ Visualisation of the representations projected via PCA
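The two analysis steps listed above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `covariance_eigen_analysis` is hypothetical, and the random matrix `reps` is only a stand-in for representations that would be read out of a hidden layer of the trained network.

```python
import numpy as np

def covariance_eigen_analysis(reps, n_components=2):
    """Eigen-decompose the covariance of `reps` (n_samples x n_features).

    Returns the eigenvalues in descending order and the data projected
    onto the leading `n_components` principal axes (PCA).
    """
    centered = reps - reps.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (reps.shape[0] - 1)
    # eigh is meant for symmetric matrices; it returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    projected = centered @ eigvecs[:, :n_components]
    return eigvals, projected

# Assumption: synthetic data standing in for hidden-layer representations.
rng = np.random.default_rng(0)
reps = rng.normal(size=(200, 16)) * np.linspace(3.0, 0.1, 16)
eigvals, projected = covariance_eigen_analysis(reps)
```

Running the same function on the train, validation and test representations and comparing the eigenvalue spectra is one way to make differences between the splits visible.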
Covariance Eigenvalue Analysis:
[Figure: covariance eigenvalue plots for the train, validation and test sets, for the no-mismatch and mismatched datasets]
Visualisation of the VGG representations:
[Figure: VGG representations of the train, validation and test sets, for the no-mismatch and mismatched datasets]
Within-Class Covariance Normalisation (WCCN)
Within-Class Covariance Normalization¹,²:
● Proposed for speaker recognition to reduce false positives and false negatives
● Used to reduce the within-class variability in features such as GMM supervectors or i-vectors
1) Hatch, Andrew O., et al. "Within-class covariance normalization for SVM-based speaker recognition." Ninth International Conference on Spoken Language Processing, 2006.
2) Hatch, Andrew O., et al. "Generalized linear kernels for one-versus-all classification: application to speaker recognition." 2006 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2006), Vol. 5. IEEE, 2006.
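For readers unfamiliar with WCCN, the sketch below follows the formulation commonly given in the speaker recognition literature: estimate the within-class covariance W as the average of the per-class covariances, factor its inverse as B Bᵀ with a Cholesky decomposition, and project each feature vector with Bᵀ. The function names, the small ridge term `eps` and the synthetic data are assumptions made for illustration; none of them come from the slides.

```python
import numpy as np

def within_class_covariance(x, labels):
    """Average of the per-class covariance matrices of x (n_samples x dim)."""
    classes = np.unique(labels)
    dim = x.shape[1]
    w = np.zeros((dim, dim))
    for c in classes:
        xc = x[labels == c]
        centered = xc - xc.mean(axis=0, keepdims=True)
        w += centered.T @ centered / xc.shape[0]
    return w / len(classes)

def wccn_projection(x, labels, eps=1e-6):
    """Return B with B @ B.T equal to the inverse of the regularised W."""
    w = within_class_covariance(x, labels)
    w = w + eps * np.eye(w.shape[0])  # assumed ridge term for stability
    return np.linalg.cholesky(np.linalg.inv(w))

# Assumption: synthetic features with 4 classes and unequal variances.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(4), 50)
class_means = rng.normal(scale=5.0, size=(4, 8))
noise = rng.normal(size=(200, 8)) * np.linspace(2.0, 0.2, 8)
x = class_means[labels] + noise

b = wccn_projection(x, labels)
x_wccn = x @ b  # row-wise form of B^T x
w_after = within_class_covariance(x_wccn, labels)
```

After the projection, the within-class covariance of `x_wccn` is close to the identity matrix, which is the sense in which WCCN normalises within-class variability.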
Deep Within-Class Covariance Analysis
Deep Within-Class Covariance Analysis (DWCCA):
● A deep learning compatible version of WCCN
● A statistical deep learning layer, trained end-to-end using SGD with minibatches
● Can be placed anywhere in the network to reduce the within-class variability
● In training, B is equal to B_b in the forward pass
● Gradients with respect to B are computed and used in the backward pass
● A running average is computed for test time (similar to batchnorm)
● Compatible with different supervised tasks (classification, detection, metric learning, ...) and data (raw audio, ...)
● Can be used with different supervised losses (CCE, BCE, L2, ...)
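A rough forward-pass sketch of such a layer is shown below. It is not the authors' implementation: the class name `DWCCASketch`, the `momentum` and `eps` values, and the choice to keep the running average over the within-class covariance (the slides do not say which quantity is averaged) are all assumptions. The backward pass is not shown; in a framework with automatic differentiation the gradients through the projection matrix would be derived from the same operations.

```python
import numpy as np

class DWCCASketch:
    """Forward pass of a WCCN-style layer using minibatch statistics."""

    def __init__(self, dim, momentum=0.9, eps=1e-3):
        self.momentum = momentum          # assumed running-average factor
        self.eps = eps                    # assumed ridge term
        self.running_w = np.eye(dim)      # running within-class covariance

    def _batch_within_cov(self, x, labels):
        dim = x.shape[1]
        classes = np.unique(labels)
        w = np.zeros((dim, dim))
        for c in classes:
            xc = x[labels == c]
            centered = xc - xc.mean(axis=0, keepdims=True)
            w += centered.T @ centered / xc.shape[0]
        return w / len(classes)

    def _projection(self, w):
        w = w + self.eps * np.eye(w.shape[0])
        return np.linalg.cholesky(np.linalg.inv(w))

    def forward(self, x, labels=None, training=False):
        if training:
            # Training: project with the matrix estimated on this minibatch.
            w_b = self._batch_within_cov(x, labels)
            self.running_w = (self.momentum * self.running_w
                              + (1.0 - self.momentum) * w_b)
            b = self._projection(w_b)
        else:
            # Test time: labels are unavailable, use the running estimate.
            b = self._projection(self.running_w)
        return x @ b

# Assumption: synthetic two-class minibatches with unequal variances.
rng = np.random.default_rng(0)
layer = DWCCASketch(dim=4)
scales = np.array([3.0, 2.0, 1.0, 0.5])
for _ in range(200):
    labels = np.repeat(np.arange(2), 32)
    x = rng.normal(size=(64, 4)) * scales + 4.0 * labels[:, None]
    out_train = layer.forward(x, labels, training=True)
out_test = layer.forward(x, training=False)
```

Keeping a running estimate mirrors the batchnorm analogy on the slide: class labels are needed to compute the minibatch statistics, so at test time the layer has to rely on what was accumulated during training.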
Results
Within-Class Covariance Eigenvalue Analysis (Without DWCCA):
[Figure: within-class covariance eigenvalue plots for the train, validation and test sets, for the no-mismatch and mismatched datasets]
Within-Class Covariance Eigenvalue Analysis (With DWCCA):
[Figure: within-class covariance eigenvalue plots for the train, validation and test sets, for the no-mismatch and mismatched datasets]
Eigenvalue Analysis (With vs. without DWCCA):
[Figure: eigenvalue plots for the train, validation and test sets, for the no-mismatch and mismatched datasets]
K-NN classification results on VGG representations:
[Figure: validation and test results for the no-mismatch and mismatched datasets]
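End-to-end classification:
[Table: results on the no-mismatch and mismatched datasets]
Legend: "*" marks a single model with single-channel features; two further markers (symbols not recoverable) denote multi-channel features and an ensemble of various models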
End-to-end class-wise F1:
[Figure: class-wise F1 scores for the mismatched and no-mismatch datasets]
Summary
Summary:
● We analysed the covariance of the representations in a VGG network
● We showed that the greater the mismatch between training and test data, the greater the within-class variability of the representation
● We proposed Deep Within-Class Covariance Analysis (DWCCA), a deep learning compatible layer capable of significantly reducing the within-class variability of a network's representation
● We showed empirically that DWCCA improves generalisation when the training and test sets have mismatched distributions
Thank you for your attention!
Come to the poster for more
discussions.
hamid.eghbal-zadeh@jku.at
heghbalz

Más contenido relacionado

Similar a Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning

Similar a Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning (10)

Stable Diffusion path
Stable Diffusion pathStable Diffusion path
Stable Diffusion path
 
Idiom Token Classification using Sentential Distributed Semantics (Giancarlo ...
Idiom Token Classification using Sentential Distributed Semantics (Giancarlo ...Idiom Token Classification using Sentential Distributed Semantics (Giancarlo ...
Idiom Token Classification using Sentential Distributed Semantics (Giancarlo ...
 
Audio and Vision (D2L9 Insight@DCU Machine Learning Workshop 2017)
Audio and Vision (D2L9 Insight@DCU Machine Learning Workshop 2017)Audio and Vision (D2L9 Insight@DCU Machine Learning Workshop 2017)
Audio and Vision (D2L9 Insight@DCU Machine Learning Workshop 2017)
 
Introduction to deep learning based voice activity detection
Introduction to deep learning based voice activity detectionIntroduction to deep learning based voice activity detection
Introduction to deep learning based voice activity detection
 
Audio and Vision (D4L6 2017 UPC Deep Learning for Computer Vision)
Audio and Vision (D4L6 2017 UPC Deep Learning for Computer Vision)Audio and Vision (D4L6 2017 UPC Deep Learning for Computer Vision)
Audio and Vision (D4L6 2017 UPC Deep Learning for Computer Vision)
 
Fudan-NJUST at MediaEval 2014: Violent Scenes Detection Using Deep Neural Net...
Fudan-NJUST at MediaEval 2014: Violent Scenes Detection Using Deep Neural Net...Fudan-NJUST at MediaEval 2014: Violent Scenes Detection Using Deep Neural Net...
Fudan-NJUST at MediaEval 2014: Violent Scenes Detection Using Deep Neural Net...
 
Tiancheng Zhao - 2017 - Learning Discourse-level Diversity for Neural Dialog...
Tiancheng Zhao - 2017 -  Learning Discourse-level Diversity for Neural Dialog...Tiancheng Zhao - 2017 -  Learning Discourse-level Diversity for Neural Dialog...
Tiancheng Zhao - 2017 - Learning Discourse-level Diversity for Neural Dialog...
 
EMNLP 2014: Opinion Mining with Deep Recurrent Neural Network
EMNLP 2014: Opinion Mining with Deep Recurrent Neural NetworkEMNLP 2014: Opinion Mining with Deep Recurrent Neural Network
EMNLP 2014: Opinion Mining with Deep Recurrent Neural Network
 
From Semantics to Self-supervised Learning for Speech and Beyond
From Semantics to Self-supervised Learning for Speech and BeyondFrom Semantics to Self-supervised Learning for Speech and Beyond
From Semantics to Self-supervised Learning for Speech and Beyond
 
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
Content-Based Image Retrieval (D2L6 Insight@DCU Machine Learning Workshop 2017)
 

Último

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptx
MohamedFarag457087
 
biology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGYbiology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGY
1301aanya
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Sérgio Sacani
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
Silpa
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
Areesha Ahmad
 
Human genetics..........................pptx
Human genetics..........................pptxHuman genetics..........................pptx
Human genetics..........................pptx
Silpa
 

Último (20)

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort ServiceCall Girls Ahmedabad +917728919243 call me Independent Escort Service
Call Girls Ahmedabad +917728919243 call me Independent Escort Service
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
Digital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptxDigital Dentistry.Digital Dentistryvv.pptx
Digital Dentistry.Digital Dentistryvv.pptx
 
Introduction of DNA analysis in Forensic's .pptx
Introduction of DNA analysis in Forensic's .pptxIntroduction of DNA analysis in Forensic's .pptx
Introduction of DNA analysis in Forensic's .pptx
 
biology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGYbiology HL practice questions IB BIOLOGY
biology HL practice questions IB BIOLOGY
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICEPATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
PATNA CALL GIRLS 8617370543 LOW PRICE ESCORT SERVICE
 
POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.POGONATUM : morphology, anatomy, reproduction etc.
POGONATUM : morphology, anatomy, reproduction etc.
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 
300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx300003-World Science Day For Peace And Development.pptx
300003-World Science Day For Peace And Development.pptx
 
Human genetics..........................pptx
Human genetics..........................pptxHuman genetics..........................pptx
Human genetics..........................pptx
 
COMPUTING ANTI-DERIVATIVES (Integration by SUBSTITUTION)
COMPUTING ANTI-DERIVATIVES(Integration by SUBSTITUTION)COMPUTING ANTI-DERIVATIVES(Integration by SUBSTITUTION)
COMPUTING ANTI-DERIVATIVES (Integration by SUBSTITUTION)
 
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS  ESCORT SERVICE In Bhiwan...
Bhiwandi Bhiwandi ❤CALL GIRL 7870993772 ❤CALL GIRLS ESCORT SERVICE In Bhiwan...
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdf
 
Velocity and Acceleration PowerPoint.ppt
Velocity and Acceleration PowerPoint.pptVelocity and Acceleration PowerPoint.ppt
Velocity and Acceleration PowerPoint.ppt
 
Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.
 
GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)GBSN - Microbiology (Unit 1)
GBSN - Microbiology (Unit 1)
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate ProfessorThyroid Physiology_Dr.E. Muralinath_ Associate Professor
Thyroid Physiology_Dr.E. Muralinath_ Associate Professor
 

Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning

  • 1.
  • 2. Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Hamid Eghbal-zadeh 1,2 , Matthias Dorfer 1 , Gerhard Widmer 1,2 1 2
  • 3. Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Hamid Eghbal-zadeh 1,2 , Matthias Dorfer 1 , Gerhard Widmer 1,2 1 2
  • 4. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Motivation
  • 5. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning ● Convolutional Neural Networks learn useful features and build good representations
  • 6. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning ● Convolutional Neural Networks learn useful features and build good representations ● CNNs are also known to generalize on the unseen data
  • 7. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning ● Convolutional Neural Networks learn useful features and build good representations ● CNNs are also known to generalize on the unseen data ● Many of the benchmark datasets have similar train/test distributions
  • 8. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning ● Convolutional Neural Networks learn useful features and build good representations ● CNNs are also known to generalize on the unseen data ● Many of the benchmark datasets have similar train/test distributions ● How about a distribution mismatch between training and test?
  • 9. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Distribution mismatch: When the distribution of the data in training and validation sets differ from the test set
  • 10. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Distribution mismatch: When the distribution of the data in training and validation sets differ from the test set ● Speaker Recognition: Training on English, testing on Chinese
  • 11. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Distribution mismatch: When the distribution of the data in training and validation sets differ from the test set ● Speaker Recognition: Training on English, testing on Chinese ● Acoustic Scene Classification: Training on Scenes in one country, testing on scenes of another country, in another period of time
  • 12. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Distribution mismatch: When the distribution of the data in training and validation sets differ from the test set ● Speaker Recognition: Training on English, testing on Chinese ● Acoustic Scene Classification: Training on Scenes in one country, testing on scenes of another country, in another period of time
  • 13. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Performance of end-to-end CNNs (no mismatch vs mismatched): ● We use DCASE2016 (no mismatch) and DCASE2017 (mismatched) datasets1 ● Same training and validation, different test set ● Look at several end-to-end CNNs 1) Detection and Classification of Acoustic Scenes and Events, http://dcase.community
  • 14. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Covariance Analysis of the representation
  • 15. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Covariance Eigenvalue Analysis: ● We train a VGG network on No mismatch and Mismatched using spectrograms
  • 16. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Covariance Eigenvalue Analysis: ● We train a VGG network on No mismatch and Mismatched using spectrograms ● We analyse the internal representation of the VGG
  • 17. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Covariance Eigenvalue Analysis: ● We train a VGG network on No mismatch and Mismatched using spectrograms ● We analyse the internal representation of the VGG ● We use covariance analysis ○ Eigen-values of the covariances matrix ○ Visualisation of the representations projected via PCA
  • 18. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning Nomismatch Covariance Eigenvalue Analysis: Train Test Mismatched Validation
  • 19. Motivation Covariance Analysis WCCN DWCCA Results Summary Deep Within-Class Covariance Analysis for Robust Deep Audio Representation Learning NomismatchVisualisation of the VGG representations: Train Validation Test Mismatched
  • 20. Within-Class Covariance Normalisation (WCCN)
  • 21-23. Within-Class Covariance Normalisation [1, 2]: ● Proposed for speaker recognition to reduce false positives and false negatives ● Used to reduce within-class variability in features such as GMM supervectors or i-vectors. References: 1) Hatch, Andrew O., et al. "Within-class covariance normalization for SVM-based speaker recognition." Ninth International Conference on Spoken Language Processing. 2006. 2) Hatch, Andrew O., et al. "Generalized linear kernels for one-versus-all classification: application to speaker recognition." 2006 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2006) Proceedings. Vol. 5. IEEE, 2006.
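The WCCN idea can be sketched as follows, under stated assumptions: the within-class covariance S_w is taken as the average of the per-class covariances, and the projection B is the Cholesky factor of its inverse, so that B Bᵀ = S_w⁻¹. The function name `wccn_projection`, the regulariser `eps` and the synthetic data are hypothetical; the formula on the original slide is not preserved in this transcript.

```python
import numpy as np


def wccn_projection(features, labels, eps=1e-6):
    """Return a lower-triangular B with B @ B.T == inv(S_w).

    S_w is the average per-class covariance (assumed definition);
    eps is a small ridge term added to keep S_w invertible.
    """
    classes = np.unique(labels)
    dim = features.shape[1]
    s_w = np.zeros((dim, dim))
    for c in classes:
        x_c = features[labels == c]
        centered = x_c - x_c.mean(axis=0)
        s_w += centered.T @ centered / len(x_c)
    s_w = s_w / len(classes) + eps * np.eye(dim)
    return np.linalg.cholesky(np.linalg.inv(s_w))


# Synthetic features with unequal spread per dimension.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(4), 100)
scales = np.array([1.0, 2.0, 0.5, 3.0, 1.5])
features = rng.normal(size=(400, 5)) * scales + 4.0 * labels[:, None]

b = wccn_projection(features, labels)
projected = features @ b  # each row x becomes B^T x
```

After the projection, the within-class covariance of `projected` is approximately the identity matrix, which is the sense in which within-class variability is normalised.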
  • 24. Deep Within-Class Covariance Analysis
  • 25-32. Deep Within-Class Covariance Analysis (DWCCA): ● A deep-learning-compatible version of WCCN ● A statistical deep learning layer, trained end-to-end using SGD with minibatches ● Can be placed anywhere in the network to reduce within-class variability ● During training, B in the forward pass equals the minibatch estimate B_b ● Gradients w.r.t. B are computed and used in the backward pass ● A running average is kept for use at test time (similar to batch normalisation) ● Compatible with different supervised tasks (classification, detection, metric learning, ...) and data (raw audio, ...) ● Can be used with different supervised losses (CCE, BCE, ℓ2, ...)
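A forward-pass-only sketch of a DWCCA-style layer in NumPy, to illustrate the train/test behaviour listed above. It is not the authors' implementation: the class name `DWCCASketch`, the `momentum` and `eps` parameters, and the choice to keep the running average over the projection matrix B itself are all assumptions. Gradients are not implemented here; an end-to-end version would need an automatic differentiation framework.

```python
import numpy as np


class DWCCASketch:
    """Forward-pass-only sketch of a DWCCA-style layer (no gradients)."""

    def __init__(self, dim, momentum=0.1, eps=1e-3):
        self.momentum = momentum      # hypothetical running-average rate
        self.eps = eps                # hypothetical ridge regulariser
        self.running_b = np.eye(dim)  # projection used at test time

    def _batch_projection(self, x, y):
        """B_b: Cholesky factor of the inverse minibatch within-class covariance."""
        dim = x.shape[1]
        classes = np.unique(y)
        s_w = np.zeros((dim, dim))
        for c in classes:
            x_c = x[y == c]
            centered = x_c - x_c.mean(axis=0)
            s_w += centered.T @ centered / len(x_c)
        s_w = s_w / len(classes) + self.eps * np.eye(dim)
        return np.linalg.cholesky(np.linalg.inv(s_w))

    def forward(self, x, y=None, training=False):
        if training:
            # Training: use the minibatch estimate and update the running average.
            b = self._batch_projection(x, y)
            self.running_b = (1 - self.momentum) * self.running_b + self.momentum * b
        else:
            # Test time: labels are unavailable, so use the running average.
            b = self.running_b
        return x @ b


rng = np.random.default_rng(0)
layer = DWCCASketch(dim=4)
for _ in range(50):
    y = np.tile(np.arange(4), 16)  # 64 samples, 4 balanced classes
    x = 3.0 * rng.normal(size=(64, 4)) + 10.0 * y[:, None]
    out = layer.forward(x, y, training=True)
test_out = layer.forward(x, training=False)
```

As with batch normalisation, the layer needs class labels only during training; at test time it applies the stored projection to unlabelled inputs.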
  • 33. Results
  • 34. Within-Class Covariance Eigenvalue Analysis (without DWCCA) [figure: Train, Validation and Test panels; no-mismatch vs. mismatched conditions]
  • 35. Within-Class Covariance Eigenvalue Analysis (with DWCCA) [figure: Train, Validation and Test panels; no-mismatch vs. mismatched conditions]
  • 36. Eigenvalue Analysis (with vs. without DWCCA) [figure: Train, Validation and Test panels; no-mismatch vs. mismatched conditions]
  • 37. K-NN classification results on VGG representations [figure: Validation and Test panels; no-mismatch vs. mismatched conditions]
  • 38-42. End-to-end classification [results table not preserved; conditions: no mismatch, mismatched]. Legend: * = single model, single-channel features; two further markers, lost in extraction, denote multi-channel features and an ensemble of various models
  • 43-44. End-to-end class-wise F1 [figure: mismatched and no-mismatch conditions]
  • 45. Summary
  • 46-49. Summary: ● We analysed the covariance of the representations in a VGG network ● We showed that the greater the mismatch between training and test data, the greater the within-class variability in the representation ● We proposed Deep Within-Class Covariance Analysis (DWCCA), a deep-learning-compatible layer capable of significantly reducing the within-class variability of a network’s representation ● We showed empirically that DWCCA improves generalisation when the training and test data have mismatched distributions [figures: Train, Validation and Test panels; no-mismatch vs. mismatched conditions]
  • 50. Thank you for your attention! Come to the poster for more discussion. hamid.eghbal-zadeh@jku.at, heghbalz