Audio Source Separation Based on Low-Rank
Structure and Statistical Independence
The University of Tokyo
Research Associate
Daichi Kitamura
Nagoya University, Lecture
May 30, 2017
Introduction
• Daichi Kitamura (北村大地)
• Research Associate at The University of Tokyo
• Academic background
– Kagawa National College of Technology (2005 ~ 2012)
• B.S. in Engineering (March 2012)
– Nara Institute of Science and Technology (2012 ~ 2014)
• M.S. in Engineering (March 2014)
– SOKENDAI (2014 ~ 2017)
• Ph.D. in Informatics (March 2017)
• Research topics
– Media signal processing
– Audio source separation
Contents
• Research background
– Audio source separation and its applications
– Demonstration
• Structural modeling of audio sources
– Time-frequency representation
– Low-rank modeling of audio spectrogram
– Supervised audio source separation
• Statistical modeling between sources
– Blind audio source separation
– Audio distribution and central limit theorem
– Maximization of independence
• Conclusion and future works
Research background
• Audio source separation
– Signal processing
– Separation of speech, music sounds, background noise, …
– Cocktail party effect by a computer
• Application of audio source separation
– Hearing aid
• Easy to talk in a loud environment
– Speech recognition systems
• Siri, Google search, Cortana, Amazon Echo, …
– Automatic music transcription
• Musical part separation (Vo., Gt., Ba., …)
– Remix of live-recorded music
• Professional use (improving quality), personal use (DJ remixing), …
Demonstration: speech source separation
• Real-time speech source separation (video)
Demonstration: music source separation
• Music source separation
– A mixture of guitar, vocal, and keyboard is separated into the three parts
– Listen carefully for the three parts in the mixture
Contents
• Structural modeling of audio sources (for monaural signals)
• Statistical modeling between sources (for stereo or multichannel signals)
Time-frequency representation of audio signals
• Audio waveform in time domain (speech)
• Time-varying frequency structure
– Short-time Fourier transform (STFT)
[Figure: in the time domain, the waveform is cut into windowed frames (parameters: window, FFT length, shift length). A Fourier transform of each frame gives the spectrogram in the time-frequency domain, a complex-valued matrix (frequency × time). Taking the entry-wise absolute value and power gives the power spectrogram, a nonnegative real-valued matrix.]
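The STFT pipeline described on this slide can be sketched as follows. This is a minimal illustration, not the author's implementation: the function name `power_spectrogram`, the Hann window, the FFT length of 1024, the shift length of 256, and the 1 kHz test tone are all assumptions chosen for the example.

```python
import numpy as np

def power_spectrogram(x, fft_len=1024, shift=256):
    """Power spectrogram of a 1-D signal: rows are frequency bins, columns are time frames."""
    window = np.hanning(fft_len)                    # analysis window (assumed: Hann)
    n_frames = 1 + (len(x) - fft_len) // shift      # frames that fit without padding
    frames = np.stack(
        [x[j * shift : j * shift + fft_len] * window for j in range(n_frames)],
        axis=1,
    )                                               # shape: (fft_len, n_frames)
    spectrogram = np.fft.rfft(frames, axis=0)       # complex-valued matrix
    return np.abs(spectrogram) ** 2                 # entry-wise absolute value and power

fs = 16000                                          # sampling frequency [Hz] (assumed)
t = np.arange(fs) / fs                              # 1 s of samples
x = np.sin(2 * np.pi * 1000 * t)                    # 1 kHz test tone
P = power_spectrogram(x)
print(P.shape, P.min() >= 0)
```

With these settings each frequency bin is 16000 / 1024 = 15.625 Hz wide, so the energy of the test tone concentrates in a single row of the matrix.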
[Figure: power spectrogram of speech]
[Figure: power spectrogram of music]
Structural properties
• Sparse (for both speech and music)
– Strong (yellow) components are few
– Weak (darker) components are dominant
• Continuous contour (only in speech)
– The spectrum changes continuously and dynamically
• Low rank (especially in music)
– Similar patterns (similar timbres) appear many times
[Figure panels: speech, music]
Comparison of low-rankness
• Low-rankness (simplicity of a matrix)
– can be measured by the cumulative singular value (CSV)
– Drums and guitar are quite low-rank
• Vocals and speech are also low-rank to some extent
– A music spectrogram can be modeled by a few patterns
[Figure: CSV curves for drums, guitar, vocals, and speech with a 95% line. Number of bases when the CSV reaches 95%: 7, 29, and around 90. Spectrogram size is 1025×1883.]
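As a hedged sketch of how the "number of bases when CSV reaches 95%" could be computed, the code below normalizes the cumulative sum of the singular values and finds where it first reaches the given ratio. The function name `bases_for_csv` and the toy matrix are assumptions for illustration; the slides do not give the exact definition used for the figure.

```python
import numpy as np

def bases_for_csv(X, ratio=0.95):
    """Smallest number of singular values whose cumulative share reaches `ratio`."""
    s = np.linalg.svd(X, compute_uv=False)          # singular values, descending
    csv = np.cumsum(s) / np.sum(s)                  # cumulative singular value in [0, 1]
    return int(np.searchsorted(csv, ratio) + 1)     # first index with csv >= ratio, 1-based

# Toy matrix with singular values 6, 3, 1 (cumulative shares 0.6, 0.9, 1.0)
X_toy = np.diag([6.0, 3.0, 1.0])
print(bases_for_csv(X_toy, 0.80), bases_for_csv(X_toy, 0.95))
```

Applied to a power spectrogram, a small result indicates that a few spectral patterns explain most of the matrix.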
Modeling technique of low-rank structures
• Nonnegative matrix factorization (NMF) [Lee, 1999]
– is a low-rank approximation using a limited number of bases
• Bases and their coefficients must be nonnegative
– can be applied to a power spectrogram
• Spectral patterns (typical timbres) and their time-varying gains
[Figure: the nonnegative matrix (power spectrogram; # of frequency bins × # of time frames) is approximated by the product of the basis matrix (spectral patterns; # of frequency bins × # of bases) and the activation matrix (time-varying gains; # of bases × # of time frames).]
Modeling technique of low-rank structures
• Parameter optimization in NMF
– Minimize a “similarity measure” between the nonnegative matrix and the product of the basis and activation matrices
– An arbitrary similarity measure can be used
• Squared Euclidean distance, etc.
– A closed-form solution is still an open problem
– Iterative calculation can minimize the measure
• Multiplicative update rules [Lee, 2000] (shown on the slide for the case of squared Euclidean distance)
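The iterative minimization can be sketched with the standard multiplicative update rules for the squared Euclidean distance. The symbols are assumptions, since the slide's notation did not survive extraction: `X` is the nonnegative matrix, `T` the basis matrix, and `V` the activation matrix. The small constant `eps` is added here only to avoid division by zero.

```python
import numpy as np

def nmf_euclid(X, n_bases, n_iter=200, seed=0, eps=1e-12):
    """NMF X ≈ T @ V by multiplicative updates for the squared Euclidean distance."""
    rng = np.random.default_rng(seed)
    n_freq, n_time = X.shape
    T = rng.random((n_freq, n_bases)) + eps         # basis matrix (spectral patterns)
    V = rng.random((n_bases, n_time)) + eps         # activation matrix (time-varying gains)
    cost = []
    for _ in range(n_iter):
        T *= (X @ V.T) / (T @ V @ V.T + eps)        # update bases, stays nonnegative
        V *= (T.T @ X) / (T.T @ T @ V + eps)        # update activations, stays nonnegative
        cost.append(float(np.sum((X - T @ V) ** 2)))
    return T, V, cost

# Toy nonnegative matrix that is exactly rank 2
rng = np.random.default_rng(1)
X = rng.random((20, 2)) @ rng.random((2, 30))
T, V, cost = nmf_euclid(X, n_bases=2)
print(cost[0], cost[-1])
```

Because every update multiplies by a nonnegative ratio, nonnegativity is kept without any explicit projection, which is the main appeal of this scheme.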
Modeling technique of low-rank structures
• Example
– The spectrogram of a mixture of Pf. and Cl. is a superposition of rank-1 spectrograms
– Pf. and Cl. are separated!
– Source separation based on NMF
• is a clustering problem of the obtained spectral bases in the basis matrix
– But how?
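Supervised audio source separation with NMF
• If source-wise training data is available,
– Supervised NMF [Smaragdis, 2007], [Kitamura1, 2014] can be used
• Training stage
– A spectral dictionary of Pf. is learned from the training data
• Separation stage
– The spectral dictionary of Pf. is given (fixed), and other bases are added for the remaining sources
– Only the gains of the dictionary, the other bases, and the gains of the other bases are optimized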
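The separation stage can be sketched as below, assuming the squared Euclidean distance and multiplicative updates. The symbols are hypothetical because the slide's notation was lost: `F` is the fixed pre-trained dictionary (assumed to come from an NMF of the training sound), `G` its gains, `H` the other bases, and `U` their gains. The toy data are synthetic and only illustrate the mechanics.

```python
import numpy as np

def supervised_nmf(X, F, n_other, n_iter=200, seed=0, eps=1e-12):
    """Fit X ≈ F @ G + H @ U with F fixed; only G, H, and U are updated."""
    rng = np.random.default_rng(seed)
    n_freq, n_time = X.shape
    G = rng.random((F.shape[1], n_time)) + eps      # gains of the trained dictionary
    H = rng.random((n_freq, n_other)) + eps         # other bases
    U = rng.random((n_other, n_time)) + eps         # gains of the other bases
    for _ in range(n_iter):
        G *= (F.T @ X) / (F.T @ (F @ G + H @ U) + eps)
        H *= (X @ U.T) / ((F @ G + H @ U) @ U.T + eps)
        U *= (H.T @ X) / (H.T @ (F @ G + H @ U) + eps)
    return G, H, U

# Toy mixture: the target uses the low bins, the other source uses the high bins
rng = np.random.default_rng(1)
F = np.vstack([rng.random((4, 2)), np.zeros((4, 2))])    # hypothetical trained dictionary
target = F @ rng.random((2, 50))
other = np.vstack([np.zeros((4, 2)), rng.random((4, 2))]) @ rng.random((2, 50))
X = target + other
G, H, U = supervised_nmf(X, F, n_other=2)
target_est = F @ G                                       # estimate of the target spectrogram
rel_err = np.sum((X - target_est - H @ U) ** 2) / np.sum(X ** 2)
print(rel_err)
```

Note that nothing in this plain formulation prevents the free bases `H` from also explaining part of the target, so the quality of `target_est` depends on how well the dictionary matches the target.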
Supervised audio source separation with NMF
• Demonstration
– Stereo music separation with supervised NMF [Kitamura, 2015]
[Audio demo: original song; training sound of Pf. and separated sound (Pf.); training sound of Ba. and separated sound (Ba.)]
Problem of supervised approach
• Performance will be limited
– when the difference in timbre between the training data and the target source in the mixture becomes large
[Figure: the target Pf. in the mixture sound is slightly different from the Pf. in the training data. A spectrum plot (amplitude [dB] vs. frequency [kHz], 0 to 3 kHz) compares a real sound with an artificial sound produced by MIDI to show the difference of timbres. Example: a mixture (actual Pf. & Tb.) and the signal separated by supervised NMF using an artificial Pf. as training data.]
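Adaptive supervised audio source separation
• Supervised NMF with basis deformation [Kitamura, 2013]
– employs a deformation term to adaptively deform the pre-trained bases
[Figure: training stage and separation stage. In the separation stage the pre-trained bases are given, and a deformation term (with positive and negative values) absorbs the slight difference between the training data and the target.]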
Adaptive supervised audio source separation
• Constraint in deformation term
– The range of deformation is restricted
– To avoid excess deformation of the pre-trained bases
[Figure: the allowed deformation range over frequency, ±30% in the illustrated case. The mixture (actual Pf. & Tb.) is separated by supervised NMF and by supervised NMF with basis deformation; the training data is the same (artificial Pf. sound).]
Adaptive supervised audio source separation
• Demonstration
– Separate actual instrumental sounds using artificial training data produced by a MIDI synthesizer
[Audio demo (Copyright © 2014 Yamaha Corp. All rights reserved.): original song (actual instruments); training sound of Sax. (produced by MIDI), separated sound (Sax.), and residual sound; training sound of Ba. (produced by MIDI), separated sound (Ba.), and residual sound]
Multichannel recording using microphone array
• Number of microphones and sources
– Overdetermined situation (# of sources ≤ # of mics.)
– Underdetermined situation (# of sources > # of mics.)
• A priori information
– Training data of the source, position of sources, room geometry, music scores, etc.
– Blind source separation (BSS): without any a priori info.
[Figure: the sources pass through the mixing system and are observed by the microphone array; a demixing system gives the estimated signals. Examples: a CD gives a stereo signal (2-ch: L-ch and R-ch), and one mic. gives a monaural signal (1-ch).]
BSS and independent component analysis
• Blind source separation (BSS)
– Estimate demixing system without any prior information
about the mixing system
• Typical BSS is based on statistical independence
• Independent component analysis (ICA) [Comon, 1994]
– How to measure statistical independence?
– Define a “distribution of audio signals”
– Find the demixing system that maximizes independence
What is the distribution of audio signals?
• Distribution of speech waveform
– Spikier and heavier-tailed than a Gaussian (normal) distribution
• Distribution of piano waveform
– Spikier and heavier-tailed than a Gaussian distribution
• Distribution of drums waveform
– Spikier and heavier-tailed than a Gaussian distribution
[Figures: for each source, the waveform (amplitude vs. time samples) and its amplitude histogram (amount of components vs. amplitude). Reference curves shown on the slides: Gaussian, Laplace, and Cauchy distributions.]
Central limit theorem
• Audio source distributions are basically non-Gaussian
– But we still don’t know the source distribution
• How to model them for source separation?
• Central limit theorem
– “The sum of many independent random variables approaches a Gaussian distribution.”*
• Can’t believe it? Let’s see
[Figure: r.v.s are generated from a Laplace distribution and a uniform distribution; a Gaussian distribution is shown for comparison.]
* Some r.v.s do not obey this, e.g., Cauchy r.v.s, whose variance is not finite.
Central limit theorem
• One variable is the pips of the first die, and another is the pips of the second die
– Each takes a value from 1 to 6
– The probability of each value is always 1/6
• Results of 1 million trials for each die
– What about the sum of the two?
– The sum is not a uniform distribution any more
– The distribution of the sum approaches a Gaussian distribution (central limit theorem)
[Figures: histograms (amount vs. pips) for each die and for the sum]
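The dice experiment can be reproduced with a short simulation. This is a sketch under stated assumptions: the choice of 1, 2, and 10 dice and the use of excess kurtosis as a measure of distance from the Gaussian shape (it is 0 for a Gaussian) are illustrative choices, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials = 1_000_000                                # 1 million trials, as on the slide

def dice_sum(n_dice):
    """Sum of the pips of `n_dice` fair dice, for every trial."""
    rolls = rng.integers(1, 7, size=(n_trials, n_dice), dtype=np.int8)
    return rolls.sum(axis=1)                        # accumulated in a wider integer type

def excess_kurtosis(v):
    """Fourth standardized moment minus 3 (zero for a Gaussian distribution)."""
    z = (v - v.mean()) / v.std()
    return float(np.mean(z ** 4) - 3.0)

kurt = {n: excess_kurtosis(dice_sum(n)) for n in (1, 2, 10)}
for n, k in kurt.items():
    print(f"{n:2d} dice: excess kurtosis = {k:+.3f}")
```

The excess kurtosis of a sum of independent, identically distributed variables shrinks in proportion to the number of terms, so the printed values move toward zero as dice are added.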
Central limit theorem in audio signals
• Each signal is the speech signal of one speaker, around 3.3 s long
– The signals of several speakers are added together
– The amplitude distribution of the sum is almost a Gaussian dist. (central limit theorem)
[Figures: waveforms (amplitude vs. time samples) and amplitude histograms (amount vs. amplitude)]
Principle of ICA
• What we can say from the central limit theorem
– A Gaussian distribution is the limit of a mixture of sources
– If we maximize the non-Gaussianity of all signals, the signals will be the original sources before they were mixed
• Basic principle of ICA
– Maximizing non-Gaussianity
– More generally, maximizing independence between components
– Mixing approaches Gaussian (central limit theorem); ICA departs from Gaussian
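The claim that mixing pushes signals toward a Gaussian distribution can be checked numerically. In this sketch the Laplace sources, the mixing matrix `A`, and excess kurtosis as the non-Gaussianity measure are all assumptions made for illustration.

```python
import numpy as np

def excess_kurtosis(v):
    """Fourth standardized moment minus 3 (zero for a Gaussian distribution)."""
    z = (v - v.mean()) / v.std()
    return float(np.mean(z ** 4) - 3.0)

rng = np.random.default_rng(0)
s = rng.laplace(size=(2, 200_000))                  # two independent non-Gaussian sources
A = np.array([[1.0, 0.8],
              [0.6, 1.0]])                          # hypothetical mixing matrix
x = A @ s                                           # observed mixtures

k_sources = [excess_kurtosis(row) for row in s]
k_mixtures = [excess_kurtosis(row) for row in x]
print("sources :", k_sources)
print("mixtures:", k_mixtures)
```

Each mixture has a smaller excess kurtosis than either source, which is why searching for the demixing matrix that makes the outputs as non-Gaussian as possible leads back toward the sources.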
Principle of ICA
• Assumption in ICA
– 1. Sources are mutually independent
– 2. Each source distribution is non-Gaussian
– 3. Mixing system is invertible and time-invariant
[Figure: the sources (latent components; 1. mutually independent, 2. non-Gaussian) are multiplied by the mixing matrix (3. invertible and time-invariant) to give the mixtures (observed signals); the inverse matrix recovers the sources.]
Principle of ICA
• Uncertainty in ICA
– 1. Signal scale (volume) cannot be determined
– 2. Signal permutation cannot be determined
[Figure: two examples in which ICA maps the mixtures (observed signals) of the sources (latent components) to separated signals (estimated by ICA) whose scale or order differs from that of the sources.]
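Both uncertainties follow directly from the mixing model: rescaled and reordered sources with a compensating mixing matrix give exactly the same observations. The sketch below shows this numerically; the names `A`, `P`, and `D` and their values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
s = rng.laplace(size=(2, 1000))                     # sources (latent components)
A = np.array([[1.0, 0.8],
              [0.6, 1.0]])                          # hypothetical mixing matrix
x = A @ s                                           # mixtures (observed signals)

P = np.array([[0.0, 1.0],
              [1.0, 0.0]])                          # permutation: swap the two sources
D = np.diag([2.0, -0.5])                            # arbitrary scales (including a sign flip)

s_alt = D @ P @ s                                   # reordered and rescaled "sources"
A_alt = A @ np.linalg.inv(D @ P)                    # mixing matrix that compensates
x_alt = A_alt @ s_alt

print(np.allclose(x, x_alt))
```

Since `s_alt` is as independent and non-Gaussian as `s`, no independence criterion can prefer one explanation of the mixtures over the other.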
Principle of ICA
• Estimation in ICA
– Maximize independence between source distributions
– Log-likelihood function (equation shown on the slide)
– Generally, the non-Gaussian source distribution in the log-likelihood is set to an appropriate non-Gaussian distribution
[Figure: the distance to the non-Gaussian source distribution is minimized.]
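One common way to carry out this estimation is a natural-gradient ascent of the log-likelihood. The sketch below is not necessarily the algorithm used in the lecture. It assumes a super-Gaussian source model with density proportional to 1/cosh(y), whose score function is tanh(y). The function name, step size, iteration count, and toy data are also assumptions.

```python
import numpy as np

def ica_natural_gradient(x, n_iter=500, step=0.1):
    """Estimate a demixing matrix W for x by natural-gradient maximum likelihood."""
    n_ch, n_samples = x.shape
    W = np.eye(n_ch)                                # initial demixing matrix
    for _ in range(n_iter):
        y = W @ x                                   # current separated signals
        phi = np.tanh(y)                            # score of the assumed source model
        W += step * (np.eye(n_ch) - (phi @ y.T) / n_samples) @ W
    return W

rng = np.random.default_rng(0)
s = rng.laplace(size=(2, 20000))                    # independent non-Gaussian sources
A = np.array([[1.0, 0.8],
              [0.6, 1.0]])                          # hypothetical mixing matrix
x = A @ s
W = ica_natural_gradient(x)

G = np.abs(W @ A)                                   # overall system: mixing then demixing
dominance = G.max(axis=1) / G.sum(axis=1)           # close to 1 when each output has one source
print(dominance)
```

If separation succeeds, `W @ A` is close to a scaled permutation matrix, which is consistent with the scale and permutation uncertainties of ICA.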
ICA-based separation of reverberant mixture
• Audio mixture in actual environment
– Convolutive mixture with reverberation
• E.g., the reverberation of an office room lasts about 300 ms, and that of a concert hall more than 2000 ms
– The mixing coefficient becomes a mixing filter
• How to deconvolve them?
– 1. Estimate the deconvolution filter
• At 16 kHz sampling, a 300 ms filter has 4800 taps
• The # of parameters to be estimated explodes
– 2. Estimate the demixing coefficients in the frequency domain
• A frequency-wise demixing matrix should be estimated by ICA
• This encounters the permutation problem
[Figure: instantaneous mixture vs. convolutive mixture; the reverberation length is the length of the convolution filter.]
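The tap count and the reason for moving to the frequency domain can be checked with a short sketch. The noise source and the exponentially decaying random filter are toy assumptions, not a measured room response. The code uses a single full-length FFT, for which the convolution theorem holds exactly, whereas frequency-domain ICA applies the same idea approximately within each STFT frame.

```python
import numpy as np

fs = 16000                                          # sampling frequency [Hz]
reverb_ms = 300                                     # reverberation length [ms]
n_taps = fs * reverb_ms // 1000                     # filter length in samples
print(n_taps)

rng = np.random.default_rng(0)
s = rng.standard_normal(fs)                         # hypothetical 1 s source signal
h = rng.standard_normal(n_taps) * np.exp(-np.arange(n_taps) / 1000.0)  # toy mixing filter

# Convolutive mixing in the time domain
x_time = np.convolve(s, h)

# Same result in the frequency domain: convolution becomes a per-frequency product
n = len(s) + len(h) - 1                             # zero-pad to the full output length
x_freq = np.fft.irfft(np.fft.rfft(s, n) * np.fft.rfft(h, n), n)

print(np.allclose(x_time, x_freq))
```

In the frequency domain each frequency needs only one complex coefficient per source-microphone pair, which is what allows a simple instantaneous ICA per frequency bin.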
ICA-based separation of reverberant mixture
• Frequency-domain ICA (FDICA) [Smaragdis, 1998]
– Apply simple ICA to each frequency bin
[Figure: the spectrogram (frequency bin × time frame) is processed by a separate ICA in each frequency bin (ICA1, ICA2, ICA3, …). Each ICA estimates a frequency-wise demixing matrix, which is the inverse matrix of the frequency-wise mixing matrix.]
ICA-based separation of reverberant mixture
• Permutation problem in frequency-domain ICA
– The order of the separated signals in each frequency is messed up*
– The separated signals have to be aligned across frequencies
*Scales are also messed up, but they can be easily fixed.
[Figure: source 1 and source 2 are observed as mixture 1 and mixture 2. ICA is applied in all frequencies, and a permutation solver aligns the outputs into separated signal 1 and separated signal 2.]
ICA-based separation of reverberant mixture
• Popular permutation solvers
– Based on direction of arrival (DOA)
• Frequency-domain ICA + DOA alignment [Saruwatari, 2006]
– Based on a relative correlation among frequencies
• Independent vector analysis (IVA) [Hiroe, 2006], [Kim, 2006]
– Based on a low-rank modeling of each source
• Independent low-rank matrix analysis (ILRMA) [Kitamura, 2016]
• Demonstration of BSS using ILRMA
– http://d-kitamura.net/en/demo_rank1_en.htm
Conclusions and future works
• Audio source separation based on
– Low-rank property
• Nonnegative matrix factorization
– Statistical independence
• Blind source separation
• For further improvement
– Separation based on training with a huge dataset
• Deep learning, denoising autoencoder, etc.
• Each recording condition occurs just once
– Informed source separation
• Music scores could be powerful information
• The user can guide the system, which leads to more accurate separation
• Performance is still insufficient
– Almost there? Not at all! Make our life better. That’s engineering.
Más contenido relacionado

La actualidad más candente

Multi Carrier Modulation OFDM & FBMC
Multi Carrier Modulation OFDM & FBMCMulti Carrier Modulation OFDM & FBMC
Multi Carrier Modulation OFDM & FBMCVetrivel Chelian
 
Ofdm performance analysis
Ofdm performance analysisOfdm performance analysis
Ofdm performance analysisSaroj Dhakal
 
Single side band and double side band modulation
Single side band and double side band modulationSingle side band and double side band modulation
Single side band and double side band modulationMd. Hasan Al Roktim
 
Deep Generative Models
Deep Generative ModelsDeep Generative Models
Deep Generative ModelsMijung Kim
 
Phydyas 09 fFilter Bank Multicarrier (FBMC): An Integrated Solution to Spectr...
Phydyas 09 fFilter Bank Multicarrier (FBMC): An Integrated Solution to Spectr...Phydyas 09 fFilter Bank Multicarrier (FBMC): An Integrated Solution to Spectr...
Phydyas 09 fFilter Bank Multicarrier (FBMC): An Integrated Solution to Spectr...Marwan Hammouda
 
Contention based MAC protocols
Contention based  MAC protocolsContention based  MAC protocols
Contention based MAC protocolsDarwin Nesakumar
 
Design of FIR Filters
Design of FIR FiltersDesign of FIR Filters
Design of FIR FiltersAranya Sarkar
 
Node level simulators
Node level simulatorsNode level simulators
Node level simulatorsSyedAhamed44
 
Brief Introduction to Boltzmann Machine
Brief Introduction to Boltzmann MachineBrief Introduction to Boltzmann Machine
Brief Introduction to Boltzmann MachineArunabha Saha
 
MIMO Channel Capacity
MIMO Channel CapacityMIMO Channel Capacity
MIMO Channel CapacityPei-Che Chang
 
MIMO Antenna and Technology installation
MIMO Antenna and Technology installationMIMO Antenna and Technology installation
MIMO Antenna and Technology installationDILSHAD AHMAD
 
Applications of digital signal processing
Applications of digital signal processing Applications of digital signal processing
Applications of digital signal processing Rajeev Piyare
 

La actualidad más candente (20)

Mimo
MimoMimo
Mimo
 
Multiple access techniques for wireless communications
Multiple access techniques for wireless communicationsMultiple access techniques for wireless communications
Multiple access techniques for wireless communications
 
MIMO OFDM
MIMO OFDMMIMO OFDM
MIMO OFDM
 
MIMO Calculation
MIMO Calculation MIMO Calculation
MIMO Calculation
 
Multi Carrier Modulation OFDM & FBMC
Multi Carrier Modulation OFDM & FBMCMulti Carrier Modulation OFDM & FBMC
Multi Carrier Modulation OFDM & FBMC
 
Ofdm performance analysis
Ofdm performance analysisOfdm performance analysis
Ofdm performance analysis
 
Single side band and double side band modulation
Single side band and double side band modulationSingle side band and double side band modulation
Single side band and double side band modulation
 
Deep Generative Models
Deep Generative ModelsDeep Generative Models
Deep Generative Models
 
Linear Predictive Coding
Linear Predictive CodingLinear Predictive Coding
Linear Predictive Coding
 
Phydyas 09 fFilter Bank Multicarrier (FBMC): An Integrated Solution to Spectr...
Phydyas 09 fFilter Bank Multicarrier (FBMC): An Integrated Solution to Spectr...Phydyas 09 fFilter Bank Multicarrier (FBMC): An Integrated Solution to Spectr...
Phydyas 09 fFilter Bank Multicarrier (FBMC): An Integrated Solution to Spectr...
 
Vblast
VblastVblast
Vblast
 
Mimo [new]
Mimo [new]Mimo [new]
Mimo [new]
 
Contention based MAC protocols
Contention based  MAC protocolsContention based  MAC protocols
Contention based MAC protocols
 
Assignment Of 5G Antenna Design Technique
Assignment Of 5G Antenna Design TechniqueAssignment Of 5G Antenna Design Technique
Assignment Of 5G Antenna Design Technique
 
Design of FIR Filters
Design of FIR FiltersDesign of FIR Filters
Design of FIR Filters
 
Node level simulators
Node level simulatorsNode level simulators
Node level simulators
 
Brief Introduction to Boltzmann Machine
Brief Introduction to Boltzmann MachineBrief Introduction to Boltzmann Machine
Brief Introduction to Boltzmann Machine
 
MIMO Channel Capacity
MIMO Channel CapacityMIMO Channel Capacity
MIMO Channel Capacity
 
MIMO Antenna and Technology installation
MIMO Antenna and Technology installationMIMO Antenna and Technology installation
MIMO Antenna and Technology installation
 
Applications of digital signal processing
Applications of digital signal processing Applications of digital signal processing
Applications of digital signal processing
 

Destacado

Experimental analysis of optimal window length for independent low-rank matri...
Experimental analysis of optimal window length for independent low-rank matri...Experimental analysis of optimal window length for independent low-rank matri...
Experimental analysis of optimal window length for independent low-rank matri...Daichi Kitamura
 
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法Daichi Kitamura
 
Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...Daichi Kitamura
 
Relaxation of rank-1 spatial constraint in overdetermined blind source separa...
Relaxation of rank-1 spatial constraint in overdetermined blind source separa...Relaxation of rank-1 spatial constraint in overdetermined blind source separa...
Relaxation of rank-1 spatial constraint in overdetermined blind source separa...Daichi Kitamura
 
音源分離における音響モデリング(Acoustic modeling in audio source separation)
音源分離における音響モデリング(Acoustic modeling in audio source separation)音源分離における音響モデリング(Acoustic modeling in audio source separation)
音源分離における音響モデリング(Acoustic modeling in audio source separation)Daichi Kitamura
 
擬似ハムバッキングピックアップの弦振動応答 (in Japanese)
擬似ハムバッキングピックアップの弦振動応答 (in Japanese)擬似ハムバッキングピックアップの弦振動応答 (in Japanese)
擬似ハムバッキングピックアップの弦振動応答 (in Japanese)Daichi Kitamura
 
Efficient initialization for nonnegative matrix factorization based on nonneg...
Efficient initialization for nonnegative matrix factorization based on nonneg...Efficient initialization for nonnegative matrix factorization based on nonneg...
Efficient initialization for nonnegative matrix factorization based on nonneg...Daichi Kitamura
 
Music signal separation using supervised nonnegative matrix factorization wit...
Music signal separation using supervised nonnegative matrix factorization wit...Music signal separation using supervised nonnegative matrix factorization wit...
Music signal separation using supervised nonnegative matrix factorization wit...Daichi Kitamura
 
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...Daichi Kitamura
 
Study on optimal divergence for superresolution-based supervised nonnegative ...
Study on optimal divergence for superresolution-based supervised nonnegative ...Study on optimal divergence for superresolution-based supervised nonnegative ...
Study on optimal divergence for superresolution-based supervised nonnegative ...Daichi Kitamura
 
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...Daichi Kitamura
 
非負値行列因子分解に基づくブラインド及び教師あり音楽音源分離の効果的最適化法
非負値行列因子分解に基づくブラインド及び教師あり音楽音源分離の効果的最適化法非負値行列因子分解に基づくブラインド及び教師あり音楽音源分離の効果的最適化法
非負値行列因子分解に基づくブラインド及び教師あり音楽音源分離の効果的最適化法Daichi Kitamura
 
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)Daichi Kitamura
 
非負値行列分解の確率的生成モデルと 多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...
非負値行列分解の確率的生成モデルと多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...非負値行列分解の確率的生成モデルと多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...
非負値行列分解の確率的生成モデルと 多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...Daichi Kitamura
 
統計的独立性と低ランク行列分解理論に基づく ブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...
統計的独立性と低ランク行列分解理論に基づくブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...統計的独立性と低ランク行列分解理論に基づくブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...
統計的独立性と低ランク行列分解理論に基づく ブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...Daichi Kitamura
 
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...Daichi Kitamura
 
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...Daichi Kitamura
 
ICASSP2017読み会(関東編)・AASP_L3(北村担当分)
ICASSP2017読み会(関東編)・AASP_L3(北村担当分)ICASSP2017読み会(関東編)・AASP_L3(北村担当分)
ICASSP2017読み会(関東編)・AASP_L3(北村担当分)Daichi Kitamura
 

Destacado (18)

Experimental analysis of optimal window length for independent low-rank matri...
Experimental analysis of optimal window length for independent low-rank matri...Experimental analysis of optimal window length for independent low-rank matri...
Experimental analysis of optimal window length for independent low-rank matri...
 
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法
 
Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...
 
Relaxation of rank-1 spatial constraint in overdetermined blind source separa...
Relaxation of rank-1 spatial constraint in overdetermined blind source separa...Relaxation of rank-1 spatial constraint in overdetermined blind source separa...
Relaxation of rank-1 spatial constraint in overdetermined blind source separa...
 
音源分離における音響モデリング(Acoustic modeling in audio source separation)
音源分離における音響モデリング(Acoustic modeling in audio source separation)音源分離における音響モデリング(Acoustic modeling in audio source separation)
音源分離における音響モデリング(Acoustic modeling in audio source separation)
 
擬似ハムバッキングピックアップの弦振動応答 (in Japanese)
擬似ハムバッキングピックアップの弦振動応答 (in Japanese)擬似ハムバッキングピックアップの弦振動応答 (in Japanese)
擬似ハムバッキングピックアップの弦振動応答 (in Japanese)
 
Efficient initialization for nonnegative matrix factorization based on nonneg...
Efficient initialization for nonnegative matrix factorization based on nonneg...Efficient initialization for nonnegative matrix factorization based on nonneg...
Efficient initialization for nonnegative matrix factorization based on nonneg...
 
Music signal separation using supervised nonnegative matrix factorization wit...
Music signal separation using supervised nonnegative matrix factorization wit...Music signal separation using supervised nonnegative matrix factorization wit...
Music signal separation using supervised nonnegative matrix factorization wit...
 
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...
 
Study on optimal divergence for superresolution-based supervised nonnegative ...
Study on optimal divergence for superresolution-based supervised nonnegative ...Study on optimal divergence for superresolution-based supervised nonnegative ...
Study on optimal divergence for superresolution-based supervised nonnegative ...
 
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...
 
非負値行列因子分解に基づくブラインド及び教師あり音楽音源分離の効果的最適化法
非負値行列因子分解に基づくブラインド及び教師あり音楽音源分離の効果的最適化法非負値行列因子分解に基づくブラインド及び教師あり音楽音源分離の効果的最適化法
非負値行列因子分解に基づくブラインド及び教師あり音楽音源分離の効果的最適化法
 
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
 
非負値行列分解の確率的生成モデルと 多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...
非負値行列分解の確率的生成モデルと多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...非負値行列分解の確率的生成モデルと多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...
非負値行列分解の確率的生成モデルと 多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...
 
統計的独立性と低ランク行列分解理論に基づく ブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...
統計的独立性と低ランク行列分解理論に基づくブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...統計的独立性と低ランク行列分解理論に基づくブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...
統計的独立性と低ランク行列分解理論に基づく ブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...
 
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...
 
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...
 
ICASSP2017読み会(関東編)・AASP_L3(北村担当分)
ICASSP2017読み会(関東編)・AASP_L3(北村担当分)ICASSP2017読み会(関東編)・AASP_L3(北村担当分)
ICASSP2017読み会(関東編)・AASP_L3(北村担当分)
 

Audio Source Separation Based on Low-Rank Structure and Statistical Independence

  • 1. Audio Source Separation Based on Low-Rank Structure and Statistical Independence
      Daichi Kitamura, Research Associate, The University of Tokyo
      Lecture at Nagoya University, May 30, 2017
  • 2. Introduction
      • Daichi Kitamura (北村大地)
      • Research Associate at The University of Tokyo
      • Academic background
          – Kagawa National College of Technology (2005 ~ 2012): B.S. in Engineering (March 2012)
          – Nara Institute of Science and Technology (2012 ~ 2014): M.S. in Engineering (March 2014)
          – SOKENDAI (2014 ~ 2017): Ph.D. in Informatics (March 2017)
      • Research topics
          – Media signal processing
          – Audio source separation
  • 3. Contents
      • Research background
          – Audio source separation and its applications
          – Demonstration
      • Structural modeling of audio sources
          – Time-frequency representation
          – Low-rank modeling of audio spectrogram
          – Supervised audio source separation
      • Statistical modeling between sources
          – Blind audio source separation
          – Audio distribution and central limit theorem
          – Maximization of independence
      • Conclusion and future work
  • 5. Research background
      • Audio source separation
          – A signal processing technique
          – Separates speech, musical sounds, background noise, …
          – Aims to reproduce the cocktail party effect on a computer
  • 7. Research background
      • Applications of audio source separation
          – Hearing aids: make conversation easier in a loud environment
          – Speech recognition systems: Siri, Google search, Cortana, Amazon Echo, …
          – Automatic music transcription: separation of musical parts (Vo., Gt., Ba., …)
          – Remixing of live-recorded music: professional use (improving quality), personal use (DJ remixing), …
      • [Figure labels: CD, Separate, Automatic transcription]
  • 8. Demonstration: speech source separation
      • Real-time speech source separation (video)
  • 9. Demonstration: music source separation
      • Music source separation: a mixture of guitar, vocal, and keyboard is separated into the three parts
      • Listen for the three parts in the mixture.
  • 10. Contents
      • Research background
          – Audio source separation and its applications
          – Demonstration
      • Structural modeling of audio sources (for monaural signals)
          – Time-frequency representation
          – Low-rank modeling of audio spectrogram
          – Supervised audio source separation
      • Statistical modeling between sources (for stereo or multichannel signals)
          – Blind audio source separation
          – Audio distribution and central limit theorem
          – Maximization of independence
      • Conclusion and future work
  • 12. Time-frequency representation of audio signals
      • Audio waveform in the time domain (speech)
  • 13. Time-frequency representation of audio signals
      • Time-varying frequency structure
          – Short-time Fourier transform (STFT)
      • Time domain: the waveform is cut into windowed segments (parameters: window, FFT length, shift length), and each segment is Fourier transformed
      • Time-frequency domain: the result is a spectrogram, a complex-valued matrix (frequency × time)
      • Power spectrogram: a nonnegative real-valued matrix obtained by taking the entry-wise absolute value and power
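
The STFT processing on slide 13 can be sketched in a few lines of NumPy. This is a minimal illustration, not the lecture's implementation: the function name `power_spectrogram`, the Hann window, and the default FFT and shift lengths are all assumptions made for the example.

```python
import numpy as np

def power_spectrogram(signal, fft_length=1024, shift_length=256):
    """Return a power spectrogram (frequency bins x time frames) of a 1-D signal.

    The Hann window and the default lengths are assumptions of this sketch.
    """
    window = np.hanning(fft_length)
    n_frames = 1 + (len(signal) - fft_length) // shift_length
    # Cut the waveform into overlapping windowed segments (one per column).
    frames = np.stack(
        [signal[i * shift_length : i * shift_length + fft_length] * window
         for i in range(n_frames)],
        axis=1,
    )
    # Fourier transform of every segment: complex-valued spectrogram.
    spectrogram = np.fft.rfft(frames, axis=0)
    # Entry-wise absolute value and power: nonnegative real-valued matrix.
    return np.abs(spectrogram) ** 2

# Usage: one second of a 1 kHz sine sampled at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1000 * t)
P = power_spectrogram(x)
```

With these settings the frequency resolution is 16000 / 1024 = 15.625 Hz, so the energy of the 1 kHz tone concentrates in frequency bin 64.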
  • 14. Power spectrogram of speech
  • 16. Structural properties
      • Sparse (for both speech and music)
          – Strong (yellow) components are few
          – Weak (darker) components are dominant
      • Continuous contour (only in speech)
          – The spectrum changes continuously and dynamically
      • Low rank (especially in music)
          – Similar patterns (similar timbres) recur many times
      • [Figure panels: Speech, Music]
  • 17. Comparison of low-rankness
      • [Figure panels: Drums, Guitar, Vocals, Speech]
  • 18. Comparison of low-rankness
      • Low-rankness (simplicity of a matrix)
          – can be measured by the cumulative singular value (CSV)
          – Drums and guitar are quite low-rank
              • Vocals and speech are also low-rank to some extent
          – A music spectrogram can be modeled by a few patterns
      • [Figure: number of bases when the CSV reaches the 95% line; plotted values 7, 29, and around 90; spectrogram size is 1025 × 1883]
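
The measurement on slide 18 can be reproduced roughly as follows. The slide does not give a formula, so this sketch assumes that the CSV is the running sum of singular values normalized by their total; the function name `bases_for_csv` and the synthetic test matrices are likewise hypothetical.

```python
import numpy as np

def bases_for_csv(matrix, threshold=0.95):
    """Number of singular values needed for the normalized cumulative sum to reach `threshold`.

    Assumes CSV = cumulative sum of singular values / sum of all singular values.
    """
    singular_values = np.linalg.svd(matrix, compute_uv=False)  # sorted in descending order
    csv = np.cumsum(singular_values) / np.sum(singular_values)
    # Index of the first entry with csv >= threshold, converted to a count.
    return int(np.searchsorted(csv, threshold) + 1)

# Usage: an exactly rank-3 nonnegative matrix versus a full-rank random matrix.
rng = np.random.default_rng(0)
low_rank = rng.random((100, 3)) @ rng.random((3, 200))
full_rank = rng.standard_normal((100, 200))
n_low = bases_for_csv(low_rank)
n_full = bases_for_csv(full_rank)
```

The low-rank matrix reaches the 95% line with at most three singular values, while the full-rank matrix needs many more, which mirrors the contrast the slide draws between drums or guitar and speech.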
  • 19. Modeling technique of low-rank structures
      • Nonnegative matrix factorization (NMF) [Lee, 1999]
          – is a low-rank approximation using a limited number of bases
              • Bases and their coefficients must be nonnegative
          – can be applied to a power spectrogram
              • Spectral patterns (typical timbres) and their time-varying gains
      • [Figure: a nonnegative matrix (power spectrogram, frequency × time) is approximated by the product of a basis matrix (spectral patterns) and an activation matrix (time-varying gains); the dimensions are the number of frequency bins, the number of time frames, and the number of bases]
  • 20. Modeling technique of low-rank structures
      • Parameter optimization in NMF
          – Minimize a "similarity measure" between the observed matrix and its low-rank approximation
          – An arbitrary similarity measure can be used
              • Squared Euclidean distance, etc.
          – A closed-form solution is still an open problem
          – Iterative calculation can minimize the measure
              • Multiplicative update rules [Lee, 2000] (shown for the case of the squared Euclidean distance)
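
The multiplicative update rules for the squared Euclidean distance can be sketched as below. The names `X` (nonnegative matrix), `W` (basis matrix), and `H` (activation matrix) are notation chosen for this example rather than the slides' own symbols, and the small constant `eps` is an added safeguard against division by zero.

```python
import numpy as np

def nmf_euclidean(X, n_bases, n_iter=200, eps=1e-12, seed=0):
    """Approximate X (freq x time) by W @ H using multiplicative updates.

    W: basis matrix (freq x bases), H: activation matrix (bases x time).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = X.shape
    W = rng.random((n_freq, n_bases)) + eps
    H = rng.random((n_bases, n_time)) + eps
    cost = []
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so W and H stay nonnegative.
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        cost.append(np.sum((X - W @ H) ** 2))  # squared Euclidean distance
    return W, H, cost

# Usage: factorize a synthetic nonnegative matrix of rank 4.
rng = np.random.default_rng(1)
X = rng.random((30, 4)) @ rng.random((4, 50))
W, H, cost = nmf_euclidean(X, n_bases=4)
```

Because the updates only rescale the current entries, nonnegativity never has to be enforced explicitly; that is the main practical appeal of this family of rules.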
  • 21. Modeling technique of low-rank structures
      • Example: a mixture of Pf. and Cl. is represented as a superposition of rank-1 spectrograms
  • 22. Modeling technique of low-rank structures
      • Example
          – Pf. and Cl. are separated!
          – Source separation based on NMF
              • is a clustering problem of the obtained spectral bases in the basis matrix
          – But how?
  • 23. Supervised audio source separation with NMF
      • If source-wise training data is available: supervised NMF [Smaragdis, 2007], [Kitamura1, 2014]
          – Training stage: a spectral dictionary of Pf. is learned from the training data
          – Separation stage: the dictionary is given (fixed); only its activations and the other bases with their activations are optimized
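
The two stages on slide 23 can be sketched with plain squared-Euclidean multiplicative updates. This is only an illustration of the idea of a fixed, pre-trained dictionary, not the cited methods themselves: the names `F` (dictionary), `G` (its activations), `H` and `U` (other bases and their activations), and both function names are assumptions of this example.

```python
import numpy as np

EPS = 1e-12  # safeguard against division by zero (assumption of this sketch)

def train_dictionary(Y_train, n_bases, n_iter=200, seed=0):
    """Training stage: learn a spectral dictionary F from source-wise training data."""
    rng = np.random.default_rng(seed)
    F = rng.random((Y_train.shape[0], n_bases)) + EPS
    G = rng.random((n_bases, Y_train.shape[1])) + EPS
    for _ in range(n_iter):
        F *= (Y_train @ G.T) / (F @ G @ G.T + EPS)
        G *= (F.T @ Y_train) / (F.T @ F @ G + EPS)
    return F

def supervised_nmf(Y, F, n_other, n_iter=200, seed=1):
    """Separation stage: model Y by F @ G + H @ U with F fixed; update G, H, U only."""
    rng = np.random.default_rng(seed)
    G = rng.random((F.shape[1], Y.shape[1])) + EPS
    H = rng.random((Y.shape[0], n_other)) + EPS
    U = rng.random((n_other, Y.shape[1])) + EPS
    for _ in range(n_iter):
        G *= (F.T @ Y) / (F.T @ (F @ G + H @ U) + EPS)
        H *= (Y @ U.T) / ((F @ G + H @ U) @ U.T + EPS)
        U *= (H.T @ Y) / (H.T @ (F @ G + H @ U) + EPS)
    return F @ G, H @ U  # estimated target part, estimated remainder

# Usage with synthetic spectrogram-like data.
rng = np.random.default_rng(42)
F_true = rng.random((40, 2))
H_true = rng.random((40, 2))
Y_train = F_true @ rng.random((2, 60))
Y_mix = F_true @ rng.random((2, 80)) + H_true @ rng.random((2, 80))
F = train_dictionary(Y_train, n_bases=2)
target_est, other_est = supervised_nmf(Y_mix, F, n_other=2)
rel_err = np.linalg.norm(Y_mix - target_est - other_est) / np.linalg.norm(Y_mix)
```

Keeping `F` out of the update loop is what makes the method supervised: the bases tied to the target source are known in advance, so no clustering of bases is needed after the factorization.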
  • 24. Supervised audio source separation with NMF
      • Demonstration
          – Stereo music separation with supervised NMF [Kitamura, 2015]
      • [Audio: original song; training sound of Pf. and separated sound (Pf.); training sound of Ba. and separated sound (Ba.)]
  • 25. Problem of supervised approach
      • Performance is limited
          – when the timbre of the training data differs greatly from that of the target source in the mixture
      • [Figure: the Pf. in the training data is slightly different from the target Pf. in the mixture; spectra of a real sound and an artificial sound produced by MIDI, amplitude [dB] versus frequency [kHz], show the difference in timbre]
      • [Figure: mixture (actual Pf. & Tb.) and the signal separated by supervised NMF using artificial Pf. as training data]
  • 26. Adaptive supervised audio source separation
      • Supervised NMF with basis deformation [Kitamura, 2013]
          – employs a deformation term (with positive and negative entries) to adaptively deform the pre-trained bases
      • [Figure: training stage and separation stage; the pre-trained bases are given, and the target in the mixture is slightly different]
  • 27. Adaptive supervised audio source separation
      • Constraint on the deformation term
          – The range of deformation is restricted (±30% in the illustrated case)
          – This avoids excessive deformation of the pre-trained bases
      • [Figure: mixture (actual Pf. & Tb.), the signal separated by supervised NMF, and the signal separated by supervised NMF with basis deformation; the training data is the same (artificial Pf. sound)]
  • 28. Adaptive supervised audio source separation
      • Demonstration
          – Separate actual instrumental sounds using artificial training data produced by a MIDI synthesizer
      • [Audio: original song (actual instruments); training sound of Sax. (produced by MIDI), separated sound (Sax.), and residual sound; training sound of Ba. (produced by MIDI), separated sound (Ba.), and residual sound]
      • Copyright © 2014 Yamaha Corp. All rights reserved.
  • 29. Contents
      • Research background
          – Audio source separation and its applications
          – Demonstration
      • Structural modeling of audio sources (for monaural signals)
          – Time-frequency representation
          – Low-rank modeling of audio spectrogram
          – Supervised audio source separation
      • Statistical modeling between sources (for stereo or multichannel signals)
          – Blind audio source separation
          – Audio distribution and central limit theorem
          – Maximization of independence
      • Conclusion and future work
  • 30. Multichannel recording using microphone array • Number of microphones and sources – Overdetermined situation (# of sources ≤ # of mics.) – Underdetermined situation (# of sources > # of mics.) • A priori information – Training data of the source, position of sources, room geometry, music scores, etc. – Blind source separation (BSS): without any a priori info. [Figure: sources pass through the mixing system to the microphone array (observed signals), and the demixing system produces the estimated signals; a CD gives a stereo signal (2-ch: L-ch and R-ch), one mic. gives a monaural signal (1-ch)]
  • 31. BSS and independent component analysis • Blind source separation (BSS) – Estimate the demixing system without any prior information about the mixing system • Typical BSS is based on statistical independence • Independent component analysis (ICA) [Comon, 1994] – How to measure statistical independence? – Define a “distribution of audio signals” – Find the demixing system that maximizes independence [Figure: mixing system and demixing system]
  • 32. What is the distribution of audio signals? • Distribution of speech waveform – Spikier and heavier-tailed than a Gaussian (normal) distribution [Figure: waveform (amplitude vs. time samples) and amplitude histogram (amount of components vs. amplitude), shown with a Gaussian distribution]
  • 33. What is the distribution of audio signals? • Distribution of piano waveform – Spikier and heavier-tailed than a Gaussian distribution [Figure: waveform (amplitude vs. time samples) and amplitude histogram (amount of components vs. amplitude), shown with a Laplace distribution]
  • 34. What is the distribution of audio signals? • Distribution of drums waveform – Spikier and heavier-tailed than a Gaussian distribution [Figure: waveform (amplitude vs. time samples) and amplitude histogram (amount of components vs. amplitude), shown with a Cauchy distribution]
  • 35. Central limit theorem • Audio source distribution is basically non-Gaussian – But we still don’t know the source distribution • How to model it for source separation? • Central limit theorem – “A sum of any kind of random variables always approaches a Gaussian distribution.”* • Hard to believe? Let’s see [Figure: r.v.s generated from a Laplace distribution and a uniform distribution, and the resulting Gaussian distribution] * Several r.v.s do not obey it, e.g., Cauchy r.v.s
  • 36. Central limit theorem • Two variables: the pips of the first die and the pips of the second die – Probability of each face is always 1/6 • Results of 1 million trials for each die – What about their sum? [Figure: histograms (amount) for each die]
  • 37. Central limit theorem • Two variables: the pips of the first die and the pips of the second die – Probability of each face is always 1/6 • Results of 1 million trials for each die – What about their sum? – Not a uniform distribution any more [Figure: histogram (amount)]
  • 38. Central limit theorem • Two variables: the pips of the first die and the pips of the second die – Probability of each face is always 1/6 • Results of 1 million trials for each die [Figure: histograms (amount)]
  • 39. Central limit theorem • Two variables: the pips of the first die and the pips of the second die – Probability of each face is always 1/6 • Results of 1 million trials for each die – Approaches a Gaussian distribution (central limit theorem)
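The dice experiment can also be checked without random trials. The following is a minimal sketch, not the procedure used for the slides: it computes the exact distribution of the sum of n fair dice by repeated convolution and reports the excess kurtosis, which is 0 for a Gaussian. The function names `dice_sum_pmf` and `excess_kurtosis` are hypothetical helpers introduced here for illustration.

```python
from fractions import Fraction

def dice_sum_pmf(n_dice):
    """Exact PMF of the sum of n fair six-sided dice (repeated convolution)."""
    pmf = {0: Fraction(1)}
    for _ in range(n_dice):
        nxt = {}
        for total, p in pmf.items():
            for face in range(1, 7):
                key = total + face
                nxt[key] = nxt.get(key, Fraction(0)) + p * Fraction(1, 6)
        pmf = nxt
    return pmf

def excess_kurtosis(pmf):
    """Excess kurtosis of a discrete distribution (0 for a Gaussian)."""
    mean = sum(v * p for v, p in pmf.items())
    var = sum((v - mean) ** 2 * p for v, p in pmf.items())
    m4 = sum((v - mean) ** 4 * p for v, p in pmf.items())
    return float(m4 / var ** 2) - 3.0

if __name__ == "__main__":
    # One die is uniform; the sum of two is already triangular, not uniform.
    for n in (1, 2, 5, 10):
        print(n, "dice: excess kurtosis =", excess_kurtosis(dice_sum_pmf(n)))
```

The excess kurtosis moves toward 0 as more dice are summed, which is one way to quantify "approaching a Gaussian distribution".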
  • 40. Central limit theorem in audio signals • Each signal is one speaker's signal – around 3.3 s [Figure: waveforms (amplitude vs. time samples) and amplitude histograms (amount vs. amplitude)]
  • 41. Central limit theorem in audio signals • Each signal is one speaker's signal – around 3.3 s [Figure: waveform (amplitude vs. time samples) and amplitude histogram (amount vs. amplitude)]
  • 42. Central limit theorem in audio signals • Each signal is one speaker's signal – around 3.3 s [Figure: waveforms (amplitude vs. time samples) and amplitude histograms (amount vs. amplitude)]
  • 43. Central limit theorem in audio signals • Each signal is one speaker's signal – around 3.3 s [Figure: waveform (amplitude vs. time samples) and amplitude histogram (amount vs. amplitude)]
  • 44. Central limit theorem in audio signals • Each signal is one speaker's signal – around 3.3 s – Almost a Gaussian dist. (central limit theorem) [Figure: waveform (amplitude vs. time samples) and amplitude histogram (amount vs. amplitude)]
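The same effect can be reproduced with synthetic signals. The sketch below is an illustration under a loud assumption: each "speaker" is replaced by independent Laplace-distributed noise rather than real speech, and `mixture_kurtosis` is a hypothetical helper. It sums several such signals and measures how far the mixture is from Gaussian by its excess kurtosis.

```python
import random

def laplace_sample(rng):
    # The difference of two independent Exp(1) variables is Laplace(0, 1).
    return rng.expovariate(1.0) - rng.expovariate(1.0)

def excess_kurtosis(xs):
    """Sample excess kurtosis (0 for a Gaussian, 3 for a Laplace)."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / var ** 2 - 3.0

def mixture_kurtosis(n_sources, n_samples=20000, seed=0):
    """Excess kurtosis of a sum of n_sources synthetic Laplace 'speakers'."""
    rng = random.Random(seed)
    mix = [sum(laplace_sample(rng) for _ in range(n_sources))
           for _ in range(n_samples)]
    return excess_kurtosis(mix)

if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, "sources: excess kurtosis =", mixture_kurtosis(n))
```

With more sources in the sum, the excess kurtosis shrinks toward 0, matching the histogram that becomes almost Gaussian.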
  • 45. Principle of ICA • What we can say from the central limit theorem – The Gaussian distribution is the limit that a mixture of sources approaches – If we maximize the non-Gaussianity of all signals, the signals will be the original sources before they were mixed • Basic principle of ICA: maximizing non-Gaussianity • More generally: maximizing independence between components [Figure: mixing approaches a Gaussian (central limit theorem); ICA departs from a Gaussian]
  • 46. Principle of ICA • Assumptions in ICA – 1. Sources are mutually independent – 2. Each source distribution is non-Gaussian – 3. Mixing system is invertible and time-invariant [Figure: sources (latent components) are multiplied by the mixing matrix to give the mixtures (observed signals); the inverse matrix recovers the sources]
  • 47. Principle of ICA • Uncertainty in ICA – 1. Signal scale (volume) cannot be determined – 2. Signal permutation cannot be determined [Figure: two examples in which sources (latent components) are mixed into mixtures (observed signals) and ICA outputs separated signals (estimated by ICA)]
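A toy two-channel example may make the principle and the two ambiguities concrete. This is a minimal sketch, not the estimator used in the lecture: it whitens the mixtures and then searches for the rotation that maximizes the sum of squared excess kurtosis (a simple non-Gaussianity contrast swapped in here for the likelihood-based one). The sources, the mixing coefficients, and all function names are assumptions made for illustration.

```python
import math
import random

def center(xs):
    m = sum(xs) / len(xs)
    return [x - m for x in xs]

def whiten(x1, x2):
    """Gram-Schmidt whitening: zero-mean, unit-variance, uncorrelated outputs."""
    x1, x2 = center(x1), center(x2)
    n = len(x1)
    s1 = math.sqrt(sum(v * v for v in x1) / n)
    z1 = [v / s1 for v in x1]
    c = sum(a * b for a, b in zip(z1, x2)) / n
    r = [b - c * a for a, b in zip(z1, x2)]
    s2 = math.sqrt(sum(v * v for v in r) / n)
    return z1, [v / s2 for v in r]

def kurt(ys):
    # Excess kurtosis of a zero-mean, unit-variance signal.
    return sum(y ** 4 for y in ys) / len(ys) - 3.0

def rotate(z1, z2, theta):
    c, s = math.cos(theta), math.sin(theta)
    y1 = [c * a + s * b for a, b in zip(z1, z2)]
    y2 = [-s * a + c * b for a, b in zip(z1, z2)]
    return y1, y2

def ica_2x2(x1, x2, n_angles=90):
    """Grid search over rotations of the whitened data for maximal non-Gaussianity."""
    z1, z2 = whiten(x1, x2)
    angles = (k * math.pi / 2 / n_angles for k in range(n_angles))
    best = max(angles,
               key=lambda t: sum(kurt(y) ** 2 for y in rotate(z1, z2, t)))
    return rotate(z1, z2, best)

def abs_corr(a, b):
    a, b = center(a), center(b)
    num = sum(p * q for p, q in zip(a, b))
    return abs(num / math.sqrt(sum(p * p for p in a) * sum(q * q for q in b)))

rng = random.Random(0)
N = 4000
s1 = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in range(N)]  # Laplace
s2 = [rng.uniform(-1.0, 1.0) for _ in range(N)]                       # uniform
x1 = [1.0 * a + 0.6 * b for a, b in zip(s1, s2)]  # assumed mixing matrix
x2 = [0.5 * a + 1.0 * b for a, b in zip(s1, s2)]
y1, y2 = ica_2x2(x1, x2)
```

Each output correlates strongly with one source, but its variance is fixed to 1 and its sign and order are arbitrary, which is why the comparison uses absolute correlations and must try both orderings.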
  • 48. Principle of ICA • Estimation in ICA – Maximize independence between source distributions – Log-likelihood function – Minimize distance • The source model is a non-Gaussian source distribution – Generally, it is set to an appropriate non-Gaussian distribution
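The equations on this slide did not survive extraction. As a reference point only, the standard textbook form of the ICA log-likelihood is sketched below; the symbols ($\mathbf{W}$ for the demixing matrix, $\mathbf{x}(t)$ for the observed mixtures, $y_n(t)$ for the separated signals, $p(\cdot)$ for the assumed non-Gaussian source distribution, $N$ sources, $T$ samples) are assumptions and may differ from the notation used in the lecture.

```latex
\begin{align}
  \mathbf{y}(t) &= \mathbf{W}\,\mathbf{x}(t), \\
  \mathcal{L}(\mathbf{W}) &= \sum_{t=1}^{T} \sum_{n=1}^{N} \log p\bigl(y_n(t)\bigr)
      + T \log \bigl|\det \mathbf{W}\bigr|, \\
  \frac{1}{T}\,\mathbb{E}\bigl[\mathcal{L}(\mathbf{W})\bigr]
      &= -\,D_{\mathrm{KL}}\Bigl(p_{\mathbf{y}}(\mathbf{y}) \,\Big\|\, \prod_{n=1}^{N} p(y_n)\Bigr) - H(\mathbf{x}).
\end{align}
```

Since the entropy $H(\mathbf{x})$ of the observation does not depend on $\mathbf{W}$, maximizing the likelihood amounts to minimizing the Kullback-Leibler divergence between the joint distribution of the separated signals and the product of the assumed source distributions, which is one reading of "minimize distance".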
  • 49. ICA-based separation of reverberant mixture • Audio mixture in an actual environment – Convolutive mixture with reverberation • E.g., the reverberation length (length of the convolution filter) is 300 ms in an office room and more than 2000 ms in a concert hall – Mixing coefficient becomes a mixing filter • How to deconvolve them? – 1. Estimate the deconvolution filter • At 16 kHz sampling, a 300 ms filter has 4800 taps • The number of parameters to be estimated explodes – 2. Estimate the demixing coefficients in the frequency domain • A frequency-wise demixing matrix is estimated by ICA • This encounters the permutation problem [Figure: simultaneous (instantaneous) mixture vs. convolutive mixture]
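The tap count quoted above is simply the sampling rate times the filter length. The short sketch below reproduces that arithmetic; the parameter-count function assumes one FIR filter per source-microphone pair, which is an illustrative assumption and not a figure from the slides.

```python
def filter_taps(sample_rate_hz, reverb_ms):
    """Number of filter taps needed to cover reverb_ms at sample_rate_hz."""
    return sample_rate_hz * reverb_ms // 1000

def time_domain_parameters(n_sources, n_mics, sample_rate_hz, reverb_ms):
    """Parameter count, assuming one FIR filter per source-microphone pair."""
    return n_sources * n_mics * filter_taps(sample_rate_hz, reverb_ms)

if __name__ == "__main__":
    print(filter_taps(16000, 300))    # office room example from the slide
    print(filter_taps(16000, 2000))   # concert hall example from the slide
    print(time_domain_parameters(2, 2, 16000, 300))
```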
  • 50. ICA-based separation of reverberant mixture • Frequency-domain ICA (FDICA) [Smaragdis, 1998] – Apply simple ICA to each frequency bin [Figure: spectrogram (frequency bin vs. time frame) with a separate ICA (ICA1, ICA2, ICA3, …) for each frequency bin; the frequency-wise demixing matrix is the inverse matrix of the frequency-wise mixing matrix]
  • 51. ICA-based separation of reverberant mixture • Permutation problem in frequency-domain ICA – The order of the separated signals differs from one frequency to another* – The separated components must be aligned across frequencies [Figure: Source 1 and Source 2 are mixed into Mixture 1 and Mixture 2; ICA in all frequencies followed by a permutation solver gives Separated signal 1 and Separated signal 2] * Scales are also inconsistent, but they can be easily fixed.
  • 52. ICA-based separation of reverberant mixture • Popular permutation solvers – Based on direction of arrival (DOA) • Frequency-domain ICA + DOA alignment [Saruwatari, 2006] – Based on relative correlation among frequencies • Independent vector analysis (IVA) [Hiroe, 2006], [Kim, 2006] – Based on low-rank modeling of each source • Independent low-rank matrix analysis (ILRMA) [Kitamura, 2016] • Demonstration of BSS using ILRMA – http://d-kitamura.net/en/demo_rank1_en.htm
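To illustrate why correlation among frequencies helps with alignment, here is a toy sketch. It is a simple envelope-correlation heuristic on synthetic data, not the DOA-based method, IVA, or ILRMA listed above. It assumes that each source has a power envelope shared across frequency bins and that the two outputs of each bin arrive in random order; `align_permutations` and the data generator are hypothetical.

```python
import random

def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def align_permutations(bins):
    """bins[f] holds two power envelopes in arbitrary order.
    Returns, per bin, True if the pair had to be swapped to match bin 0."""
    ref = [list(bins[0][0]), list(bins[0][1])]  # running sum of aligned envelopes
    swapped = [False]
    for f in range(1, len(bins)):
        a, b = bins[f]
        keep_score = corr(ref[0], a) + corr(ref[1], b)
        swap_score = corr(ref[0], b) + corr(ref[1], a)
        do_swap = swap_score > keep_score
        if do_swap:
            a, b = b, a
        ref[0] = [r + x for r, x in zip(ref[0], a)]
        ref[1] = [r + x for r, x in zip(ref[1], b)]
        swapped.append(do_swap)
    return swapped

# Synthetic data: two source envelopes shared by all bins, random order per bin.
rng = random.Random(0)
T, F = 200, 16
act = [[rng.random() ** 2 for _ in range(T)] for _ in range(2)]
truth, bins = [], []
for f in range(F):
    envs = []
    for k in range(2):
        gain = rng.uniform(0.5, 1.5)  # per-bin gain of source k
        envs.append([gain * a + 0.05 * rng.random() for a in act[k]])
    flipped = rng.random() < 0.5
    truth.append(flipped)
    bins.append(envs[::-1] if flipped else envs)
swapped = align_permutations(bins)
```

On real signals the envelopes of a source differ much more between frequencies than in this toy setting, which is part of the motivation for the more robust solvers listed on the slide.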
  • 53. Contents • Research background – Audio source separation and its applications – Demonstration • Structural modeling of audio sources – Time-frequency representation – Low-rank modeling of audio spectrogram – Supervised audio source separation • Statistical modeling between sources – Blind audio source separation – Audio distribution and central limit theorem – Maximization of independence • Conclusion and future works
  • 54. Conclusions and future works • Audio source separation based on – Low-rank property • Nonnegative matrix factorization – Statistical independence • Blind source separation • For further improvement – Separation based on training with a huge dataset • Deep learning, denoising autoencoder, etc. • Each recording condition occurs just once – Informed source separation • Music scores could be powerful information • The user can guide the system, which leads to more accurate separation • Performance is still insufficient – Almost there? Not at all! Make our life better. That’s engineering.