SlideShare una empresa de Scribd logo
1 de 26
Descargar para leer sin conexión
.
.
. ..
.
.
Introduction to Blind Source Separation and Non-negative
Matrix Factorization
Tatsuya Yokota
Tokyo Institute of Technology
November 29, 2011
November 29, 2011 1/26
Outline
.
..1 Blind Source Separation
.
..2 Non-negative Matrix Factrization
.
..3 Experiments
.
..4 Summary
November 29, 2011 2/26
What’s a Blind Source Separation
Blind Source Separation is a method to estimate original signals from observed
signals which consist of mixed original signals and noise.
November 29, 2011 3/26
Example of BSS
BSS is often used for Speech analysis and Image analysis.
November 29, 2011 4/26
Example of BSS (cont’d)
BSS is also very important for brain signal analysis.
November 29, 2011 5/26
Model Formalization
The problem of BSS is formalized as follow:
The matrix
X ∈ Rm×d
(1)
denotes original signals, where m is number of original signals, and d is dimension
of one signal.
We consider that the observed signals Y ∈ Rn×d
are given by linear mixing system
as
Y = AX + E, (2)
where A ∈ Rn×m
is the unknown mixing matrix and E ∈ Rn×d
denotes a noise.
Basically, n ≥ m.
The goal of BSS is to estimate ˆA and ˆX so that ˆX provides unknown original
signal as possible.
November 29, 2011 6/26
Kinds of BSS Methods
Actually, degree of freedom of BSS model is very high to estimate A and X.
Bcause there are a huge number of combinations (A, X) which satisfy
Y = AX + E.
Therefore, we need some constraint to solve the BSS problem such as:
PCA : orthogonal constraint
SCA : sparsity constraint
NMF : non-negativity constraint
ICA : independency constraint
In this way, there are many methods to solve the BSS problem depending on the
constraints. What we use is depend on subject matter.
I will introduce about Non-negative Matrix Factrization(NMF) based
technique for BSS.
November 29, 2011 7/26
Non-negativity
Non-negativity means that value is 0 or positive (i.e. at least 0). Non-negative
matrix is a matrix which its all elements are at least 0. For example
1 10
−10 1
/∈ R2×2
+ ,
1 10
0 1
∈ R2×2
+ . (3)
November 29, 2011 8/26
Why nonnegativity constraints?
This is the key-point of this method.
We can say that because many real-world data are nonnegative and the
corresponding hidden components have a physical meaning only when
nonnegative.
Example of nonnegative data,
image data (luminance)
power spectol of signal (speech, EEG etc.)
In general, observed data can be considered as a linear combination of such
nonnegative data with noise.(i.e. Y = AX + E)
November 29, 2011 9/26
Standard NMF Model
Consider the standard NMF model, given by:
Y = AX + E, (4)
A ≥ 0, X ≥ 0. (5)
For simply, we denote A ≥ 0 as nonnegativity of matrix A.
The NMF problem is given by:
.
NMF problem
..
.
. ..
.
.
minimize
1
2
||Y − AX||2
F (6)
subject to A ≥ 0, X ≥ 0 (7)
I will introduce the alternating least square(ALS) algorithm to solve this
problem.
November 29, 2011 10/26
ALS algorithm
Our goal is to estimate ( ˆA, ˆX) = minA,X
1
2 ||Y − AX||2
F , s.t. A ≥ 0, X ≥ 0.
However, it is very difficult to solve such a multivariate problem.
The ALS strategy is to solve following sub-problems alternately.
.
sub-problem for A
..
.
. ..
.
.
min
1
2
||Y − AX||2
F (8)
s.t. A ≥ 0 (9)
.
sub-problem for X
..
.
. ..
.
.
min
1
2
||Y − AX||2
F (10)
s.t. X ≥ 0 (11)
November 29, 2011 11/26
How to Solve the sub-problem
Solving procedure of sub-problem for A is only two steps as follow:
.
sub-procedure for A
..
.
. ..
.
.
...1 A ← argminA
1
2 ||Y − AX||2
F
...2 A ← [A]+
where [a]+ = max(ε, a) rectify to enforce nonnegativity.
In a similar way, we can solve sub-problem for X as follow:
.
sub-prodecure for X
..
.
. ..
.
.
...1 X ← argminX
1
2 ||Y − AX||2
F
...2 X ← [X]+
November 29, 2011 12/26
Solution of Least Squares
Here, I show basic calculation of least squares. First, objective function can be
transformed as
1
2
||Y − AX||2
F (12)
=
1
2
tr(Y − AX)(Y − AX)T
(13)
=
1
2
tr(Y Y T
− 2AXY T
+ AXXT
AT
) (14)
The stationary points ˆA can be found by equating the gradient components to 0:
∂
∂AT
1
2
tr(Y Y T
− 2AXY T
+ AXXT
AT
) (15)
= −XY T
+ XXT
AT
= 0 (16)
Therefore,
ˆAT
= (XXT
)−1
XY T
. (17)
November 29, 2011 13/26
Solution of Least Squares (cont’d)
In similar way, we can also obtain a solution of X. The objective function can be
transformed as
1
2
||Y − AX||2
F (18)
=
1
2
tr(Y − AX)T
(Y − AX) (19)
=
1
2
tr(Y T
Y − 2XT
AT
Y + XT
AT
AX) (20)
The stationary points ˆX can be found by equating the gradient components to 0:
∂
∂X
1
2
tr(Y T
Y − 2XT
AT
Y + XT
AT
AX) (21)
= −AT
Y + AT
AX = 0 (22)
Therefore,
ˆX = (AT
A)−1
AT
Y. (23)
November 29, 2011 14/26
Implimentation
Implimentation in “octave” is very simple:
i=0;
while 1
i++;
printf("%d: %f n",i,sqrt(sum(sum(E.^2))));
if sqrt(sum(sum(E.^2))) < er || i > maxiter
break;
end
At=(X*X’+ 1e-5*eye(J))(X*Y’);A=At’;
A(A < 0)=1e-16;
X=(A’*A + 1e-5*eye(J))(A’*Y);
X(X < 0)=1e-16;
E=Y-A*X;
end
November 29, 2011 15/26
Experiments: Artificial Data 1
-0.5
0
0.5
1
1.5
0 50 100 150 200
(a) sin2(t)
-0.5
0
0.5
1
1.5
0 50 100 150 200
(b) cos2(t)
Figure: Original Signals
-0.5
0
0.5
1
1.5
0 50 100 150 200
(a) mixed (observed) signal 1
-0.5
0
0.5
1
1.5
0 50 100 150 200
(b) mixed (observed) signal 2
Figure: Observed Signals
-0.5
0
0.5
1
1.5
0 50 100 150 200
(a) estimated signal 1
-0.5
0
0.5
1
1.5
0 50 100 150 200
(b) estimated signal 2
Figure: Estimated Signals
We can see that results are almost separated.
November 29, 2011 16/26
Experiments: Artificial Data 2
-0.5
0
0.5
1
1.5
0 50 100 150 200
(a) sin2(t)
-0.5
0
0.5
1
1.5
0 50 100 150 200
(b) sin2(2t)
Figure: Original Signals
-0.5
0
0.5
1
1.5
0 50 100 150 200
(a) mixed (observed) signal 1
-0.5
0
0.5
1
1.5
0 50 100 150 200
(b) mixed (observed) signal 2
Figure: Observed Signals
-0.5
0
0.5
1
1.5
0 50 100 150 200
(a) estimated signal 1
-0.5
0
0.5
1
1.5
0 50 100 150 200
(b) estimated signal 2
Figure: Estimated Signals
We can see that results are not so separated.
November 29, 2011 17/26
Experiments: Real Image 1
(a) newyork
(b) shanghai
Figure: Original Signals
(a) mixed (observed) signal 1
(b) mixed (observed) signal 2
Figure: Observed Signals
(a) estimated signal 1
(b) estimated signal 2
Figure: Estimated Signals
We can see that results are almost separated.
November 29, 2011 18/26
Experiments: Real Image 2
(a) rock
(b) pig
Figure: Original Signals
(a) mixed (observed) signal 1
(b) mixed (observed) signal 2
Figure: Observed Signals
(a) estimated signal 1
(b) estimated signal 2
Figure: Estimated Signals
We can see that estimate 2 is not so separated.
November 29, 2011 19/26
Experiments: Real Image 3
(a) nyc (b) sha
(c) rock (d) pig
(e) obs1 (f) obs2
(g) obs3 (h) obs4
Figure: Ori. & Obs.
(a) estimated signal 1 (b) estimated signal 2
(c) estimated signal 3 (d) estimated signal 4
Figure: Estimated Signals: Results are not good.
Signals are still mixed.
November 29, 2011 20/26
Issues
There are some issues of this method as follow:
Solution is non-unique (i.e. result is depend on starting point)
It is not enough only by non-negativity constraint (shoud consider
independency of each components)
When number of estimate signals is large, features of components are
overlapped.
And we should consider more general issues of BSS as follow:
Number of original signals is unknown (How can we decide its number?)
November 29, 2011 21/26
About this research area
In this research area, many method for BSS are studied and proposed as follow:
KL-divergence based NMF [Honkela et al., 2011]
Alpha-Beta divergences based NMF [Cichocki et al., 2011]
Non-Gaussianity based ICA [Hyv¨arinen et al., 2001]
Kurtosis based ICA
Negentropy based ICA
Solving method for ICA
gradient method
fast fixed-point algorithm [Hyv¨arinen and Oja, 1997]
MLE based ICA
Mutual information based ICA
Non-linear ICA
Tensor ICA
In this way, this research area is very broad!!
November 29, 2011 22/26
Summary
I introduced about BSS problem and basic NMF tequnique from
[Cichocki et al., 2009].
EEG classification is very difficult problem.
.
Future Work
..
.
. ..
.
.
To study about extension of NMF and Independent Component Analysis
(ICA).
To apply the BSS problem, EEG analysis, and some pattern recognition
problem.
November 29, 2011 23/26
Bibliography I
[Cichocki et al., 2011] Cichocki, A., Cruces, S., and Amari, S.-i. (2011).
Generalized alpha-beta divergences and their application to robust nonnegative
matrix factorization.
Entropy, 13(1):134–170.
[Cichocki et al., 2009] Cichocki, A., Zdunek, R., Phan, A. H., and Amari, S.
(2009).
Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory
Multi-way Data Analysis.
Wiley.
[Honkela et al., 2011] Honkela, T., Duch, W., Girolami, M., and Kaski, S.,
editors (2011).
Kullback-Leibler Divergence for Nonnegative Matrix Factorization, volume
6791 of Lecture Notes in Computer Science. Springer Berlin / Heidelberg.
[Hyv¨arinen et al., 2001] Hyv¨arinen, A., Karhunen, J., and Oja, E. (2001).
Independent Component Analysis.
Wiley.
November 29, 2011 24/26
Bibliography II
[Hyv¨arinen and Oja, 1997] Hyv¨arinen, A. and Oja, E. (1997).
A fast fixed-point algorithm for independent component analysis.
Neural Computation, 9:1483–1492.
November 29, 2011 25/26
Thank you for listening
November 29, 2011 26/26

Más contenido relacionado

La actualidad más candente

La actualidad más candente (20)

boosting algorithm
boosting algorithmboosting algorithm
boosting algorithm
 
A brief introduction to mutual information and its application
A brief introduction to mutual information and its applicationA brief introduction to mutual information and its application
A brief introduction to mutual information and its application
 
Ensemble learning
Ensemble learningEnsemble learning
Ensemble learning
 
Variational Autoencoder
Variational AutoencoderVariational Autoencoder
Variational Autoencoder
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
 
Lecture 6: Ensemble Methods
Lecture 6: Ensemble Methods Lecture 6: Ensemble Methods
Lecture 6: Ensemble Methods
 
Representation learning on graphs
Representation learning on graphsRepresentation learning on graphs
Representation learning on graphs
 
Generative Adversarial Networks and Their Medical Imaging Applications
Generative Adversarial Networks and Their Medical Imaging ApplicationsGenerative Adversarial Networks and Their Medical Imaging Applications
Generative Adversarial Networks and Their Medical Imaging Applications
 
Handling concept drift in data stream mining
Handling concept drift in data stream miningHandling concept drift in data stream mining
Handling concept drift in data stream mining
 
Attention mechanism 소개 자료
Attention mechanism 소개 자료Attention mechanism 소개 자료
Attention mechanism 소개 자료
 
Deep Learning Frameworks slides
Deep Learning Frameworks slides Deep Learning Frameworks slides
Deep Learning Frameworks slides
 
Autoencoders in Deep Learning
Autoencoders in Deep LearningAutoencoders in Deep Learning
Autoencoders in Deep Learning
 
(Machine Learning) Ensemble learning
(Machine Learning) Ensemble learning (Machine Learning) Ensemble learning
(Machine Learning) Ensemble learning
 
Independent Component Analysis
Independent Component AnalysisIndependent Component Analysis
Independent Component Analysis
 
Deep Learning for Video: Action Recognition (UPC 2018)
Deep Learning for Video: Action Recognition (UPC 2018)Deep Learning for Video: Action Recognition (UPC 2018)
Deep Learning for Video: Action Recognition (UPC 2018)
 
Ensemble methods
Ensemble methods Ensemble methods
Ensemble methods
 
Ensemble learning
Ensemble learningEnsemble learning
Ensemble learning
 
Clustering
ClusteringClustering
Clustering
 
Unsupervised learning clustering
Unsupervised learning clusteringUnsupervised learning clustering
Unsupervised learning clustering
 
Outlier detection method introduction
Outlier detection method introductionOutlier detection method introduction
Outlier detection method introduction
 

Destacado

Factorization Machines with libFM
Factorization Machines with libFMFactorization Machines with libFM
Factorization Machines with libFM
Liangjie Hong
 
Matrix factorization
Matrix factorizationMatrix factorization
Matrix factorization
rubyyc
 
Lecture 6 lu factorization & determinants - section 2-5 2-7 3-1 and 3-2
Lecture 6   lu factorization & determinants - section 2-5 2-7 3-1 and 3-2Lecture 6   lu factorization & determinants - section 2-5 2-7 3-1 and 3-2
Lecture 6 lu factorization & determinants - section 2-5 2-7 3-1 and 3-2
njit-ronbrown
 
Recommender system introduction
Recommender system   introductionRecommender system   introduction
Recommender system introduction
Liang Xiang
 
Building a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineBuilding a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engine
NYC Predictive Analytics
 
Recommender system algorithm and architecture
Recommender system algorithm and architectureRecommender system algorithm and architecture
Recommender system algorithm and architecture
Liang Xiang
 

Destacado (16)

Matrix Factorization Technique for Recommender Systems
Matrix Factorization Technique for Recommender SystemsMatrix Factorization Technique for Recommender Systems
Matrix Factorization Technique for Recommender Systems
 
Factorization Machines with libFM
Factorization Machines with libFMFactorization Machines with libFM
Factorization Machines with libFM
 
Matrix factorization
Matrix factorizationMatrix factorization
Matrix factorization
 
Collaborative Filtering with Spark
Collaborative Filtering with SparkCollaborative Filtering with Spark
Collaborative Filtering with Spark
 
Neighbor methods vs matrix factorization - case studies of real-life recommen...
Neighbor methods vs matrix factorization - case studies of real-life recommen...Neighbor methods vs matrix factorization - case studies of real-life recommen...
Neighbor methods vs matrix factorization - case studies of real-life recommen...
 
Intro to Factorization Machines
Intro to Factorization MachinesIntro to Factorization Machines
Intro to Factorization Machines
 
Lecture 6 lu factorization & determinants - section 2-5 2-7 3-1 and 3-2
Lecture 6   lu factorization & determinants - section 2-5 2-7 3-1 and 3-2Lecture 6   lu factorization & determinants - section 2-5 2-7 3-1 and 3-2
Lecture 6 lu factorization & determinants - section 2-5 2-7 3-1 and 3-2
 
آموزش محاسبات عددی - بخش دوم
آموزش محاسبات عددی - بخش دومآموزش محاسبات عددی - بخش دوم
آموزش محاسبات عددی - بخش دوم
 
Recommender system introduction
Recommender system   introductionRecommender system   introduction
Recommender system introduction
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Introduction to Matrix Factorization Methods Collaborative Filtering
Introduction to Matrix Factorization Methods Collaborative FilteringIntroduction to Matrix Factorization Methods Collaborative Filtering
Introduction to Matrix Factorization Methods Collaborative Filtering
 
Recommender Systems
Recommender SystemsRecommender Systems
Recommender Systems
 
Recommendation system
Recommendation system Recommendation system
Recommendation system
 
Collaborative Filtering Recommendation System
Collaborative Filtering Recommendation SystemCollaborative Filtering Recommendation System
Collaborative Filtering Recommendation System
 
Building a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engineBuilding a Recommendation Engine - An example of a product recommendation engine
Building a Recommendation Engine - An example of a product recommendation engine
 
Recommender system algorithm and architecture
Recommender system algorithm and architectureRecommender system algorithm and architecture
Recommender system algorithm and architecture
 

Similar a Nonnegative Matrix Factorization

2012 mdsp pr10 ica
2012 mdsp pr10 ica2012 mdsp pr10 ica
2012 mdsp pr10 ica
nozomuhamada
 
Dimension Reduction Introduction & PCA.pptx
Dimension Reduction Introduction & PCA.pptxDimension Reduction Introduction & PCA.pptx
Dimension Reduction Introduction & PCA.pptx
RohanBorgalli
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
ijceronline
 

Similar a Nonnegative Matrix Factorization (20)

2012 mdsp pr10 ica
2012 mdsp pr10 ica2012 mdsp pr10 ica
2012 mdsp pr10 ica
 
ICAoverview
ICAoverviewICAoverview
ICAoverview
 
Single Channel Speech De-noising Using Kernel Independent Component Analysis...
	Single Channel Speech De-noising Using Kernel Independent Component Analysis...	Single Channel Speech De-noising Using Kernel Independent Component Analysis...
Single Channel Speech De-noising Using Kernel Independent Component Analysis...
 
Section9 stochastic
Section9 stochasticSection9 stochastic
Section9 stochastic
 
Blind Audio Source Separation (Bass): An Unsuperwised Approach
Blind Audio Source Separation (Bass): An Unsuperwised Approach Blind Audio Source Separation (Bass): An Unsuperwised Approach
Blind Audio Source Separation (Bass): An Unsuperwised Approach
 
Independent Component Analysis
Independent Component Analysis Independent Component Analysis
Independent Component Analysis
 
Principal Component Analysis for Tensor Analysis and EEG classification
Principal Component Analysis for Tensor Analysis and EEG classificationPrincipal Component Analysis for Tensor Analysis and EEG classification
Principal Component Analysis for Tensor Analysis and EEG classification
 
MATHEMATICAL EXPLANATION TO SOLUTION FOR EX-NOR PROBLEM USING MLFFN
MATHEMATICAL EXPLANATION TO SOLUTION FOR EX-NOR PROBLEM USING MLFFNMATHEMATICAL EXPLANATION TO SOLUTION FOR EX-NOR PROBLEM USING MLFFN
MATHEMATICAL EXPLANATION TO SOLUTION FOR EX-NOR PROBLEM USING MLFFN
 
Dimension Reduction Introduction & PCA.pptx
Dimension Reduction Introduction & PCA.pptxDimension Reduction Introduction & PCA.pptx
Dimension Reduction Introduction & PCA.pptx
 
Mixed Spectra for Stable Signals from Discrete Observations
Mixed Spectra for Stable Signals from Discrete ObservationsMixed Spectra for Stable Signals from Discrete Observations
Mixed Spectra for Stable Signals from Discrete Observations
 
Mixed Spectra for Stable Signals from Discrete Observations
Mixed Spectra for Stable Signals from Discrete ObservationsMixed Spectra for Stable Signals from Discrete Observations
Mixed Spectra for Stable Signals from Discrete Observations
 
MIXED SPECTRA FOR STABLE SIGNALS FROM DISCRETE OBSERVATIONS
MIXED SPECTRA FOR STABLE SIGNALS FROM DISCRETE OBSERVATIONSMIXED SPECTRA FOR STABLE SIGNALS FROM DISCRETE OBSERVATIONS
MIXED SPECTRA FOR STABLE SIGNALS FROM DISCRETE OBSERVATIONS
 
Mixed Spectra for Stable Signals from Discrete Observations
Mixed Spectra for Stable Signals from Discrete ObservationsMixed Spectra for Stable Signals from Discrete Observations
Mixed Spectra for Stable Signals from Discrete Observations
 
Mixed Spectra for Stable Signals from Discrete Observations
Mixed Spectra for Stable Signals from Discrete ObservationsMixed Spectra for Stable Signals from Discrete Observations
Mixed Spectra for Stable Signals from Discrete Observations
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
 
13005810.ppt
13005810.ppt13005810.ppt
13005810.ppt
 
40120130406002
4012013040600240120130406002
40120130406002
 
DSP_DiscSignals_LinearS_150417.pptx
DSP_DiscSignals_LinearS_150417.pptxDSP_DiscSignals_LinearS_150417.pptx
DSP_DiscSignals_LinearS_150417.pptx
 
StatPhysPerspectives_AMALEA_Cetraro_AnnaCarbone.pdf
StatPhysPerspectives_AMALEA_Cetraro_AnnaCarbone.pdfStatPhysPerspectives_AMALEA_Cetraro_AnnaCarbone.pdf
StatPhysPerspectives_AMALEA_Cetraro_AnnaCarbone.pdf
 

Más de Tatsuya Yokota

Linked CP Tensor Decomposition (presented by ICONIP2012)
Linked CP Tensor Decomposition (presented by ICONIP2012)Linked CP Tensor Decomposition (presented by ICONIP2012)
Linked CP Tensor Decomposition (presented by ICONIP2012)
Tatsuya Yokota
 
Introduction to Common Spatial Pattern Filters for EEG Motor Imagery Classifi...
Introduction to Common Spatial Pattern Filters for EEG Motor Imagery Classifi...Introduction to Common Spatial Pattern Filters for EEG Motor Imagery Classifi...
Introduction to Common Spatial Pattern Filters for EEG Motor Imagery Classifi...
Tatsuya Yokota
 

Más de Tatsuya Yokota (9)

Tensor representations in signal processing and machine learning (tutorial ta...
Tensor representations in signal processing and machine learning (tutorial ta...Tensor representations in signal processing and machine learning (tutorial ta...
Tensor representations in signal processing and machine learning (tutorial ta...
 
テンソル多重線形ランクの推定法について(Estimation of Multi-linear Tensor Rank)
テンソル多重線形ランクの推定法について(Estimation of Multi-linear Tensor Rank)テンソル多重線形ランクの推定法について(Estimation of Multi-linear Tensor Rank)
テンソル多重線形ランクの推定法について(Estimation of Multi-linear Tensor Rank)
 
自己相似な情報モデリング
自己相似な情報モデリング自己相似な情報モデリング
自己相似な情報モデリング
 
Priorに基づく画像/テンソルの復元
Priorに基づく画像/テンソルの復元Priorに基づく画像/テンソルの復元
Priorに基づく画像/テンソルの復元
 
低ランク性および平滑性を用いたテンソル補完 (Tensor Completion based on Low-rank and Smooth Structu...
低ランク性および平滑性を用いたテンソル補完 (Tensor Completion based on Low-rank and Smooth Structu...低ランク性および平滑性を用いたテンソル補完 (Tensor Completion based on Low-rank and Smooth Structu...
低ランク性および平滑性を用いたテンソル補完 (Tensor Completion based on Low-rank and Smooth Structu...
 
2013 11 01(fast_grbf-nmf)_for_share
2013 11 01(fast_grbf-nmf)_for_share2013 11 01(fast_grbf-nmf)_for_share
2013 11 01(fast_grbf-nmf)_for_share
 
2014 3 13(テンソル分解の基礎)
2014 3 13(テンソル分解の基礎)2014 3 13(テンソル分解の基礎)
2014 3 13(テンソル分解の基礎)
 
Linked CP Tensor Decomposition (presented by ICONIP2012)
Linked CP Tensor Decomposition (presented by ICONIP2012)Linked CP Tensor Decomposition (presented by ICONIP2012)
Linked CP Tensor Decomposition (presented by ICONIP2012)
 
Introduction to Common Spatial Pattern Filters for EEG Motor Imagery Classifi...
Introduction to Common Spatial Pattern Filters for EEG Motor Imagery Classifi...Introduction to Common Spatial Pattern Filters for EEG Motor Imagery Classifi...
Introduction to Common Spatial Pattern Filters for EEG Motor Imagery Classifi...
 

Último

Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
panagenda
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Último (20)

Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 

Nonnegative Matrix Factorization

  • 1. . . . .. . . Introduction to Blind Source Separation and Non-negative Matrix Factorization Tatsuya Yokota Tokyo Institute of Technology November 29, 2011 November 29, 2011 1/26
  • 2. Outline . ..1 Blind Source Separation . ..2 Non-negative Matrix Factrization . ..3 Experiments . ..4 Summary November 29, 2011 2/26
  • 3. What’s a Blind Source Separation Blind Source Separation is a method to estimate original signals from observed signals which consist of mixed original signals and noise. November 29, 2011 3/26
  • 4. Example of BSS BSS is often used for Speech analysis and Image analysis. November 29, 2011 4/26
  • 5. Example of BSS (cont’d) BSS is also very important for brain signal analysis. November 29, 2011 5/26
  • 6. Model Formalization The problem of BSS is formalized as follow: The matrix X ∈ Rm×d (1) denotes original signals, where m is number of original signals, and d is dimension of one signal. We consider that the observed signals Y ∈ Rn×d are given by linear mixing system as Y = AX + E, (2) where A ∈ Rn×m is the unknown mixing matrix and E ∈ Rn×d denotes a noise. Basically, n ≥ m. The goal of BSS is to estimate ˆA and ˆX so that ˆX provides unknown original signal as possible. November 29, 2011 6/26
  • 7. Kinds of BSS Methods Actually, degree of freedom of BSS model is very high to estimate A and X. Bcause there are a huge number of combinations (A, X) which satisfy Y = AX + E. Therefore, we need some constraint to solve the BSS problem such as: PCA : orthogonal constraint SCA : sparsity constraint NMF : non-negativity constraint ICA : independency constraint In this way, there are many methods to solve the BSS problem depending on the constraints. What we use is depend on subject matter. I will introduce about Non-negative Matrix Factrization(NMF) based technique for BSS. November 29, 2011 7/26
  • 8. Non-negativity Non-negativity means that value is 0 or positive (i.e. at least 0). Non-negative matrix is a matrix which its all elements are at least 0. For example 1 10 −10 1 /∈ R2×2 + , 1 10 0 1 ∈ R2×2 + . (3) November 29, 2011 8/26
  • 9. Why nonnegativity constraints? This is the key-point of this method. We can say that because many real-world data are nonnegative and the corresponding hidden components have a physical meaning only when nonnegative. Example of nonnegative data, image data (luminance) power spectol of signal (speech, EEG etc.) In general, observed data can be considered as a linear combination of such nonnegative data with noise.(i.e. Y = AX + E) November 29, 2011 9/26
  • 10. Standard NMF Model Consider the standard NMF model, given by: Y = AX + E, (4) A ≥ 0, X ≥ 0. (5) For simply, we denote A ≥ 0 as nonnegativity of matrix A. The NMF problem is given by: . NMF problem .. . . .. . . minimize 1 2 ||Y − AX||2 F (6) subject to A ≥ 0, X ≥ 0 (7) I will introduce the alternating least square(ALS) algorithm to solve this problem. November 29, 2011 10/26
  • 11. ALS algorithm Our goal is to estimate ( ˆA, ˆX) = minA,X 1 2 ||Y − AX||2 F , s.t. A ≥ 0, X ≥ 0. However, it is very difficult to solve such a multivariate problem. The ALS strategy is to solve following sub-problems alternately. . sub-problem for A .. . . .. . . min 1 2 ||Y − AX||2 F (8) s.t. A ≥ 0 (9) . sub-problem for X .. . . .. . . min 1 2 ||Y − AX||2 F (10) s.t. X ≥ 0 (11) November 29, 2011 11/26
  • 12. How to Solve the sub-problem Solving procedure of sub-problem for A is only two steps as follow: . sub-procedure for A .. . . .. . . ...1 A ← argminA 1 2 ||Y − AX||2 F ...2 A ← [A]+ where [a]+ = max(ε, a) rectify to enforce nonnegativity. In a similar way, we can solve sub-problem for X as follow: . sub-prodecure for X .. . . .. . . ...1 X ← argminX 1 2 ||Y − AX||2 F ...2 X ← [X]+ November 29, 2011 12/26
  • 13. Solution of Least Squares Here, I show basic calculation of least squares. First, objective function can be transformed as 1 2 ||Y − AX||2 F (12) = 1 2 tr(Y − AX)(Y − AX)T (13) = 1 2 tr(Y Y T − 2AXY T + AXXT AT ) (14) The stationary points ˆA can be found by equating the gradient components to 0: ∂ ∂AT 1 2 tr(Y Y T − 2AXY T + AXXT AT ) (15) = −XY T + XXT AT = 0 (16) Therefore, ˆAT = (XXT )−1 XY T . (17) November 29, 2011 13/26
  • 14. Solution of Least Squares (cont’d) In similar way, we can also obtain a solution of X. The objective function can be transformed as 1 2 ||Y − AX||2 F (18) = 1 2 tr(Y − AX)T (Y − AX) (19) = 1 2 tr(Y T Y − 2XT AT Y + XT AT AX) (20) The stationary points ˆX can be found by equating the gradient components to 0: ∂ ∂X 1 2 tr(Y T Y − 2XT AT Y + XT AT AX) (21) = −AT Y + AT AX = 0 (22) Therefore, ˆX = (AT A)−1 AT Y. (23) November 29, 2011 14/26
  • 15. Implimentation Implimentation in “octave” is very simple: i=0; while 1 i++; printf("%d: %f n",i,sqrt(sum(sum(E.^2)))); if sqrt(sum(sum(E.^2))) < er || i > maxiter break; end At=(X*X’+ 1e-5*eye(J))(X*Y’);A=At’; A(A < 0)=1e-16; X=(A’*A + 1e-5*eye(J))(A’*Y); X(X < 0)=1e-16; E=Y-A*X; end November 29, 2011 15/26
  • 16. Experiments: Artificial Data 1 -0.5 0 0.5 1 1.5 0 50 100 150 200 (a) sin2(t) -0.5 0 0.5 1 1.5 0 50 100 150 200 (b) cos2(t) Figure: Original Signals -0.5 0 0.5 1 1.5 0 50 100 150 200 (a) mixed (observed) signal 1 -0.5 0 0.5 1 1.5 0 50 100 150 200 (b) mixed (observed) signal 2 Figure: Observed Signals -0.5 0 0.5 1 1.5 0 50 100 150 200 (a) estimated signal 1 -0.5 0 0.5 1 1.5 0 50 100 150 200 (b) estimated signal 2 Figure: Estimated Signals We can see that results are almost separated. November 29, 2011 16/26
  • 17. Experiments: Artificial Data 2 -0.5 0 0.5 1 1.5 0 50 100 150 200 (a) sin2(t) -0.5 0 0.5 1 1.5 0 50 100 150 200 (b) sin2(2t) Figure: Original Signals -0.5 0 0.5 1 1.5 0 50 100 150 200 (a) mixed (observed) signal 1 -0.5 0 0.5 1 1.5 0 50 100 150 200 (b) mixed (observed) signal 2 Figure: Observed Signals -0.5 0 0.5 1 1.5 0 50 100 150 200 (a) estimated signal 1 -0.5 0 0.5 1 1.5 0 50 100 150 200 (b) estimated signal 2 Figure: Estimated Signals We can see that results are not so separated. November 29, 2011 17/26
  • 18. Experiments: Real Image 1 (a) newyork (b) shanghai Figure: Original Signals (a) mixed (observed) signal 1 (b) mixed (observed) signal 2 Figure: Observed Signals (a) estimated signal 1 (b) estimated signal 2 Figure: Estimated Signals We can see that results are almost separated. November 29, 2011 18/26
  • 19. Experiments: Real Image 2 (a) rock (b) pig Figure: Original Signals (a) mixed (observed) signal 1 (b) mixed (observed) signal 2 Figure: Observed Signals (a) estimated signal 1 (b) estimated signal 2 Figure: Estimated Signals We can see that estimate 2 is not so separated. November 29, 2011 19/26
  • 20. Experiments: Real Image 3 (a) nyc (b) sha (c) rock (d) pig (e) obs1 (f) obs2 (g) obs3 (h) obs4 Figure: Ori. & Obs. (a) estimated signal 1 (b) estimated signal 2 (c) estimated signal 3 (d) estimated signal 4 Figure: Estimated Signals: Results are not good. Signals are still mixed. November 29, 2011 20/26
  • 21. Issues There are some issues of this method as follow: Solution is non-unique (i.e. result is depend on starting point) It is not enough only by non-negativity constraint (shoud consider independency of each components) When number of estimate signals is large, features of components are overlapped. And we should consider more general issues of BSS as follow: Number of original signals is unknown (How can we decide its number?) November 29, 2011 21/26
  • 22. About this research area In this research area, many method for BSS are studied and proposed as follow: KL-divergence based NMF [Honkela et al., 2011] Alpha-Beta divergences based NMF [Cichocki et al., 2011] Non-Gaussianity based ICA [Hyv¨arinen et al., 2001] Kurtosis based ICA Negentropy based ICA Solving method for ICA gradient method fast fixed-point algorithm [Hyv¨arinen and Oja, 1997] MLE based ICA Mutual information based ICA Non-linear ICA Tensor ICA In this way, this research area is very broad!! November 29, 2011 22/26
  • 23. Summary I introduced about BSS problem and basic NMF tequnique from [Cichocki et al., 2009]. EEG classification is very difficult problem. . Future Work .. . . .. . . To study about extension of NMF and Independent Component Analysis (ICA). To apply the BSS problem, EEG analysis, and some pattern recognition problem. November 29, 2011 23/26
  • 24. Bibliography I [Cichocki et al., 2011] Cichocki, A., Cruces, S., and Amari, S.-i. (2011). Generalized alpha-beta divergences and their application to robust nonnegative matrix factorization. Entropy, 13(1):134–170. [Cichocki et al., 2009] Cichocki, A., Zdunek, R., Phan, A. H., and Amari, S. (2009). Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-way Data Analysis. Wiley. [Honkela et al., 2011] Honkela, T., Duch, W., Girolami, M., and Kaski, S., editors (2011). Kullback-Leibler Divergence for Nonnegative Matrix Factorization, volume 6791 of Lecture Notes in Computer Science. Springer Berlin / Heidelberg. [Hyv¨arinen et al., 2001] Hyv¨arinen, A., Karhunen, J., and Oja, E. (2001). Independent Component Analysis. Wiley. November 29, 2011 24/26
  • 25. Bibliography II [Hyv¨arinen and Oja, 1997] Hyv¨arinen, A. and Oja, E. (1997). A fast fixed-point algorithm for independent component analysis. Neural Computation, 9:1483–1492. November 29, 2011 25/26
  • 26. Thank you for listening November 29, 2011 26/26