Nonnegative matrix factorization and the principle of the common cause
By: E. Khalafyan, A. E. Allahverdyan, A. Hovhannisyan
Potential Business Impact:
Finds hidden patterns in pictures, even with noise.
Nonnegative matrix factorization (NMF) is a known unsupervised data-reduction method. The principle of the common cause (PCC) is a basic methodological approach in probabilistic causality, which seeks an independent mixture model for the joint probability of two dependent random variables. It turns out that these two concepts are closely related. This relationship is explored reciprocally for several datasets of gray-scale images, which are conveniently mapped into probability models. On one hand, PCC provides a predictability tool that leads to a robust estimation of the effective rank of NMF. Unlike other estimates (e.g., those based on the Bayesian Information Criteria), our estimate of the rank is stable against weak noise. We show that NMF implemented around this rank produces features (basis images) that are also stable against noise and against seeds of local optimization, thereby effectively resolving the NMF nonidentifiability problem. On the other hand, NMF provides an interesting possibility of implementing PCC in an approximate way, where larger and positively correlated joint probabilities tend to be explained better via the independent mixture model. We work out a clustering method, where data points with the same common cause are grouped into the same cluster. We also show how NMF can be employed for data denoising.
Similar Papers
Applying non-negative matrix factorization with covariates to label matrix for classification
Machine Learning (CS)
Helps computers guess what something is.
Non-Negative Matrix Factorization Using Non-Von Neumann Computers
Quantum Physics
New computer helps solve hard math problems faster.
Highly robust factored principal component analysis for matrix-valued outlier accommodation and explainable detection via matrix minimum covariance determinant
Methodology
Finds bad data points in complex pictures.