A Globally Optimal Analytic Solution for Semi-Nonnegative Matrix Factorization with Nonnegative or Mixed Inputs
By: Lu Chenggang
Potential Business Impact:
Finds better patterns in mixed-sign data.
Semi-Nonnegative Matrix Factorization (semi-NMF) extends classical Nonnegative Matrix Factorization (NMF) by allowing the basis matrix to contain both positive and negative entries, making it suitable for decomposing data with mixed signs. However, most existing semi-NMF algorithms are iterative, optimize a non-convex objective, and are prone to local minima. In this paper, we propose a novel method that yields a globally optimal solution to the semi-NMF problem under the Frobenius norm, through an orthogonal decomposition derived from the scatter matrix of the input data. We rigorously prove that our solution attains the global minimum of the reconstruction error. Furthermore, we demonstrate that when the input matrix is nonnegative, our method often achieves lower reconstruction error than standard NMF algorithms, although the resulting basis matrix is not guaranteed to be nonnegative. In particular, in low-rank cases such as rank 1 or 2, our solution reduces exactly to a nonnegative factorization, recovering the NMF structure. We validate our approach through experiments on both synthetic data and the UCI Wine dataset, showing that our method consistently outperforms existing NMF and semi-NMF methods in terms of reconstruction accuracy. These results confirm that our globally optimal, non-iterative formulation offers both theoretical guarantees and empirical advantages, providing a new perspective on matrix factorization in optimization and data analysis.
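The abstract does not spell out the construction, so the sketch below is not the paper's algorithm. It only illustrates the benchmark that any rank-k factorization under the Frobenius norm is measured against: by the Eckart-Young theorem, projecting the data onto the top-k eigenvectors of its scatter matrix gives the smallest possible rank-k reconstruction error, so no NMF or semi-NMF solution of the same rank can do better. The function name `rank_k_lower_bound`, the random test matrix, and the choice of the uncentered scatter matrix `X @ X.T` are assumptions made for illustration.

```python
import numpy as np

def rank_k_lower_bound(X, k):
    """Best achievable rank-k Frobenius error for X, plus the orthonormal basis.

    Assumption: "scatter matrix" is taken to be the uncentered X @ X.T;
    the paper may define it differently (e.g. after centering).
    """
    S = X @ X.T                            # scatter matrix, shape (m, m)
    eigvals, eigvecs = np.linalg.eigh(S)   # eigenvalues in ascending order
    U = eigvecs[:, -k:]                    # top-k eigenvectors (orthonormal columns)
    X_hat = U @ (U.T @ X)                  # orthogonal projection onto span(U)
    return np.linalg.norm(X - X_hat, "fro"), U

# Mixed-sign synthetic data: 6 features, 40 samples (illustrative only).
rng = np.random.default_rng(0)
X = rng.standard_normal((6, 40))
err, U = rank_k_lower_bound(X, 2)
print(f"rank-2 lower bound on reconstruction error: {err:.4f}")
```

Comparing the error of any iterative NMF or semi-NMF solver against `err` shows how far that solver is from the unconstrained optimum at the same rank. Whether the sign constraint on the coefficient matrix can be met without giving up that optimum is the question the paper addresses, and this sketch does not attempt to answer it.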
Similar Papers
A Provably-Correct and Robust Convex Model for Smooth Separable NMF
Numerical Analysis
Finds hidden patterns in data, even with noise.
Non-Negative Matrix Factorization Using Non-Von Neumann Computers
Quantum Physics
New computer helps solve hard math problems faster.
Robustness of Minimum-Volume Nonnegative Matrix Factorization under an Expanded Sufficiently Scattered Condition
Machine Learning (Stat)
Makes computer analysis work better even with messy data.