Subset selection for matrices in spectral norm
By: Ivan Kozyrev, Alexander Osinsky
Potential Business Impact:
Selects the most useful data columns so that large numerical computations run faster and more reliably.
We address the subset selection problem for matrices, where the goal is to select a subset of $k$ columns from a "short-and-fat" matrix $X \in \mathbb{R}^{m \times n}$ so that the pseudoinverse of the sampled submatrix has the smallest possible spectral or Frobenius norm. For the NP-hard spectral-norm variant, we propose a new deterministic approximation algorithm. Our method refines the potential-based framework of spectral sparsification by specializing it to a single barrier function. This key modification enables direct, unweighted column selection, bypassing the intermediate weighting step required by previous approaches, and it allows a novel adaptive update strategy for the barrier. The approach yields a new, explicit bound on the approximation quality that improves upon existing guarantees in key parameter regimes, without increasing the asymptotic computational complexity. Furthermore, numerical experiments demonstrate that the proposed method consistently outperforms its direct competitors. A complete C++ implementation is provided to support our findings and facilitate future research.
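To pin down the objective, the following is a minimal, illustrative C++ sketch (assuming the Eigen library) of a naive greedy baseline for the spectral-norm variant: at each step it adds the column that maximizes the smallest singular value of the selected submatrix, since $\|X_S^+\|_2 = 1/\sigma_{\min}(X_S)$ once $X_S$ has full row rank. This is not the authors' barrier-based algorithm; the function names and the Eigen dependency are illustrative assumptions.

```cpp
// Illustrative sketch only: a naive greedy baseline for the spectral-norm
// subset selection objective, NOT the paper's barrier-based algorithm.
// Assumes the Eigen linear-algebra library is available.
#include <Eigen/Dense>
#include <iostream>
#include <vector>

// Smallest singular value of A; when the selected submatrix X_S has full
// row rank, ||X_S^+||_2 = 1 / sigma_min(X_S).
double sigmaMin(const Eigen::MatrixXd& A) {
    Eigen::JacobiSVD<Eigen::MatrixXd> svd(A);  // singular values only
    return svd.singularValues().minCoeff();
}

// Greedily select k columns of X (k <= n), at each step adding the column
// that maximizes the smallest singular value of the selected submatrix.
std::vector<int> greedySelect(const Eigen::MatrixXd& X, int k) {
    const int n = static_cast<int>(X.cols());
    std::vector<int> selected;
    std::vector<bool> used(n, false);
    for (int step = 0; step < k; ++step) {
        Eigen::MatrixXd S(X.rows(), step + 1);
        for (int c = 0; c < step; ++c) S.col(c) = X.col(selected[c]);
        int best = -1;
        double bestSigma = -1.0;
        for (int j = 0; j < n; ++j) {
            if (used[j]) continue;
            S.col(step) = X.col(j);  // tentatively append column j
            const double s = sigmaMin(S);
            if (s > bestSigma) { bestSigma = s; best = j; }
        }
        used[best] = true;
        selected.push_back(best);
    }
    return selected;
}

int main() {
    // Small "short-and-fat" example: m = 3 rows, n = 8 columns, k = 3.
    const Eigen::MatrixXd X = Eigen::MatrixXd::Random(3, 8);
    const std::vector<int> cols = greedySelect(X, 3);
    Eigen::MatrixXd XS(X.rows(), static_cast<int>(cols.size()));
    for (int c = 0; c < static_cast<int>(cols.size()); ++c)
        XS.col(c) = X.col(cols[c]);
    std::cout << "selected columns:";
    for (int j : cols) std::cout << ' ' << j;
    std::cout << "\n||X_S^+||_2 = " << 1.0 / sigmaMin(XS) << "\n";
    return 0;
}
```

Each greedy step here costs one SVD per candidate column, so this baseline is far more expensive than a potential-based scheme; it is meant only to make the quantity being optimized concrete.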
Similar Papers
Many (most?) column subset selection criteria are NP-hard
Numerical Analysis
Shows that picking the best data columns exactly is too hard for computers under most criteria.
New Lower Bounds for the Minimum Singular Value in Matrix Selection
Functional Analysis
Proves new guarantees on how good the chosen data pieces can be.
Efficient QR-based Column Subset Selection through Randomized Sparse Embeddings
Numerical Analysis
Finds important data in huge spreadsheets faster.