Computational lower bounds in latent models: clustering, sparse-clustering, biclustering
By: Bertrand Even, Christophe Giraud, Nicolas Verzelen
Potential Business Impact:
Finds limits of computer problem-solving abilities.
In many high-dimensional problems, like sparse-PCA, planted clique, or clustering, the best known algorithms with polynomial time complexity fail to reach the statistical performance provably achievable by algorithms free of computational constraints. This observation has given rise to the conjecture of the existence, for some problems, of gaps -- so called statistical-computational gaps -- between the best possible statistical performance achievable without computational constraints, and the best performance achievable with poly-time algorithms. A powerful approach to assess the best performance achievable in poly-time is to investigate the best performance achievable by polynomials with low-degree. We build on the seminal paper of Schramm and Wein (2022) and propose a new scheme to derive lower bounds on the performance of low-degree polynomials in some latent space models. By better leveraging the latent structures, we obtain new and sharper results, with simplified proofs. We then instantiate our scheme to provide computational lower bounds for the problems of clustering, sparse clustering, and biclustering. We also prove matching upper-bounds and some additional statistical results, in order to provide a comprehensive description of the statistical-computational gaps occurring in these three problems.
Similar Papers
Computational Complexity of Statistics: New Insights from Low-Degree Polynomials
Statistics Theory
Makes hard computer problems easier to solve.
Almost-Optimal Local-Search Methods for Sparse Tensor PCA
Statistics Theory
Makes computers find patterns in data better.
Statistical-computational gap in multiple Gaussian graph alignment
Machine Learning (Stat)
Makes computers solve hard graph puzzles faster.