New Statistical and Computational Results for Learning Junta Distributions
By: Lorenzo Beretta
Potential Business Impact:
Finds hidden patterns in data using few clues.
We study the problem of learning junta distributions on $\{0, 1\}^n$, where a distribution is a $k$-junta if its probability mass function depends on a subset of at most $k$ variables. We make two main contributions:
- We show that learning $k$-junta distributions is \emph{computationally} equivalent to learning $k$-parity functions with noise (LPN), a landmark problem in computational learning theory.
- We design an algorithm for learning junta distributions whose statistical complexity is optimal, up to polylogarithmic factors. Computationally, our algorithm matches the complexity of previous (non-sample-optimal) algorithms.

Combined, our two contributions imply that our algorithm cannot be significantly improved, statistically or computationally, barring a breakthrough for LPN.
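To make the definition concrete, here is a minimal sketch (not from the paper) of sampling from a $k$-junta distribution: the probability mass function depends only on a set of $k$ relevant coordinates, so conditioned on those coordinates the remaining $n - k$ bits are uniform. The function name `sample_k_junta` and the dictionary-based pmf representation are illustrative assumptions, not the paper's notation.

```python
import random

def sample_k_junta(n, relevant, pmf, rng=random):
    """Sample x in {0,1}^n from a k-junta distribution.

    relevant: list of k coordinate indices the pmf depends on.
    pmf: dict mapping k-bit tuples to probabilities (summing to 1),
         giving the marginal on the relevant coordinates.
    All other coordinates are uniform, independent of the junta.
    """
    # Draw the assignment to the relevant coordinates via inverse CDF.
    r = rng.random()
    acc = 0.0
    for assignment, p in pmf.items():
        acc += p
        if r <= acc:
            break
    # Fill the remaining coordinates uniformly at random.
    x = [rng.randint(0, 1) for _ in range(n)]
    for idx, bit in zip(relevant, assignment):
        x[idx] = bit
    return tuple(x)
```

For example, with `pmf = {(1, 1): 1.0}` and `relevant = [0, 3]`, every sample has $x_0 = x_3 = 1$ while the other coordinates are uniform; the learner's task is to recover the relevant set and pmf from such samples alone.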
Similar Papers
Learning Juntas under Markov Random Fields
Machine Learning (CS)
Teaches computers to learn patterns from data.
A Mysterious Connection Between Tolerant Junta Testing and Agnostically Learning Conjunctions
Data Structures and Algorithms
Finds patterns in data faster.
Feature Selection and Junta Testing are Statistically Equivalent
Machine Learning (CS)
Finds important clues in data faster.