New Statistical and Computational Results for Learning Junta Distributions
By: Lorenzo Beretta
Potential Business Impact:
Finds hidden patterns in data using few clues.
We study the problem of learning junta distributions on $\{0, 1\}^n$, where a distribution is a $k$-junta if its probability mass function depends on a subset of at most $k$ variables. We make two main contributions:
- We show that learning $k$-junta distributions is \emph{computationally} equivalent to learning $k$-parity functions with noise (LPN), a landmark problem in computational learning theory.
- We design an algorithm for learning junta distributions whose statistical complexity is optimal, up to polylogarithmic factors. Computationally, our algorithm matches the complexity of previous (non-sample-optimal) algorithms.

Combined, our two contributions imply that our algorithm cannot be significantly improved, statistically or computationally, barring a breakthrough for LPN.
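To make the definition concrete, here is a minimal sketch (not from the paper) of sampling from a $k$-junta distribution: the probability mass function depends only on a set of $k$ relevant coordinates, so conditioned on those coordinates the remaining $n - k$ bits are uniform. The function name `sample_k_junta` and the dictionary-based pmf representation are illustrative assumptions, not the paper's notation.

```python
import random

def sample_k_junta(n, relevant, pmf, rng=random):
    """Sample x in {0,1}^n from a k-junta distribution.

    relevant: list of k coordinate indices the pmf depends on.
    pmf: dict mapping k-bit tuples to probabilities (summing to 1),
         giving the marginal on the relevant coordinates.
    All other coordinates are uniform, independent of the junta.
    """
    # Draw the assignment to the relevant coordinates via inverse CDF.
    r = rng.random()
    acc = 0.0
    for assignment, p in pmf.items():
        acc += p
        if r <= acc:
            break
    # Fill the remaining coordinates uniformly at random.
    x = [rng.randint(0, 1) for _ in range(n)]
    for idx, bit in zip(relevant, assignment):
        x[idx] = bit
    return tuple(x)
```

For example, with `pmf = {(1, 1): 1.0}` and `relevant = [0, 3]`, every sample has $x_0 = x_3 = 1$ while the other coordinates are uniform; the learner's task is to recover the relevant set and pmf from such samples alone.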
Similar Papers
Learning Juntas under Markov Random Fields
Machine Learning (CS)
Teaches computers to learn patterns from data.
A Mysterious Connection Between Tolerant Junta Testing and Agnostically Learning Conjunctions
Data Structures and Algorithms
Finds patterns in data faster.
Feature Selection and Junta Testing are Statistically Equivalent
Machine Learning (CS)
Finds important clues in data faster.