Replicable Distribution Testing
By: Ilias Diakonikolas, Jingyi Gao, Daniel Kane and more
Potential Business Impact:
Tests if data groups are truly different.
We initiate a systematic investigation of distribution testing in the framework of algorithmic replicability. Specifically, given independent samples from a collection of probability distributions, the goal is to characterize the sample complexity of replicably testing natural properties of the underlying distributions. On the algorithmic front, we develop new replicable algorithms for testing closeness and independence of discrete distributions. On the lower bound front, we develop a new methodology for proving sample complexity lower bounds for replicable testing that may be of broader interest. As an application of our technique, we establish near-optimal sample complexity lower bounds for replicable uniformity testing -- answering an open question from prior work -- and closeness testing.
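To make the notion of replicability concrete, here is a minimal Python sketch. It is not the algorithm from the paper and makes no claim about sample complexity. It assumes the usual meaning of replicability: two runs on independent samples that share the same internal randomness should return the same answer with high probability. The illustration uses a collision statistic for uniformity testing and compares it against a threshold drawn from the shared randomness. The names `collision_statistic`, `replicable_uniformity_test`, and `shared_seed` are hypothetical and chosen only for this example.

```python
import random
from collections import Counter


def collision_statistic(samples):
    """Fraction of sample pairs that collide.

    This is an unbiased estimate of sum_i p_i^2, which equals 1/n
    exactly when the distribution is uniform over n elements.
    """
    m = len(samples)
    if m < 2:
        raise ValueError("need at least two samples")
    counts = Counter(samples)
    colliding_pairs = sum(c * (c - 1) // 2 for c in counts.values())
    return colliding_pairs / (m * (m - 1) / 2)


def replicable_uniformity_test(samples, n, eps, shared_seed):
    """Return True to accept "uniform over n elements", False to reject.

    Assumption for this sketch: shared_seed is the internal randomness
    that is reused across runs, while the samples are drawn fresh.
    """
    stat = collision_statistic(samples)

    # Uniform has collision probability 1/n. Any distribution at total
    # variation distance >= eps from uniform has collision probability
    # at least (1 + 4 * eps**2) / n.
    low = 1.0 / n
    gap = 4.0 * eps ** 2 / n

    # Draw the threshold from the shared randomness, inside the middle
    # half of the gap so both cases keep a margin of gap / 4.
    rng = random.Random(shared_seed)
    threshold = rng.uniform(low + gap / 4, low + 3 * gap / 4)

    return stat <= threshold
```

Two runs that share `shared_seed` compare their statistics against the same threshold, so they can disagree only when the threshold happens to fall between the two statistic values. Randomizing the threshold is what keeps that event unlikely when the statistics concentrate.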
Similar Papers
On the Structure of Replicable Hypothesis Testers
Data Structures and Algorithms
Makes computer tests more trustworthy and reliable.
Distribution Testing in the Presence of Arbitrarily Dominant Noise with Verification Queries
Data Structures and Algorithms
Find hidden patterns in messy data faster.
Sample Complexity of Nonparametric Closeness Testing for Continuous Distributions and Its Application to Causal Discovery with Hidden Confounding
Machine Learning (CS)
Finds cause and effect in complex data.