Approximate Replicability in Learning
By: Max Hopkins, Russell Impagliazzo, Christopher Ye
Potential Business Impact:
Lets machine learning algorithms give consistent, reproducible results without needing substantially more data.
Replicability, introduced by Impagliazzo et al. (STOC '22), is the notion that algorithms should remain stable under a resampling of their inputs (given access to shared randomness). While replicability is a strong and interesting notion of stability, its cost can be prohibitive: there is no replicable algorithm, for instance, for tasks as simple as threshold learning (Bun et al. STOC '23). Given such strong impossibility results, we ask: under what approximate notions of replicability is learning possible? In this work, we propose three natural relaxations of replicability in the context of PAC learning: (1) Pointwise: the learner must be consistent on any fixed input, but not across all inputs simultaneously, (2) Approximate: the learner must output hypotheses that classify most of the distribution consistently, (3) Semi: the algorithm is fully replicable, but may additionally use shared unlabeled samples. In all three cases, for constant replicability parameters, we obtain sample-optimal agnostic PAC learners: (1) and (2) are achievable for "free" using $\Theta(d/\alpha^2)$ samples, while (3) requires $\Theta(d^2/\alpha^2)$ labeled samples.
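For readers who want the formal statement, the following is a sketch of the standard way replicability is formalized and of the approximate relaxation described above; exact parameterizations and constants follow the paper, so treat this as an illustrative paraphrase rather than the authors' precise definitions. A randomized learner $A$ is $\rho$-replicable if, for two independent samples $S, S' \sim \mathcal{D}^n$ and a shared random string $r$,
$$\Pr_{S, S', r}\big[A(S; r) = A(S'; r)\big] \ge 1 - \rho.$$
Approximate replicability relaxes exact equality of the two output hypotheses to agreement on most of the distribution: writing $h = A(S; r)$ and $h' = A(S'; r)$,
$$\Pr_{S, S', r}\Big[\Pr_{x \sim \mathcal{D}}\big[h(x) \neq h'(x)\big] \le \beta\Big] \ge 1 - \rho,$$
while pointwise replicability asks only that, for each fixed input $x$, the label $h(x)$ agrees across reruns with high probability, rather than for all $x$ simultaneously.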
Similar Papers
List Replicable Reinforcement Learning
Machine Learning (CS)
Makes computer learning more reliable and predictable.
Replicable Reinforcement Learning with Linear Function Approximation
Machine Learning (CS)
Makes AI learn the same way every time.
Learning under Distributional Drift: Reproducibility as an Intrinsic Statistical Resource
Machine Learning (CS)
Limits how fast learning systems can adapt.