TPV: Parameter Perturbations Through the Lens of Test Prediction Variance
By: Devansh Arpit
We identify test prediction variance (TPV), the first-order sensitivity of model outputs to parameter perturbations around a trained solution, as a unifying quantity that links several classical observations about generalization in deep networks. TPV is a fully label-free object whose trace form separates the geometry of the trained model from the specific perturbation mechanism, allowing a broad family of parameter perturbations, such as SGD noise, label noise, finite-precision noise, and other post-training perturbations, to be analyzed under a single framework. Theoretically, we show that TPV estimated on the training set converges to its test-set value in the overparameterized limit, providing the first result that prediction variance under local parameter perturbations can be inferred from training inputs alone. Empirically, TPV exhibits striking stability across datasets and architectures, including extremely narrow networks, and correlates well with clean test loss. Finally, we demonstrate that modeling pruning as a TPV perturbation yields a simple label-free importance measure that is competitive with state-of-the-art pruning methods, illustrating the practical utility of TPV. Code is available at github.com/devansharpit/TPV.
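To make the label-free character of the quantity concrete, here is a minimal sketch of one way to estimate prediction variance under small isotropic Gaussian parameter perturbations by Monte-Carlo sampling. The function name `tpv_estimate`, the noise scale `sigma`, and the number of draws `n_draws` are illustrative assumptions and not the paper's reference implementation; the paper's trace form and the specific perturbation families it covers (SGD noise, pruning, etc.) may differ.

```python
# Minimal sketch (assumed, not the authors' code): Monte-Carlo estimate of
# prediction variance under small isotropic Gaussian parameter perturbations.
import copy
import torch

@torch.no_grad()
def tpv_estimate(model, inputs, sigma=1e-3, n_draws=10):
    """Average output variance over inputs when trained weights are perturbed.

    Label-free: only the inputs and the trained parameters are used.
    """
    base_state = copy.deepcopy(model.state_dict())
    param_keys = {name for name, _ in model.named_parameters()}
    outputs = []
    for _ in range(n_draws):
        # Add i.i.d. Gaussian noise to parameter tensors only (buffers untouched).
        perturbed = {k: v + sigma * torch.randn_like(v) if k in param_keys else v
                     for k, v in base_state.items()}
        model.load_state_dict(perturbed)
        outputs.append(model(inputs))
    model.load_state_dict(base_state)  # restore the trained weights
    stacked = torch.stack(outputs)     # shape: (n_draws, batch, out_dim)
    # Variance across perturbation draws, averaged over inputs and output units.
    return stacked.var(dim=0, unbiased=True).mean().item()
```

Running this estimator on a batch of training inputs and again on held-out inputs gives a simple way to probe, under these assumptions, the abstract's claim that the training-set estimate tracks the test-set value.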