Convergence and Generalization of Anti-Regularization for Parametric Models
By: Dongseok Kim, Wonjun Jeong, Gisung Oh
Potential Business Impact:
Helps computers learn better with less data.
We propose Anti-regularization (AR), which adds a sign-reversed reward term to the loss to intentionally increase model expressivity in the small-sample regime, and then attenuates this intervention with a power-law decay as the sample size grows. We formalize spectral safety and trust-region conditions, and design a lightweight stability safeguard that combines a projection operator with gradient clipping, ensuring stable intervention under stated assumptions. Our analysis spans linear smoothers and the Neural Tangent Kernel (NTK) regime, providing practical guidance on selecting the decay exponent by balancing empirical risk against variance. Empirically, AR reduces underfitting while preserving generalization and improving calibration in both regression and classification. Ablation studies confirm that the decay schedule and the stability safeguard are critical to preventing overfitting and numerical instability. We further examine a degrees-of-freedom targeting schedule that keeps per-sample complexity approximately constant. AR is simple to implement and reproducible, integrating cleanly into standard empirical risk minimization pipelines. It enables robust learning in data- and resource-constrained settings by intervening only when beneficial and fading away when unnecessary.
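As a rough illustration of the mechanism described in the abstract, the sketch below fits a linear model with an anti-regularized loss under assumed choices: a sign-reversed squared-norm reward, a power-law decay lambda_n = lambda0 * n^(-gamma), gradient clipping, and projection onto a norm ball standing in for the trust-region safeguard. The specific functional forms, hyperparameter values, and function names here are illustrative assumptions, not the paper's exact formulation.

# Hypothetical sketch of Anti-Regularization (AR) for a linear model.
# The reward term, decay schedule, and safeguard constants below are assumed
# for illustration; lambda0, gamma, max_norm, and clip_norm are not from the paper.
import numpy as np

def ar_coefficient(n_samples, lambda0=0.1, gamma=0.5):
    """Power-law decayed AR strength: lambda_n = lambda0 * n^(-gamma)."""
    return lambda0 * n_samples ** (-gamma)

def fit_ar_linear(X, y, lr=0.01, steps=500, lambda0=0.1, gamma=0.5,
                  max_norm=10.0, clip_norm=1.0):
    n, d = X.shape
    lam = ar_coefficient(n, lambda0, gamma)
    w = np.zeros(d)
    for _ in range(steps):
        residual = X @ w - y
        # Sign-reversed penalty: the AR term rewards larger ||w||^2,
        # so its gradient is subtracted rather than added.
        grad = (2.0 / n) * X.T @ residual - 2.0 * lam * w
        # Stability safeguard 1: gradient clipping.
        g_norm = np.linalg.norm(grad)
        if g_norm > clip_norm:
            grad *= clip_norm / g_norm
        w -= lr * grad
        # Stability safeguard 2: projection onto a norm ball (trust region).
        w_norm = np.linalg.norm(w)
        if w_norm > max_norm:
            w *= max_norm / w_norm
    return w

# Toy usage: AR intervenes strongly for small n and fades as n grows.
rng = np.random.default_rng(0)
for n in (20, 2000):
    X = rng.normal(size=(n, 5))
    y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.normal(size=n)
    print(n, ar_coefficient(n), fit_ar_linear(X, y).round(2))

In this toy run the AR strength at n = 20 is several times larger than at n = 2000, matching the intended behavior of intervening in the small-sample regime and fading as data accumulates.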
Similar Papers
Adaptive Divergence Regularized Policy Optimization for Fine-tuning Generative Models
Machine Learning (CS)
Helps AI learn better and make cooler pictures.
Quantization through Piecewise-Affine Regularization: Optimization and Statistical Guarantees
Machine Learning (CS)
Makes computers learn better with less data.