Zero Generalization Error Theorem for Random Interpolators via Algebraic Geometry
By: Naoki Yoshida, Isao Ishikawa, Masaaki Imaizumi
Potential Business Impact:
Shows that AI models can learn perfectly once they have enough training data.
We theoretically demonstrate that the generalization error of interpolators for machine learning models under teacher-student settings becomes zero once the number of training samples exceeds a certain threshold. Understanding the high generalization ability of large-scale models such as deep neural networks (DNNs) remains one of the central open problems in machine learning theory. While recent theoretical studies have attributed this phenomenon to the implicit bias of stochastic gradient descent (SGD) toward well-generalizing solutions, empirical evidence indicates that it primarily stems from properties of the model itself. Specifically, even randomly sampled interpolators, which are parameters that achieve zero training error, have been observed to generalize effectively. In this study, under a teacher-student framework, we prove that the generalization error of randomly sampled interpolators becomes exactly zero once the number of training samples exceeds a threshold determined by the geometric structure of the interpolator set in parameter space. As a proof technique, we leverage tools from algebraic geometry to mathematically characterize this geometric structure.
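To give a concrete feel for the threshold phenomenon the abstract describes, the sketch below sets up the simplest possible teacher-student instance: a noiseless linear teacher and a linear student class, where the interpolator set is an affine subspace that collapses to the teacher once the sample count reaches the parameter dimension. This is only an illustrative toy, not the paper's construction (the paper treats more general models via algebraic geometry), and the names `random_interpolator`, `generalization_error`, and the choice of Gaussian inputs are assumptions made here for illustration.

```python
import numpy as np

# Toy illustration (not the paper's setting): linear teacher-student model.
# Interpolators of n samples form an affine subspace of parameter space;
# once n reaches the threshold d (inputs in general position), that set
# collapses to the teacher, so any randomly sampled interpolator has zero
# generalization error. Below the threshold it generally does not.

rng = np.random.default_rng(0)
d = 20                                   # parameter dimension
teacher = rng.normal(size=d)             # ground-truth teacher parameters

def random_interpolator(X, y):
    """Sample one interpolator: the minimum-norm solution plus a random
    element of the null space of X (every such point has zero training error)."""
    w_min = np.linalg.pinv(X) @ y
    _, s, Vt = np.linalg.svd(X)          # rows of Vt beyond rank(X) span null(X)
    rank = int(np.sum(s > 1e-10))
    null_basis = Vt[rank:]
    if len(null_basis) == 0:
        return w_min                     # interpolator is unique
    coeffs = rng.normal(size=len(null_basis))
    return w_min + coeffs @ null_basis

def generalization_error(w, n_test=10_000):
    X_test = rng.normal(size=(n_test, d))
    return np.mean((X_test @ w - X_test @ teacher) ** 2)

for n in [5, 10, 19, 20, 25]:
    X = rng.normal(size=(n, d))
    y = X @ teacher                      # noiseless teacher labels
    w = random_interpolator(X, y)
    print(f"n = {n:3d}  generalization error = {generalization_error(w):.3e}")
# Expected: strictly positive error for n < d, numerically zero for n >= d.
```

In this toy case the threshold is simply the parameter dimension d; the paper's contribution is to characterize the analogous threshold for richer model classes through the geometry of the interpolator set.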
Similar Papers
Multitask Learning with Stochastic Interpolants
Machine Learning (CS)
Creates AI that learns many tasks without retraining.
Likely Interpolants of Generative Models
Machine Learning (CS)
Makes AI create smoother, more realistic images.