Score: 1

Sobolev acceleration for neural networks

Published: September 24, 2025 | arXiv ID: 2509.19773v1

By: Jong Kwon Oh, Hanbaek Lyu, Hwijae Son

Potential Business Impact:

Speeds up neural network training and improves generalization, which can lower compute costs and yield better models across AI applications.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Sobolev training, which integrates target derivatives into the loss functions, has been shown to accelerate convergence and improve generalization compared to conventional $L^2$ training. However, the underlying mechanisms of this training method remain only partially understood. In this work, we present the first rigorous theoretical framework proving that Sobolev training accelerates the convergence of Rectified Linear Unit (ReLU) networks. Under a student-teacher framework with Gaussian inputs and shallow architectures, we derive exact formulas for population gradients and Hessians, and quantify the improvements in conditioning of the loss landscape and gradient-flow convergence rates. Extensive numerical experiments validate our theoretical findings and show that the benefits of Sobolev training extend to modern deep learning tasks.
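
The abstract describes Sobolev training as augmenting the usual $L^2$ loss with a term that matches target derivatives. Below is a minimal PyTorch sketch of that loss in the paper's student-teacher setting (shallow ReLU networks, Gaussian inputs); the widths, learning rate, and step count are illustrative assumptions, not values from the paper.

```python
import torch

torch.manual_seed(0)
d, width, n = 8, 32, 256  # input dim, hidden width, batch size (assumed)

# Shallow ReLU teacher and student, as in the paper's setting
teacher = torch.nn.Sequential(torch.nn.Linear(d, width), torch.nn.ReLU(),
                              torch.nn.Linear(width, 1))
student = torch.nn.Sequential(torch.nn.Linear(d, width), torch.nn.ReLU(),
                              torch.nn.Linear(width, 1))

opt = torch.optim.SGD(student.parameters(), lr=1e-2)
for step in range(500):
    x = torch.randn(n, d, requires_grad=True)  # Gaussian inputs
    y_t = teacher(x)
    # Target derivatives d teacher(x)/dx, detached so the teacher is fixed
    g_t = torch.autograd.grad(y_t.sum(), x)[0].detach()
    y_s = student(x)
    # Student derivatives, kept in the graph so loss.backward() trains them
    g_s = torch.autograd.grad(y_s.sum(), x, create_graph=True)[0]
    # Sobolev loss: L^2 value term plus derivative-matching term
    loss = ((y_s - y_t.detach()) ** 2).mean() \
         + ((g_s - g_t) ** 2).sum(dim=1).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 100 == 0:
        print(f"step {step:4d}  sobolev loss {loss.item():.4f}")
```

Dropping the derivative term recovers plain $L^2$ training, which is the baseline the paper compares against.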

Country of Origin
πŸ‡ΊπŸ‡Έ πŸ‡°πŸ‡· United States, Korea, Republic of

Page Count
24 pages

Category
Computer Science:
Machine Learning (CS)