SAD Neural Networks: Divergent Gradient Flows and Asymptotic Optimality via o-minimal Structures
By: Julian Kranz, Davide Gallon, Steffen Dereich, and more
Potential Business Impact:
Explains why training can push a network's weights toward infinity even as its error keeps shrinking, which can inform how models are initialized and when training is stopped.
We study gradient flows for loss landscapes of fully connected feedforward neural networks with commonly used continuously differentiable activation functions such as the logistic, hyperbolic tangent, softplus, or GELU function. We prove that the gradient flow either converges to a critical point or diverges to infinity while the loss converges to an asymptotic critical value. Moreover, we prove the existence of a threshold $\varepsilon>0$ such that the loss value of any gradient flow initialized at most $\varepsilon$ above the optimal level converges to it. For polynomial target functions and sufficiently large architectures and data sets, we prove that the optimal loss value is zero and can only be realized asymptotically. From this setting, we deduce our main result that any gradient flow with sufficiently good initialization diverges to infinity. Our proof relies heavily on the geometry of o-minimal structures. We confirm these theoretical findings with numerical experiments and extend our investigation to real-world scenarios, where we observe analogous behavior.
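To make the divergence phenomenon concrete, here is a minimal numerical sketch (not the authors' code): plain gradient descent, viewed as a discretization of the gradient flow, on a small tanh network fitting a polynomial target. The architecture (one hidden layer of width 16), the target $x^2$, the learning rate, and the step count are illustrative assumptions; under the paper's result one would expect the loss to approach zero while the parameter norm keeps growing.

# Illustrative sketch only; architecture, target, and hyperparameters are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Data: inputs on [-1, 1], polynomial target f(x) = x^2 (assumed example).
x = np.linspace(-1.0, 1.0, 64).reshape(-1, 1)
y = x ** 2

# One hidden layer with tanh activation, width 16 (assumed architecture).
d_in, width = 1, 16
W1 = rng.normal(scale=0.5, size=(d_in, width))
b1 = np.zeros(width)
W2 = rng.normal(scale=0.5, size=(width, 1))
b2 = np.zeros(1)

lr, steps = 0.05, 200_000
for t in range(steps):
    h = np.tanh(x @ W1 + b1)          # hidden activations
    pred = h @ W2 + b2                # network output
    err = pred - y
    loss = 0.5 * np.mean(err ** 2)    # squared loss

    # Backpropagation for plain gradient descent (discretized gradient flow).
    g_pred = err / len(x)
    gW2 = h.T @ g_pred
    gb2 = g_pred.sum(axis=0)
    g_h = (g_pred @ W2.T) * (1.0 - h ** 2)
    gW1 = x.T @ g_h
    gb1 = g_h.sum(axis=0)

    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

    if t % 20_000 == 0:
        norm = np.sqrt(sum((p ** 2).sum() for p in (W1, b1, W2, b2)))
        print(f"step {t:7d}  loss {loss:.3e}  parameter norm {norm:.2f}")

Monitoring the loss alongside the parameter norm, rather than the loss alone, is the point of the sketch: a loss that keeps improving is compatible with weights that never converge.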
Similar Papers
Gradient Flow Equations for Deep Linear Neural Networks: A Survey from a Network Perspective
Machine Learning (CS)
Surveys how gradient flow equations explain training dynamics in deep linear networks.
On the (almost) Global Exponential Convergence of the Overparameterized Policy Optimization for the LQR Problem
Optimization and Control
Proves that an overparameterized policy optimization method for the classic LQR control problem converges exponentially from almost any initialization.
Non-Singularity of the Gradient Descent map for Neural Networks with Piecewise Analytic Activations
Optimization and Control
Shows the gradient descent map is non-singular for networks with piecewise analytic activations.