Convergence, Sticking and Escape: Stochastic Dynamics Near Critical Points in SGD

Published: May 24, 2025 | arXiv ID: 2505.18535v1

By: Dmitry Dudukalov, Artem Logachov, Vladimir Lotov, and more

Potential Business Impact:

Explains how quickly SGD-based training settles into a nearby good solution, and when it can stall near a poor starting point.

Business Areas:
A/B Testing, Data and Analytics

We study the convergence properties and escape dynamics of Stochastic Gradient Descent (SGD) in one-dimensional landscapes, separately considering infinite- and finite-variance noise. Our main focus is to identify the time scales on which SGD reliably moves from an initial point to the local minimum in the same "basin". Under suitable conditions on the noise distribution, we prove that SGD converges to the basin's minimum unless the initial point lies too close to a local maximum. In that near-maximum scenario, we show that SGD can linger for a long time in its neighborhood. For initial points near a "sharp" maximum, we show that SGD does not remain stuck there, and we provide results to estimate the probability that it will reach each of the two neighboring minima. Overall, our findings present a nuanced view of SGD's transitions between local maxima and minima, influenced by both noise characteristics and the underlying function geometry.
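
As a rough illustration of the finite-variance setting the abstract describes, the sketch below runs one-dimensional SGD on a hypothetical double-well landscape f(x) = (x^2 - 1)^2, whose two minima at x = -1 and x = +1 are separated by a local maximum at x = 0, and empirically estimates the probability of reaching each neighboring minimum from a start near the maximum. The potential, step size, noise scale, and iteration counts are illustrative assumptions, not the paper's actual construction.

```python
import numpy as np

def grad(x):
    """Gradient of the hypothetical landscape f(x) = (x^2 - 1)^2."""
    return 4.0 * x * (x**2 - 1.0)

def sgd_endpoint(x0, step=0.01, noise_scale=0.1, n_steps=5_000, rng=None):
    """Run SGD from x0 with Gaussian (finite-variance) gradient noise
    and return the final iterate."""
    rng = rng if rng is not None else np.random.default_rng()
    x = x0
    for _ in range(n_steps):
        x -= step * (grad(x) + noise_scale * rng.standard_normal())
    return x

# Start just to the right of the local maximum at x = 0 and estimate
# the probability of ending up in each of the two neighboring basins.
rng = np.random.default_rng(0)
finals = np.array([sgd_endpoint(x0=0.01, rng=rng) for _ in range(400)])
p_right = float(np.mean(finals > 0.0))
print(f"P(right minimum) ~ {p_right:.2f}, P(left minimum) ~ {1 - p_right:.2f}")
```

With these toy parameters, the per-step noise dominates the near-zero gradient at the maximum, so either basin can be reached, with a bias toward the side of the starting point; this mirrors the abstract's point that escape behavior depends on both the noise and the local geometry.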

Page Count
27 pages

Category
Computer Science: Machine Learning (CS)