Near-Optimal Convergence of Accelerated Gradient Methods under Generalized and $(L_0, L_1)$-Smoothness
By: Alexander Tyurin
Potential Business Impact:
Lets computers solve hard optimization problems faster, using fewer steps.
We study first-order methods for convex optimization problems with functions $f$ satisfying the recently proposed $\ell$-smoothness condition $\|\nabla^{2}f(x)\| \le \ell\left(\|\nabla f(x)\|\right)$, which generalizes both $L$-smoothness and $(L_{0},L_{1})$-smoothness. While accelerated gradient descent (AGD) is known to reach the optimal complexity $O(\sqrt{L} R / \sqrt{\varepsilon})$ under $L$-smoothness, where $\varepsilon$ is the error tolerance and $R$ is the distance between the starting point and an optimal point, existing extensions to $\ell$-smoothness either incur an extra dependence on the initial gradient, suffer exponential factors in $L_{1} R$, or require costly auxiliary sub-routines, leaving open whether an AGD-type $O(\sqrt{\ell(0)} R / \sqrt{\varepsilon})$ rate is possible for small $\varepsilon$, even in the $(L_{0},L_{1})$-smoothness case. We resolve this open question. Leveraging a new Lyapunov function and designing new algorithms, we achieve an $O(\sqrt{\ell(0)} R / \sqrt{\varepsilon})$ oracle complexity for small $\varepsilon$ and virtually any $\ell$. For instance, under $(L_{0},L_{1})$-smoothness, our bound $O(\sqrt{L_0} R / \sqrt{\varepsilon})$ is provably optimal in the small-$\varepsilon$ regime and removes all non-constant multiplicative factors present in prior accelerated algorithms.
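As a point of reference (a standard specialization, not quoted from the abstract): choosing $\ell(s) = L_0 + L_1 s$ recovers the $(L_0, L_1)$-smoothness condition, and since $\ell(0) = L_0$, the general bound above reduces to the stated $(L_0, L_1)$ result:
$$
\|\nabla^{2} f(x)\| \;\le\; L_0 + L_1 \|\nabla f(x)\|
\quad\Longleftrightarrow\quad
\ell(s) = L_0 + L_1 s,
\qquad
O\!\left(\frac{\sqrt{\ell(0)}\, R}{\sqrt{\varepsilon}}\right)
\;=\;
O\!\left(\frac{\sqrt{L_0}\, R}{\sqrt{\varepsilon}}\right).
$$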
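For context, here is a minimal sketch of the classical Nesterov AGD baseline whose $O(\sqrt{L} R / \sqrt{\varepsilon})$ rate the abstract cites; this is the textbook scheme for convex $L$-smooth objectives, not the paper's new $\ell$-smoothness algorithm, and the quadratic example and all names are illustrative assumptions.

```python
import numpy as np

def nesterov_agd(grad, x0, L, num_iters):
    # Classical Nesterov accelerated gradient descent for a convex,
    # L-smooth objective (textbook baseline, not the paper's method).
    x = np.asarray(x0, dtype=float)
    y = x.copy()
    t = 1.0
    for _ in range(num_iters):
        x_next = y - grad(y) / L                           # gradient step at the extrapolated point
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0  # momentum schedule
        y = x_next + ((t - 1.0) / t_next) * (x_next - x)   # extrapolation
        x, t = x_next, t_next
    return x

# Illustrative use: minimize the convex quadratic 0.5 * ||A x - b||^2.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
L = np.linalg.norm(A.T @ A, 2)  # smoothness constant = largest eigenvalue of A^T A
x_star = nesterov_agd(lambda x: A.T @ (A @ x - b), np.zeros(2), L, 200)
```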
Similar Papers
Complexity Lower Bounds of Adaptive Gradient Algorithms for Non-convex Stochastic Optimization under Relaxed Smoothness
Machine Learning (CS)
Shows limits on how fast computers can possibly learn for some problems.
Provably Convergent Decentralized Optimization over Directed Graphs under Generalized Smoothness
Optimization and Control
Helps networks of computers learn together reliably without a central coordinator.
Decentralized Stochastic Nonconvex Optimization under the Relaxed Smoothness
Optimization and Control
Helps many computers work together to solve problems.