Stochastic Bilevel Optimization with Heavy-Tailed Noise

Published: September 18, 2025 | arXiv ID: 2509.14952v1

By: Zhuanghua Liu, Luo Luo

Potential Business Impact:

Helps machine-learning systems train reliably when the data produces noisy, heavy-tailed gradients.

Business Areas:
A/B Testing, Data and Analytics

This paper considers smooth bilevel optimization in which the lower-level problem is strongly convex and the upper-level problem is possibly nonconvex. We focus on the stochastic setting in which the algorithm can access unbiased stochastic gradient evaluations with heavy-tailed noise, which is prevalent in many machine learning applications such as training large language models and reinforcement learning. We propose a nested-loop normalized stochastic bilevel approximation (N$^2$SBA) method for finding an $\epsilon$-stationary point with a stochastic first-order oracle (SFO) complexity of $\tilde{\mathcal{O}}\big(\kappa^{\frac{7p-3}{p-1}} \sigma^{\frac{p}{p-1}} \epsilon^{-\frac{4p-2}{p-1}}\big)$, where $\kappa$ is the condition number, $p\in(1,2]$ is the order of the central moment of the noise, and $\sigma$ is the noise level. Furthermore, we specialize our approach to the nonconvex-strongly-concave minimax optimization problem, achieving an $\epsilon$-stationary point with an SFO complexity of $\tilde{\mathcal O}\big(\kappa^{\frac{2p-1}{p-1}} \sigma^{\frac{p}{p-1}} \epsilon^{-\frac{3p-2}{p-1}}\big)$. All of the above upper bounds match the best-known results in the special case of the bounded-variance setting, i.e., $p=2$.
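As a quick sanity check on the final claim, substituting $p=2$ (bounded variance) into the two bounds is pure arithmetic and recovers the bounded-variance rates the abstract refers to:

$$
\tilde{\mathcal{O}}\big(\kappa^{\frac{7p-3}{p-1}} \sigma^{\frac{p}{p-1}} \epsilon^{-\frac{4p-2}{p-1}}\big)\Big|_{p=2} = \tilde{\mathcal{O}}\big(\kappa^{11}\sigma^{2}\epsilon^{-6}\big),
\qquad
\tilde{\mathcal{O}}\big(\kappa^{\frac{2p-1}{p-1}} \sigma^{\frac{p}{p-1}} \epsilon^{-\frac{3p-2}{p-1}}\big)\Big|_{p=2} = \tilde{\mathcal{O}}\big(\kappa^{3}\sigma^{2}\epsilon^{-4}\big).
$$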
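The abstract gives no pseudocode, but the method's name describes its structure: an inner loop approximately solves the strongly convex lower-level problem, and the outer loop takes normalized stochastic hypergradient steps, normalization being the standard device for keeping updates bounded when the noise only has a finite $p$-th moment for some $p\in(1,2]$. The sketch below illustrates that nested-loop, normalized structure on a hypothetical toy quadratic bilevel problem; it is not the authors' N$^2$SBA, and the noise model, step sizes, and problem instance are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def heavy_tailed_noise(shape, alpha=1.8):
    # Symmetrized Pareto noise (assumed model): zero mean, finite p-th
    # central moment only for p < alpha, so alpha <= 2 mimics the
    # infinite-variance regime the paper targets.
    signs = rng.choice([-1.0, 1.0], size=shape)
    return signs * rng.pareto(alpha, size=shape)

# Hypothetical toy instance:
#   lower level:  y*(x) = argmin_y 0.5 * ||y - A x||^2   (strongly convex in y)
#   upper level:  F(x)  = 0.5 * ||x - y*(x)||^2
A = np.array([[2.0, 0.3],
              [0.1, 1.5]])

def inner_loop(x, y, steps=20, lr=0.1):
    # Approximately solve the lower-level problem from a warm start,
    # using stochastic gradients corrupted by heavy-tailed noise.
    for _ in range(steps):
        g = (y - A @ x) + 0.1 * heavy_tailed_noise(y.shape)
        y = y - lr * g
    return y

def outer_step(x, y, lr=0.05):
    # Stochastic hypergradient for the toy objective: since y*(x) = A x,
    # the chain rule gives dF/dx = (I - A)^T (x - y*(x)).
    g = (x - y) - A.T @ (x - y) + 0.1 * heavy_tailed_noise(x.shape)
    # Normalization is the key step: it bounds the update so a single
    # heavy-tailed gradient sample cannot derail the iterate.
    return x - lr * g / max(np.linalg.norm(g), 1e-12)

x = rng.standard_normal(2)
y = rng.standard_normal(2)
for _ in range(300):
    y = inner_loop(x, y)   # nested loop: refresh the lower-level solution
    x = outer_step(x, y)   # normalized outer update
print("approximate stationary point:", x)
```

On this toy instance the unique stationary point is $x = 0$; with a constant step size the normalized iterates settle into a small neighborhood of it despite the infinite-variance noise, which is the qualitative behavior the normalization is meant to deliver.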

Page Count
40 pages

Category
Computer Science:
Machine Learning (CS)