Distributed Bilevel Optimization with Dual Pruning for Resource-limited Clients

Published: December 31, 2025 | arXiv ID: 2512.24667v1

By: Mingyi Li, Xiao Zhang, Ruisheng Zheng, and more

Potential Business Impact:

Lets phones train AI without needing supercomputers.

Business Areas:
A/B Testing, Data and Analytics

With the development of large-scale models, traditional distributed bilevel optimization algorithms cannot be applied directly on low-resource clients. The key reason lies in the excessive computation involved in optimizing both the lower- and upper-level functions. We therefore present the first resource-adaptive distributed bilevel optimization framework with a second-order-free hypergradient estimator, which allows each client to optimize a submodel adapted to its available resources. Due to the coupled influence of the partial outer parameters x and inner parameters y, it is challenging to theoretically analyze the upper bound of the globally averaged hypergradient for the full model parameters. The error bound on the inner parameters also needs to be reformulated to account for local partial training. The provable theorems show that both of our methods, RABO and RAFBO, achieve an asymptotically optimal convergence rate of $O(1/\sqrt{C_x^{\ast}Q})$, which is dominated by the minimum coverage of the outer parameter, $C_x^{\ast}$. Extensive experiments on two different tasks demonstrate the effectiveness and computational efficiency of our proposed methods.
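
The abstract does not spell out how a second-order-free hypergradient estimator works, so below is a minimal, self-contained sketch of the standard idea such estimators build on: replacing the Hessian-vector products in the implicit-function hypergradient with finite differences of first-order gradients. The toy objectives, function names, and step sizes are all illustrative assumptions for a scalar problem, not the authors' RABO/RAFBO algorithms.

```python
# Toy bilevel problem (scalar, for illustration only):
#   outer: f(x, y) = (x - 1)^2 + x*y + y^2
#   inner: g(x, y) = (y - x)^2, so y*(x) = x and dy*/dx = 1.
# Analytic hypergradient at y = y*(x):
#   d/dx f(x, y*(x)) = 2(x - 1) + y + (x + 2y) * 1 = 6x - 2.

def grad_f_x(x, y): return 2.0 * (x - 1.0) + y   # outer gradient w.r.t. x
def grad_f_y(x, y): return x + 2.0 * y           # outer gradient w.r.t. y
def grad_g_y(x, y): return 2.0 * (y - x)         # inner gradient w.r.t. y
def grad_g_x(x, y): return -2.0 * (y - x)        # inner gradient w.r.t. x

def hypergradient_fd(x, y0, inner_lr=0.4, K=100, delta=1e-4, solver_iters=20):
    # 1) Approximate y*(x) with K steps of inner gradient descent.
    y = y0
    for _ in range(K):
        y -= inner_lr * grad_g_y(x, y)

    # 2) Solve [grad^2_yy g] q = grad_y f with a damped fixed-point iteration,
    #    using finite differences of grad_y g in place of Hessian-vector products:
    #    H v ~= (grad_y g(x, y + d*v) - grad_y g(x, y - d*v)) / (2d).
    b = grad_f_y(x, y)
    q = 0.0
    for _ in range(solver_iters):
        hvp = (grad_g_y(x, y + delta * q) - grad_g_y(x, y - delta * q)) / (2 * delta)
        q = q + 0.5 * (b - hvp)

    # 3) Cross term grad^2_xy g . q, again via first-order gradients only.
    cross = (grad_g_x(x, y + delta * q) - grad_g_x(x, y - delta * q)) / (2 * delta)

    # 4) Implicit-function hypergradient:
    #    grad_x f - grad^2_xy g [grad^2_yy g]^{-1} grad_y f.
    return grad_f_x(x, y) - cross

x = 0.7
print("estimate:", hypergradient_fd(x, y0=0.0))  # ~2.2
print("analytic:", 6 * x - 2)                    #  2.2
```

On this quadratic toy problem the estimate matches the analytic value 6x - 2, which is easy to verify by hand; no second derivatives are ever formed, which is what makes this family of estimators attractive for resource-limited clients.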

Country of Origin
🇨🇳 China

Page Count
14 pages

Category
Computer Science:
Distributed, Parallel, and Cluster Computing