Score: 3

Balancing Gradient and Hessian Queries in Non-Convex Optimization

Published: October 23, 2025 | arXiv ID: 2510.20786v1

By: Deeksha Adil, Brian Bullins, Aaron Sidford and more

BigTech Affiliations: Stanford University

Potential Business Impact:

Finds best answers faster using fewer math steps.

Business Areas:
A/B Testing Data and Analytics

We develop optimization methods which offer new trade-offs between the number of gradient and Hessian computations needed to compute a critical point of a non-convex function. We provide a method that, for any twice-differentiable $f\colon \mathbb R^d \rightarrow \mathbb R$ with $L_2$-Lipschitz Hessian, an input initial point with $\Delta$-bounded sub-optimality, and sufficiently small $\epsilon > 0$, outputs an $\epsilon$-critical point, i.e., a point $x$ such that $\|\nabla f(x)\| \leq \epsilon$, using $\tilde{O}(L_2^{1/4} n_H^{-1/2}\Delta\epsilon^{-9/4})$ queries to a gradient oracle and $n_H$ queries to a Hessian oracle for any positive integer $n_H$. As a consequence, we obtain an improved gradient query complexity of $\tilde{O}(d^{1/3}L_2^{1/2}\Delta\epsilon^{-3/2})$ in the case of bounded dimension and of $\tilde{O}(L_2^{3/4}\Delta^{3/2}\epsilon^{-9/4})$ in the case where we are allowed only a single Hessian query. We obtain these results through a more general algorithm which can handle approximate Hessian computations and recovers the state-of-the-art bound of computing an $\epsilon$-critical point with $O(L_1^{1/2}L_2^{1/4}\Delta\epsilon^{-7/4})$ gradient queries, provided that $f$ also has an $L_1$-Lipschitz gradient.
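To make the trade-off concrete, the following is a minimal sketch that evaluates the leading term of the gradient-query bound as it is written in the abstract, with constants and logarithmic factors dropped, and checks the $\epsilon$-critical condition $\|\nabla f(x)\| \leq \epsilon$. The function names and the numeric values are hypothetical and chosen only for illustration; this is not the authors' algorithm.

```python
import math


def gradient_query_bound(L2, n_H, delta, eps):
    """Leading term L2^(1/4) * n_H^(-1/2) * delta * eps^(-9/4).

    Assumption: constants and log factors hidden by the O-tilde are ignored,
    so only ratios between values are meaningful.
    """
    if n_H < 1:
        raise ValueError("n_H must be a positive integer")
    return L2 ** 0.25 * n_H ** -0.5 * delta * eps ** -2.25


def is_eps_critical(grad, eps):
    """Return True when the Euclidean norm of the gradient is at most eps."""
    return math.sqrt(sum(g * g for g in grad)) <= eps


# Illustrative values only (not taken from the paper).
L2, delta, eps = 1.0, 1.0, 1e-2
for n_H in (4, 16, 64):
    print(n_H, gradient_query_bound(L2, n_H, delta, eps))
```

Because the bound scales as $n_H^{-1/2}$, each fourfold increase in the Hessian budget halves the leading term of the gradient-query count in this sketch.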

Country of Origin
🇨🇭 🇺🇸 United States, Switzerland

Page Count
49 pages

Category
Mathematics:
Optimization and Control