Efficiently Escaping Saddle Points under Generalized Smoothness via Self-Bounding Regularity
By: Daniel Yiming Cao, August Y. Chen, Karthik Sridharan, and more
Potential Business Impact:
Helps computers find better solutions to hard optimization problems faster.
We study the optimization of non-convex functions that are not necessarily smooth (i.e., whose gradient and/or Hessian need not be Lipschitz) using first order methods. Smoothness is a restrictive assumption in machine learning, in both theory and practice, which has motivated significant recent work on using first order methods to find first order stationary points of functions satisfying generalizations of smoothness. We develop a novel framework that lets us systematically study the convergence of a large class of first order optimization algorithms (which we call decrease procedures) under generalizations of smoothness. We instantiate our framework to analyze the convergence of first order optimization algorithms to first and second order stationary points under generalizations of smoothness. As a consequence, we establish the first convergence guarantees for first order methods to second order stationary points under generalizations of smoothness. We demonstrate that several canonical examples fall under our framework and highlight practical implications.
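The abstract does not spell out a particular algorithm or a particular generalization of smoothness. As one concrete illustration, the sketch below runs a standard perturbed, clipped gradient descent under (L0, L1)-smoothness (the canonical generalization in which the Hessian norm may grow with the gradient norm, ||Hess f(x)|| <= L0 + L1 ||grad f(x)||). The function name `perturbed_clipped_gd`, the constants, and the perturbation rule are illustrative assumptions, not the paper's method.

```python
import numpy as np

def perturbed_clipped_gd(grad, x0, L0=1.0, L1=1.0, eps=1e-3,
                         radius=1e-3, steps=10_000, seed=0):
    """Illustrative 'decrease procedure': gradient descent with a
    clipped step size suited to (L0, L1)-smoothness, plus a small
    random perturbation whenever the gradient is nearly zero, so
    the iterate can move off a saddle point.

    `grad`: callable returning the gradient of f at a point.
    All constants are placeholders, not the paper's tuning.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        g = grad(x)
        gnorm = np.linalg.norm(g)
        if gnorm <= eps:
            # Near a first order stationary point: perturb and continue,
            # hoping to escape if it is a saddle rather than a local minimum.
            x = x + radius * rng.standard_normal(x.shape)
            continue
        # Clipped step size: roughly 1/L0 for small gradients and
        # 1/(L1 * ||g||) for large ones, the usual fix when the
        # Hessian norm can grow with the gradient norm.
        eta = 1.0 / (L0 + L1 * gnorm)
        x = x - eta * g
    return x

# Example: f(x, y) = x^2 - y^2 + y^4/4 has a saddle at the origin
# and minima at y = +/- sqrt(2); starting exactly at the saddle,
# the perturbation lets the iterate escape toward a minimum.
saddle_grad = lambda x: np.array([2 * x[0], -2 * x[1] + x[1] ** 3])
print(perturbed_clipped_gd(saddle_grad, x0=[0.0, 0.0]))
```

The clipped step size 1/(L0 + L1 ||g||) is the usual choice that keeps a descent-lemma-style argument valid when the Hessian norm scales with the gradient norm, which is why this kind of rule appears throughout the generalized-smoothness literature.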
Similar Papers
Escaping Saddle Points via Curvature-Calibrated Perturbations: A Complete Analysis with Explicit Constants and Empirical Validation
Machine Learning (CS)
Helps computers find the best answers faster.
The Adaptive Complexity of Finding a Stationary Point
Optimization and Control
Makes computer learning faster with more parallel work.
Gradient-Normalized Smoothness for Optimization with Approximate Hessians
Optimization and Control
Makes computer learning faster and more reliable.