L-SR1: Learned Symmetric-Rank-One Preconditioning
By: Gal Lifshitz, Shahar Zuler, Ori Fouks, and more
Potential Business Impact:
Teaches computers to learn faster without lots of examples.
End-to-end deep learning has achieved impressive results but remains limited by its reliance on large labeled datasets, poor generalization to unseen scenarios, and growing computational demands. In contrast, classical optimization methods are data-efficient and lightweight but often suffer from slow convergence. While learned optimizers offer a promising fusion of both worlds, most focus on first-order methods, leaving learned second-order approaches largely unexplored. We propose a novel learned second-order optimizer that introduces a trainable preconditioning unit to enhance the classical Symmetric-Rank-One (SR1) algorithm. This unit generates data-driven vectors used to construct positive semi-definite rank-one matrices, aligned with the secant constraint via a learned projection. Our method is evaluated through analytic experiments and on the real-world task of Monocular Human Mesh Recovery (HMR), where it outperforms existing learned optimization-based approaches. Featuring a lightweight model and requiring no annotated data or fine-tuning, our approach offers strong generalization and is well-suited for integration into broader optimization-based frameworks.
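To make the mechanism in the abstract concrete, below is a minimal sketch of a classical SR1 update alongside a hypothetical trainable rank-one unit. The names (`sr1_update`, `LearnedRankOneUnit`), the MLP architecture, and the scalar projection used to push the update toward the secant condition are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

def sr1_update(B, s, y, eps=1e-8):
    """Classical Symmetric-Rank-One update of the Hessian approximation B.

    Enforces the secant condition B_new @ s = y via the rank-one correction
    B + (r r^T) / (r^T s), where r = y - B s, s is the step and y the
    gradient difference.
    """
    r = y - B @ s
    denom = r @ s
    if torch.abs(denom) < eps * torch.norm(r) * torch.norm(s):
        return B  # skip the update when the denominator is numerically unsafe
    return B + torch.outer(r, r) / denom

class LearnedRankOneUnit(nn.Module):
    """Hypothetical sketch of a trainable preconditioning unit.

    A small MLP maps the curvature pair (s, y) to a data-driven vector u;
    the term u u^T is rank-one and positive semi-definite by construction,
    and a scalar projection rescales it so the corrected matrix moves toward
    the secant constraint B s = y along the step direction s.
    """
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, hidden), nn.ReLU(), nn.Linear(hidden, dim)
        )

    def forward(self, B, s, y, eps=1e-8):
        u = self.net(torch.cat([s, y]))            # data-driven vector
        target = (y - B @ s) @ s                   # secant residual projected onto s
        scale = torch.clamp(target / ((u @ s) ** 2 + eps), min=0.0)  # keep the correction PSD
        return B + scale * torch.outer(u, u)       # learned rank-one correction
```

The sketch only illustrates the structure suggested by the abstract: a learned vector feeding a positive semi-definite rank-one correction that is projected toward the secant condition; how the unit is trained and plugged into downstream tasks such as HMR is described in the paper itself.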
Similar Papers
The Potential of Second-Order Optimization for LLMs: A Study with Full Gauss-Newton
Machine Learning (CS)
Trains AI models much faster using smart math.
Towards Guided Descent: Optimization Algorithms for Training Neural Networks At Scale
Machine Learning (CS)
Makes AI learn faster and understand itself better.
Turbo-Muon: Accelerating Orthogonality-Based Optimization with Pre-Conditioning
Artificial Intelligence
Makes computer training faster and better.