AutoGD: Automatic Learning Rate Selection for Gradient Descent
By: Nikola Surjanovic, Alexandre Bouchard-Côté, Trevor Campbell
Potential Business Impact:
**Computer learns best speed for tasks automatically.**
The performance of gradient-based optimization methods, such as standard gradient descent (GD), depends heavily on the choice of learning rate. However, selecting an appropriate learning rate schedule can require a non-trivial amount of user tuning effort. When such methods appear as inner loops of other algorithms, expecting the user to tune the learning rates may be impractical. To address this, we introduce AutoGD: a gradient descent method that automatically determines whether to increase or decrease the learning rate at a given iteration. We establish the convergence of AutoGD and show that it recovers the optimal rate of GD (up to a constant) for a broad class of functions, without knowledge of smoothness constants. Experiments on a variety of traditional problems and variational inference optimization tasks demonstrate strong performance of the method, along with its extensions AutoBFGS and AutoLBFGS.
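
The abstract's core mechanism, deciding at each iteration whether to raise or lower the learning rate based on observed progress, can be illustrated with a minimal sketch. The multiplicative increase/decrease rule and the factors `up` and `down` below are assumptions for illustration only; the paper's actual AutoGD criterion and its convergence analysis are not reproduced here.

```python
import numpy as np

def auto_gd(f, grad_f, x0, lr0=1.0, up=2.0, down=0.5, max_iter=1000, tol=1e-8):
    """Gradient descent with an automatic increase/decrease learning-rate rule.

    Illustrative sketch of the general idea only, not the paper's exact AutoGD
    update: at every iteration we optimistically try a larger step, and fall
    back to smaller steps whenever the objective fails to decrease.
    """
    x, lr = np.asarray(x0, dtype=float), lr0
    fx = f(x)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:
            break
        lr *= up                      # tentatively increase the learning rate
        while True:
            x_new = x - lr * g        # candidate gradient step
            f_new = f(x_new)
            if f_new < fx:            # accept: the step made progress
                break
            lr *= down                # reject: decrease the learning rate and retry
            if lr < 1e-16:            # safeguard against an endless shrink loop
                return x
        x, fx = x_new, f_new
    return x

# Example: minimize a poorly scaled quadratic without hand-tuning the step size.
A = np.diag([1.0, 100.0])
f = lambda x: 0.5 * x @ A @ x
grad_f = lambda x: A @ x
x_star = auto_gd(f, grad_f, x0=[3.0, -2.0])
```

The same idea of wrapping the step-size decision around an existing update is what the abstract suggests for the AutoBFGS and AutoLBFGS extensions, with the quasi-Newton direction taking the place of the raw gradient.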
Similar Papers
Gradient Descent with Provably Tuned Learning-rate Schedules
Machine Learning (CS)
Teaches computers to learn better, even when tricky.
Arc Gradient Descent: A Mathematically Derived Reformulation of Gradient Descent with Phase-Aware, User-Controlled Step Dynamics
Machine Learning (CS)
Makes computer learning find better answers faster.
Optimizing Optimizers for Fast Gradient-Based Learning
Machine Learning (CS)
Creates better computer learning by designing smarter math tools.