Constant Stepsize Local GD for Logistic Regression: Acceleration by Instability
By: Michael Crawshaw, Blake Woodworth, Mingrui Liu
Potential Business Impact:
Lets computers learn faster with uneven data.
Existing analyses of Local (Stochastic) Gradient Descent for heterogeneous objectives require stepsizes $\eta \leq 1/K$, where $K$ is the communication interval, which ensures monotonic decrease of the objective. In contrast, we analyze Local Gradient Descent for logistic regression with separable, heterogeneous data using any stepsize $\eta > 0$. With $R$ communication rounds and $M$ clients, we show convergence at a rate $\mathcal{O}(1/(\eta K R))$ after an initial unstable phase lasting $\widetilde{\mathcal{O}}(\eta K M)$ rounds. This improves upon the existing $\mathcal{O}(1/R)$ rate for general smooth, convex objectives. Our analysis parallels the single-machine analysis of \cite{wu2024large}, in which instability is caused by extremely large stepsizes; in our setting, large local updates on heterogeneous objectives are an additional source of instability.
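To make the setup concrete, here is a minimal NumPy sketch of constant-stepsize Local Gradient Descent for logistic regression on separable, heterogeneous data, showing how the stepsize $\eta$, communication interval $K$, number of rounds $R$, and number of clients $M$ enter the algorithm. The function names, toy data, and parameter values below are illustrative assumptions, not the paper's code or experiments.

```python
import numpy as np

def logistic_grad(w, X, y):
    # Gradient of the average logistic loss (1/n) * sum_i log(1 + exp(-y_i <x_i, w>)),
    # with labels y_i in {-1, +1}. Margins are clipped only to avoid overflow in exp.
    margins = np.clip(y * (X @ w), -50.0, 50.0)
    coeffs = y / (1.0 + np.exp(margins))
    return -(X.T @ coeffs) / len(y)

def local_gd(client_data, eta, K, R):
    """Local GD: each of the M clients runs K local gradient steps with a constant
    stepsize eta (not capped at 1/K), then the server averages the iterates; repeat
    for R communication rounds."""
    d = client_data[0][0].shape[1]
    w = np.zeros(d)
    for _ in range(R):
        local_models = []
        for X, y in client_data:              # one (X, y) dataset per client
            w_m = w.copy()
            for _ in range(K):                # K local steps between communications
                w_m -= eta * logistic_grad(w_m, X, y)
            local_models.append(w_m)
        w = np.mean(local_models, axis=0)     # communication / averaging step
    return w

# Toy heterogeneous but globally separable data for M = 2 clients:
# the label is determined by the sign of the first feature, while the
# second feature's distribution differs across clients (heterogeneity).
rng = np.random.default_rng(0)
clients = []
for offset in (+2.0, -2.0):
    Xp = np.column_stack([rng.uniform(1, 3, 20), rng.normal(offset, 0.5, 20)])
    Xn = np.column_stack([rng.uniform(-3, -1, 20), rng.normal(offset, 0.5, 20)])
    X = np.vstack([Xp, Xn])
    y = np.concatenate([np.ones(20), -np.ones(20)])
    clients.append((X, y))

# Large constant stepsize with eta * K >> 1, the regime studied in the paper.
w = local_gd(clients, eta=5.0, K=10, R=50)
print(w)
```

In this sketch, taking `eta * K` large makes the local updates overshoot early on (the unstable phase), after which the averaged iterate settles and the loss decreases at the faster $\mathcal{O}(1/(\eta K R))$ rate described above.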
Similar Papers
Large Stepsizes Accelerate Gradient Descent for Regularized Logistic Regression
Machine Learning (Stat)
Faster computer learning with big steps.
Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes
Machine Learning (Stat)
Teaches computers to learn faster and better.
Convergence Rates for Gradient Descent on the Edge of Stability in Overparametrised Least Squares
Machine Learning (CS)
Helps computers learn faster by finding better solutions.