Distributed Online Convex Optimization with Efficient Communication: Improved Algorithm and Lower bounds
By: Sifan Yang , Wenhao Yang , Wei Jiang and more
Potential Business Impact:
Helps computers learn together faster and better.
We investigate distributed online convex optimization with compressed communication, where $n$ learners connected by a network collaboratively minimize a sequence of global loss functions using only local information and compressed data from neighbors. Prior work has established regret bounds of $O(\max\{ω^{-2}ρ^{-4}n^{1/2},ω^{-4}ρ^{-8}\}n\sqrt{T})$ and $O(\max\{ω^{-2}ρ^{-4}n^{1/2},ω^{-4}ρ^{-8}\}n\ln{T})$ for convex and strongly convex functions, respectively, where $ω\in(0,1]$ is the compression quality factor ($ω=1$ means no compression) and $ρ<1$ is the spectral gap of the communication matrix. However, these regret bounds suffer from a \emph{quadratic} or even \emph{quartic} dependence on $ω^{-1}$. Moreover, the \emph{super-linear} dependence on $n$ is also undesirable. To overcome these limitations, we propose a novel algorithm that achieves improved regret bounds of $\tilde{O}(ω^{-1/2}ρ^{-1}n\sqrt{T})$ and $\tilde{O}(ω^{-1}ρ^{-2}n\ln{T})$ for convex and strongly convex functions, respectively. The primary idea is to design a \emph{two-level blocking update framework} incorporating two novel ingredients: an online gossip strategy and an error compensation scheme, which collaborate to \emph{achieve a better consensus} among learners. Furthermore, we establish the first lower bounds for this problem, justifying the optimality of our results with respect to both $ω$ and $T$. Additionally, we consider the bandit feedback scenario, and extend our method with the classic gradient estimators to enhance existing regret bounds.
Similar Papers
Communication-Efficient Distributed Online Nonconvex Optimization with Time-Varying Constraints
Optimization and Control
Helps robots learn tasks with less data.
Characterizing the Accuracy-Communication-Privacy Trade-off in Distributed Stochastic Convex Optimization
Machine Learning (CS)
Keeps data private while learning from many computers.
Decentralized Projection-free Online Upper-Linearizable Optimization with Applications to DR-Submodular Optimization
Optimization and Control
Helps computers learn without needing all the answers.