Distributed Perceptron under Bounded Staleness, Partial Participation, and Noisy Communication
By: Keval Jain, Anant Raj, Saurav Prakash, and more
Potential Business Impact:
**Helps computers learn faster with slow connections.**
We study a semi-asynchronous client-server perceptron trained via iterative parameter mixing (IPM-style averaging): clients run local perceptron updates and a server forms a global model by aggregating the updates that arrive in each communication round. The setting captures three system effects in federated and distributed deployments: (i) stale updates due to delayed model delivery and delayed application of client computations (two-sided version lag), (ii) partial participation (intermittent client availability), and (iii) imperfect communication on both downlink and uplink, modeled as effective zero-mean additive noise with bounded second moment. We introduce a server-side aggregation rule called staleness-bucket aggregation with padding that deterministically enforces a prescribed staleness profile over update ages without assuming any stochastic model for delays or participation. Under margin separability and bounded data radius, we prove a finite-horizon expected bound on the cumulative weighted number of perceptron mistakes over a given number of server rounds: the impact of delay enters only through the mean enforced staleness, while communication noise contributes an additional term that scales with the total noise energy and grows on the order of the square root of the horizon. In the noiseless case, we show how a finite expected mistake budget yields an explicit finite-round stabilization bound under a mild fresh-participation condition.
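The abstract describes the aggregation rule only at a high level. Below is a minimal Python sketch of one way an IPM-style server round with staleness-bucket aggregation and padding could look; the function names, the `profile` weights, the pad-empty-buckets-with-zero-updates rule, and the local-update accumulation are illustrative assumptions, not the paper's actual construction.

```python
import numpy as np

def local_perceptron_update(w, X, y, passes=1):
    """One client's local perceptron pass, starting from the (possibly stale)
    global model w it last received. Returns the accumulated update direction
    and the number of mistakes made (the quantity the paper's bound counts)."""
    w = w.copy()
    delta = np.zeros_like(w)
    mistakes = 0
    for _ in range(passes):
        for x, label in zip(X, y):
            if label * np.dot(w, x) <= 0:   # perceptron mistake on (x, label)
                w += label * x
                delta += label * x
                mistakes += 1
    return delta, mistakes

def staleness_bucket_aggregate(w_global, arrivals, profile):
    """Sketch of staleness-bucket aggregation with padding (assumed form).

    arrivals: dict mapping staleness age s -> list of update vectors that were
              computed against the global model from s rounds ago.
    profile:  dict mapping age s -> prescribed weight q_s (nonnegative, summing
              to 1); this is the staleness profile the server enforces.

    Buckets with no arrivals are padded with a zero update, so the weighted age
    distribution applied to the model matches `profile` deterministically,
    without any stochastic assumption on delays or participation.
    """
    mixed = np.zeros_like(w_global)
    for age, weight in profile.items():
        bucket = arrivals.get(age, [])
        bucket_mean = np.mean(bucket, axis=0) if bucket else np.zeros_like(w_global)
        mixed += weight * bucket_mean
    return w_global + mixed

# Example of one server round under this sketch (dimensions and data made up):
rng = np.random.default_rng(0)
w = np.zeros(5)
X1, y1 = rng.normal(size=(8, 5)), rng.choice([-1.0, 1.0], size=8)
delta, m = local_perceptron_update(w, X1, y1)
w = staleness_bucket_aggregate(w, arrivals={0: [delta]}, profile={0: 0.7, 1: 0.3})
```

In a full simulation of the setting, a server round would call `local_perceptron_update` on each participating client against the model version it last received, bucket the returned updates by their age on arrival, and apply `staleness_bucket_aggregate`; the noisy-communication effect from the abstract could be modeled by adding zero-mean noise to the broadcast model and to each arriving update.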
Similar Papers
- **Delayed Momentum Aggregation: Communication-efficient Byzantine-robust Federated Learning with Partial Participation** (Machine Learning (CS)): Keeps AI learning safe from bad data.
- **Optimizing Asynchronous Federated Learning: A Delicate Trade-Off Between Model-Parameter Staleness and Update Frequency** (Machine Learning (CS)): Makes AI learn faster from many computers.
- **One-Shot Federated Ridge Regression: Exact Recovery via Sufficient Statistic Aggregation** (Machine Learning (CS)): Lets computers learn from data in one go.