Fault-Tolerant Decentralized Distributed Asynchronous Federated Learning with Adaptive Termination Detection

Published: September 2, 2025 | arXiv ID: 2509.02186v1

By: Phani Sahasra Akkinepally, Manaswini Piduguralla, Sushant Joshi and more

Potential Business Impact:

Lets computers learn together without sharing private data.

Business Areas:

Peer to Peer Collaboration

Federated Learning (FL) enables collaborative model training across distributed clients while keeping each client's data local. Traditionally, FL relies on a centralized server to coordinate learning, which creates a bottleneck and a single point of failure. Decentralized FL architectures eliminate the central server and can operate in either synchronous or asynchronous mode. Synchronous FL requires all clients to compute updates and wait for one another before aggregation, which guarantees consistency but often suffers from delays caused by slower participants. Asynchronous FL addresses this by allowing clients to update independently, offering better scalability and responsiveness in heterogeneous environments. Our research develops an asynchronous decentralized FL approach in two progressive phases. (a) In Phase 1, we develop an asynchronous FL framework that enables clients to learn and update independently, removing the need for strict synchronization. (b) In Phase 2, we extend this framework with fault-tolerance mechanisms that handle client failures and message drops, ensuring robust performance even under unpredictable conditions. As a central contribution, we propose Client-Confident Convergence and Client-Responsive Termination, novel techniques that let each client autonomously determine an appropriate termination point. These methods ensure that all active clients conclude meaningfully and efficiently, maintaining reliable convergence despite the challenges of asynchronous communication and faults.
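The abstract does not spell out how Client-Confident Convergence or Client-Responsive Termination work, so the following is not the paper's algorithm. It is only a minimal Python sketch of the general idea of serverless, asynchronous updates with a purely local stopping rule. Everything in it is an assumption made for illustration: the scalar "model", the pairwise averaging step, the `tol` and `patience` parameters, and the names `Client` and `simulate`.

```python
import random


class Client:
    """Hypothetical client; a single float stands in for model weights."""

    def __init__(self, cid, value, tol=1e-3, patience=3):
        self.cid = cid
        self.model = value
        self.tol = tol            # assumed threshold for "the model barely moved"
        self.patience = patience  # assumed number of consecutive stable merges
        self.stable = 0
        self.done = False

    def receive(self, peer_model):
        """Merge a peer's model whenever it arrives; there is no round barrier."""
        if self.done:
            return  # a terminated client no longer updates
        new_model = 0.5 * (self.model + peer_model)
        change = abs(new_model - self.model)
        self.model = new_model
        # Local stopping rule (an assumption, not the paper's rule):
        # stop after `patience` consecutive merges that each move the model < tol.
        self.stable = self.stable + 1 if change < self.tol else 0
        if self.stable >= self.patience:
            self.done = True


def simulate(num_clients=5, crashed=(4,), drop_prob=0.2, max_steps=10_000, seed=0):
    """Random pairwise exchanges with crashed clients and dropped messages."""
    rng = random.Random(seed)
    clients = [Client(i, float(i)) for i in range(num_clients)]
    active = [c for c in clients if c.cid not in crashed]  # crashed clients never take part
    for _ in range(max_steps):
        if all(c.done for c in active):
            break
        sender, receiver = rng.sample(active, 2)
        if rng.random() < drop_prob:
            continue  # message lost in transit; the receiver simply hears nothing
        receiver.receive(sender.model)
    return active


if __name__ == "__main__":
    for c in simulate():
        print(c.cid, round(c.model, 4), c.done)
```

With the defaults, client 4 is treated as crashed and never participates; each remaining client either stops on its own once its model stabilizes or the run ends at `max_steps`. Because every merge is an average of two existing models, the surviving models always stay within the range of the initial values.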

Robust Federated Learning under Adversarial Attacks via Loss-Based Client Clustering

Machine Learning (CS)

Protects smart learning from bad data.

18 Aug 2025 1

90%

Robust Federated Learning under Adversarial Attacks via Loss-Based Client Clustering

Machine Learning (CS)

Keeps AI learning safe from bad data.

18 Aug 2025 0

89%

DAG-AFL: Directed Acyclic Graph-based Asynchronous Federated Learning

Machine Learning (CS)

Makes learning faster and better for many computers.

28 Jul 2025 1

Page Count

20 pages
