Federated Learning Framework for Scalable AI in Heterogeneous HPC and Cloud Environments
By: Sangam Ghimire, Paribartan Timalsina, Nirjal Bhurtel, and others
Potential Business Impact:
Trains AI on many computers without sharing private data.
As the demand grows for scalable and privacy-aware AI systems, Federated Learning (FL) has emerged as a promising solution, allowing decentralized model training without moving raw data. At the same time, the combination of high-performance computing (HPC) and cloud infrastructure offers vast computing power but introduces new complexities, especially when dealing with heterogeneous hardware, communication limits, and non-uniform data. In this work, we present a federated learning framework built to run efficiently across mixed HPC and cloud environments. Our system addresses key challenges such as system heterogeneity, communication overhead, and resource scheduling, while maintaining model accuracy and data privacy. Through experiments on a hybrid testbed, we demonstrate strong performance in terms of scalability, fault tolerance, and convergence, even under non-Independent and Identically Distributed (non-IID) data distributions and varied hardware. These results highlight the potential of federated learning as a practical approach to building scalable Artificial Intelligence (AI) systems in modern, distributed computing settings.
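The abstract does not specify the aggregation scheme the framework uses, but the core idea of decentralized training without moving raw data is commonly realized with federated averaging (FedAvg): each client trains locally, and a server averages the parameter updates weighted by local dataset size. A minimal sketch, assuming simple parameter vectors (function and variable names here are illustrative, not from the paper):

```python
# Hypothetical sketch of a FedAvg-style aggregation step; the paper's
# actual aggregation and scheduling logic is not described in the abstract.
from typing import List

def fed_avg(client_params: List[List[float]], client_sizes: List[int]) -> List[float]:
    """Average client parameter vectors, weighted by local dataset size.

    Weighting by dataset size matters under non-IID data: clients holding
    more samples contribute proportionally more to the global model.
    """
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(params[i] * n for params, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]

# Two clients with different data volumes and divergent local updates.
global_params = fed_avg([[1.0, 0.0], [3.0, 4.0]], [1, 3])
print(global_params)  # → [2.5, 3.0]
```

In a heterogeneous HPC/cloud deployment, this aggregation step is where communication overhead and stragglers surface, since the server must collect updates from clients running on very different hardware.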
Similar Papers
Experiences Building Enterprise-Level Privacy-Preserving Federated Learning to Power AI for Science
Distributed, Parallel, and Cluster Computing
Lets AI learn from private data safely.
Federated Learning Survey: A Multi-Level Taxonomy of Aggregation Techniques, Experimental Insights, and Future Frontiers
Machine Learning (CS)
Lets computers learn together without sharing secrets.
Federated Learning: A Survey on Privacy-Preserving Collaborative Intelligence
Machine Learning (CS)
Trains computers together without sharing private info.