HFedMoE: Resource-aware Heterogeneous Federated Learning with Mixture-of-Experts
By: Zihan Fang, Zheng Lin, Senkang Hu, and more
Potential Business Impact:
Trains smart AI on phones without sharing private data.
While federated learning (FL) enables fine-tuning of large language models (LLMs) without compromising data privacy, the substantial size of an LLM renders on-device training impractical for resource-constrained clients, such as mobile devices. Thus, Mixture-of-Experts (MoE) models have emerged as a computation-efficient solution, which activates only a sparse subset of experts during model training to reduce the computing burden without sacrificing performance. Though integrating MoE into FL fine-tuning holds significant potential, it still encounters three key challenges: i) selecting appropriate experts for clients remains challenging due to the lack of a reliable metric to measure each expert's impact on local fine-tuning performance, ii) the heterogeneous computing resources across clients severely hinder MoE-based LLM fine-tuning, as dynamic expert activations across diverse input samples can overwhelm resource-constrained devices, and iii) client-specific expert subsets and routing preferences undermine global aggregation, where misaligned expert updates and inconsistent gating networks introduce destructive interference. To address these challenges, we propose HFedMoE, a heterogeneous MoE-based FL fine-tuning framework that customizes a subset of experts to each client for computation-efficient LLM fine-tuning. Specifically, HFedMoE identifies each expert's importance based on its contribution to fine-tuning performance, and then adaptively selects a subset of experts from an information bottleneck perspective to align with each client's computing budget. A sparsity-aware model aggregation strategy is also designed to aggregate the actively fine-tuned experts and gating parameters with importance-weighted contributions. Extensive experiments demonstrate that HFedMoE outperforms state-of-the-art benchmarks in training accuracy and convergence speed.
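To make the two main ideas concrete, the following is a minimal sketch, not the paper's implementation: it assumes a simple greedy selection of the highest-importance experts under a per-client compute budget (standing in for the paper's information-bottleneck criterion) and an importance-weighted average over only the clients that actually fine-tuned each expert (standing in for the sparsity-aware aggregation). All names, such as select_experts and aggregate_experts, are illustrative.

```python
# Illustrative sketch of budget-constrained expert selection and
# sparsity-aware, importance-weighted aggregation (not the paper's code).
from typing import Dict, List
import numpy as np


def select_experts(importance: np.ndarray, cost: np.ndarray, budget: float) -> List[int]:
    """Greedily keep the most important experts whose total cost fits the client's budget."""
    order = np.argsort(-importance)          # most important experts first
    chosen, spent = [], 0.0
    for e in order:
        if spent + cost[e] <= budget:
            chosen.append(int(e))
            spent += cost[e]
    return chosen


def aggregate_experts(
    client_updates: List[Dict[int, np.ndarray]],   # per client: expert_id -> updated weights
    client_importance: List[Dict[int, float]],     # per client: expert_id -> importance score
) -> Dict[int, np.ndarray]:
    """Average each expert only over clients that trained it, weighted by its importance."""
    aggregated: Dict[int, np.ndarray] = {}
    trained_experts = {e for upd in client_updates for e in upd}
    for expert_id in trained_experts:
        weights, scores = [], []
        for upd, imp in zip(client_updates, client_importance):
            if expert_id in upd:                   # skip clients that froze this expert
                weights.append(upd[expert_id])
                scores.append(imp[expert_id])
        norm = np.asarray(scores) / np.sum(scores)
        aggregated[expert_id] = sum(w * s for w, s in zip(weights, norm))
    return aggregated
```

A greedy knapsack-style selection and weighted averaging are only stand-ins here; the paper's actual importance metric, information-bottleneck selection, and gating-parameter aggregation are described in the full text.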
Similar Papers
FLEX-MoE: Federated Mixture-of-Experts with Load-balanced Expert Assignment
Machine Learning (CS)
Helps AI learn better on phones with less data.
Federated Fine-Tuning of Sparsely-Activated Large Language Models on Resource-Constrained Devices
Distributed, Parallel, and Cluster Computing
Makes smart computer brains learn faster on weak computers.
FFT-MoE: Efficient Federated Fine-Tuning for Foundation Models via Large-scale Sparse MoE under Heterogeneous Edge
Machine Learning (CS)
Teaches AI to learn from many computers without sharing secrets.