Federated Learning with Layer Skipping: Efficient Training of Large Language Models for Healthcare NLP
By: Lihong Zhang, Yue Li
Potential Business Impact:
Trains AI for doctors without sharing patient info.
Federated learning (FL) enables collaborative model training across organizations without sharing raw data, addressing crucial privacy concerns in healthcare natural language processing (NLP). However, training large language models (LLMs) in federated settings faces significant challenges, including communication overhead and data heterogeneity. We propose Layer-Skipping Federated Learning, in which only selected layers of a pre-trained LLM are fine-tuned across clients while the remaining layers stay frozen. Applied to LLaMA 3.2-1B, our approach reduces communication costs by approximately 70% while maintaining performance within 2% of centralized training. We evaluate our method on clinical named entity recognition (NER) and classification tasks using the i2b2 and MIMIC-III datasets. Our experiments demonstrate that Layer-Skipping FL outperforms competitive baselines, handles non-IID clinical data distributions effectively, and remains robust when combined with differential privacy. This approach offers a practical solution for privacy-preserving collaborative learning in healthcare NLP.
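To make the core mechanism concrete, the sketch below simulates one round of layer-skipping federated averaging. It is a minimal illustration, not the authors' implementation: a small PyTorch Transformer encoder stands in for LLaMA 3.2-1B, the choice of trainable layers is arbitrary, and helper names such as build_client_model, trainable_state, and fedavg are hypothetical. Only the unfrozen layers are trained locally and sent to the server, which is where the communication savings come from.

```python
# Minimal sketch of layer-skipping federated averaging (illustrative, not the paper's code).
# A small Transformer encoder stands in for the LLM; any torch.nn.Module with named
# layers would work the same way.
import copy
import torch
import torch.nn as nn

def build_client_model(num_layers=8, d_model=64, trainable_layers=(6, 7)):
    """Build a stand-in Transformer and freeze every layer not selected for fine-tuning."""
    encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True)
    model = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
    for idx, layer in enumerate(model.layers):
        requires_grad = idx in trainable_layers
        for p in layer.parameters():
            p.requires_grad = requires_grad
    return model

def trainable_state(model):
    """Only the unfrozen parameters are communicated, shrinking each upload."""
    return {name: p.detach().clone()
            for name, p in model.named_parameters() if p.requires_grad}

def fedavg(client_states, client_sizes):
    """Weighted average (FedAvg) over the communicated layer parameters only."""
    total = sum(client_sizes)
    return {name: sum(s[name] * (n / total)
                      for s, n in zip(client_states, client_sizes))
            for name in client_states[0]}

if __name__ == "__main__":
    global_model = build_client_model()
    clients = [copy.deepcopy(global_model) for _ in range(3)]
    client_sizes = [120, 300, 80]  # e.g. number of clinical notes per site

    # One simulated round: each client takes a local gradient step on dummy data.
    for model in clients:
        opt = torch.optim.AdamW(
            (p for p in model.parameters() if p.requires_grad), lr=1e-3)
        x = torch.randn(4, 16, 64)       # (batch, seq_len, d_model) dummy batch
        loss = model(x).pow(2).mean()    # placeholder loss
        loss.backward()
        opt.step()

    # Server aggregates only the selected layers and writes them back into the global model.
    new_state = fedavg([trainable_state(m) for m in clients], client_sizes)
    global_model.load_state_dict(new_state, strict=False)

    sent = sum(v.numel() for v in new_state.values())
    total = sum(p.numel() for p in global_model.parameters())
    print(f"parameters communicated per client: {sent}/{total} ({sent/total:.1%})")
```

In a real deployment the averaged layers would be broadcast back to the clients for the next round, and the actual saving depends on which and how many layers are left unfrozen; the ~70% figure reported in the abstract reflects the authors' particular layer selection for LLaMA 3.2-1B.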
Similar Papers
Selective Attention Federated Learning: Improving Privacy and Efficiency for Clinical Text Classification
Computation and Language
Trains AI on private health data faster, safer.
CacheFL: Privacy-Preserving and Efficient Federated Cache Model Fine-Tuning for Vision-Language Models
Distributed, Parallel, and Cluster Computing
Trains AI to see better, safely, using less data.
FedMentalCare: Towards Privacy-Preserving Fine-Tuned LLMs to Analyze Mental Health Status Using Federated Learning Framework
Computation and Language
Keeps mental health chats private for AI.