A Novel Hierarchical Integration Method for Efficient Model Merging in Medical LLMs
By: Prakrit Timilsina, Anuj Nepal, Rajan Kadel and more
Potential Business Impact:
Combines medical AI knowledge without losing privacy.
Large Language Models (LLMs) face significant challenges in distributed healthcare, including consolidating specialized domain knowledge across institutions while maintaining privacy, reducing computational overhead, and preventing catastrophic forgetting during model updates. This paper presents a systematic evaluation of six parameter-space merging techniques applied to two architecturally compatible medical LLMs derived from the Mistral-7B base model. We introduce a novel hierarchical method that combines selective Optimal Transport (OT) alignment for attention layers with cosine similarity-weighted interpolation, designed to address permutation variance while minimizing computational overhead for edge deployment scenarios. Our study evaluates Task Arithmetic, Linear Averaging, DARE-TIES, DELLA, Breadcrumbs, and our Hierarchical approach across five medical benchmarks. Results demonstrate that architecturally compatible models benefit significantly from simple averaging methods, with Task Arithmetic achieving 45.80% accuracy on MedQA, outperforming complex pruning-based approaches. These findings offer critical insights for the deployment of distributed medical AI in resource-constrained IoT environments, where computational efficiency and model compatibility are paramount. Our work establishes that for architecturally compatible models, simple averaging provides a robust and computationally efficient baseline for knowledge consolidation, offering a pragmatic path forward for scalable medical AI systems.
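To make the simpler merging operations named in the abstract concrete, the following is a minimal sketch on flat lists of floats standing in for one layer's weights. Linear averaging and task arithmetic follow their standard textbook definitions. The function `cosine_weighted_merge` and its weighting rule are hypothetical illustrations, not the paper's formula, which the abstract does not specify. The Optimal Transport alignment step is omitted entirely.

```python
import math


def linear_average(a, b, alpha=0.5):
    """Interpolate two parameter vectors: alpha * a + (1 - alpha) * b."""
    return [alpha * x + (1 - alpha) * y for x, y in zip(a, b)]


def task_arithmetic(base, models, scale=1.0):
    """Add the scaled sum of task vectors (model - base) onto the base weights."""
    merged = list(base)
    for model in models:
        for i, (w, w0) in enumerate(zip(model, base)):
            merged[i] += scale * (w - w0)
    return merged


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def cosine_weighted_merge(a, b):
    """ASSUMPTION: an illustrative weighting rule, not the paper's.

    Similar layers (cosine near 1) are averaged equally; dissimilar
    layers (cosine <= 0) keep model a's weights unchanged.
    """
    sim = max(cosine_similarity(a, b), 0.0)
    alpha = 1.0 - 0.5 * sim  # weight on model a, in [0.5, 1.0]
    return linear_average(a, b, alpha)


# Toy two-parameter "layer" from a shared base and two fine-tuned models.
base = [0.0, 1.0]
model_a = [1.0, 1.0]
model_b = [0.0, 3.0]

averaged = linear_average(model_a, model_b)
combined = task_arithmetic(base, [model_a, model_b])
weighted = cosine_weighted_merge(model_a, model_b)
```

In a real setting each list would be a tensor from the two models' checkpoints, and the merge would be applied layer by layer. Both models must share the same architecture and base model for these element-wise operations to be meaningful, which matches the compatibility condition the abstract emphasizes.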