Mitigating Parameter Interference in Model Merging via Sharpness-Aware Fine-Tuning
By: Yeoreum Lee, Jinwook Jung, Sungyong Baik
Potential Business Impact:
Combines AI models without losing their skills.
Large-scale deep learning models under the pretraining-finetuning paradigm have led to a surge of task-specific models fine-tuned from a common pre-trained model. Recently, several research efforts have been made on merging these large models into a single multi-task model, particularly with simple arithmetic on parameters. Such merging methodology faces a central challenge: interference between model parameters fine-tuned on different tasks. A few recent works have focused on designing a new fine-tuning scheme that can lead to small parameter interference, albeit at the cost of the performance of each task-specific fine-tuned model, thereby limiting that of the merged model. To improve the performance of a merged model, we note that a fine-tuning scheme should aim for (1) smaller parameter interference and (2) better performance of each fine-tuned model on its corresponding task. In this work, we aim to design a new fine-tuning objective function that works towards these two goals. In the course of this process, we find that such an objective function is strikingly similar to the sharpness-aware minimization (SAM) objective, which aims to achieve generalization by finding flat minima. Drawing upon this observation, we propose to fine-tune pre-trained models via sharpness-aware minimization. The experimental and theoretical results showcase the effectiveness and orthogonality of our proposed approach, improving performance upon various merging and fine-tuning methods. Our code is available at https://github.com/baiklab/SAFT-Merge.
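The two ingredients the abstract refers to can be illustrated concretely. Below is a minimal sketch, not the paper's implementation: a single SAM update (ascend to a worst-case perturbation within an L2 ball of radius `rho`, then descend using the gradient taken there) and task-arithmetic merging (add scaled task vectors, i.e., differences between fine-tuned and pre-trained parameters, back to the pre-trained parameters). Parameters are flat NumPy arrays for simplicity; the names `sam_step`, `task_arithmetic_merge`, `rho`, and `lam` are this sketch's own, not from the paper.

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One sharpness-aware minimization (SAM) step on parameters w.

    Step 1: move to the (first-order) worst-case point within an L2 ball
    of radius rho around w. Step 2: descend from w using the gradient
    evaluated at that perturbed point, which biases training toward
    flat minima.
    """
    g = grad_fn(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # ascent direction, norm rho
    g_sharp = grad_fn(w + eps)                   # gradient at perturbed point
    return w - lr * g_sharp

def task_arithmetic_merge(theta_pre, finetuned, lam=0.3):
    """Merge fine-tuned models by simple arithmetic on parameters.

    Each task vector is (fine-tuned params - pre-trained params); the
    merged model adds their scaled sum back to the pre-trained params.
    """
    task_vectors = [theta - theta_pre for theta in finetuned]
    return theta_pre + lam * sum(task_vectors)

if __name__ == "__main__":
    # Toy demo: L(w) = 0.5 * ||w - target||^2, so grad is (w - target).
    target = np.array([1.0, -2.0])
    grad_fn = lambda w: w - target
    w = np.zeros(2)
    for _ in range(100):
        w = sam_step(w, grad_fn)
    print(w)  # ends up close to target
```

On a toy quadratic the flat-minima bias of SAM is invisible (every minimum is equally sharp); the demo only shows that the two-step update still converges. The paper's contribution is the observation that fine-tuning each task-specific model with a SAM-style objective both preserves per-task performance and reduces interference between the resulting task vectors when they are merged.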
Similar Papers
To See a World in a Spark of Neuron: Disentangling Multi-task Interference for Training-free Model Merging
Machine Learning (CS)
Combines AI skills without forgetting old ones.
Asynchronous Sharpness-Aware Minimization For Fast and Accurate Deep Learning
Machine Learning (CS)
Makes smart computer programs learn faster and better.
SAMO: A Lightweight Sharpness-Aware Approach for Multi-Task Optimization with Joint Global-Local Perturbation
Machine Learning (CS)
Makes AI learn many things better, faster.