Parameter-Efficient Multi-Task Learning via Progressive Task-Specific Adaptation
By: Neeraj Gangwar, Anshuka Rangi, Rishabh Deshmukh, and more
Potential Business Impact:
Teaches AI to do many jobs with less training.
Parameter-efficient fine-tuning methods have emerged as a promising solution for adapting pre-trained models to various downstream tasks. While these methods perform well in single-task learning, extending them to multi-task learning exacerbates common challenges, such as task interference and negative transfer, due to the limited number of trainable parameters. To address these issues, we introduce progressive task-specific multi-task adaptation, a novel parameter-efficient approach to multi-task learning. This approach introduces adapter modules into a pre-trained model such that the modules are shared across all tasks in the initial layers and become progressively more task-specific in the later layers. The motivation is to reduce conflicts among tasks by allowing transfer learning across all tasks in the initial layers while enabling task-specific learning toward the prediction heads. Additionally, we propose a gradient-based approach for computing task similarity and use this measure to allocate similar tasks to the shared adapter modules. Our task similarity method introduces minimal overhead in the pipeline. We evaluate our approach by adapting the Swin Transformer for dense prediction tasks. Experiments on the PASCAL and NYUD-v2 datasets demonstrate that our approach outperforms a fully fine-tuned multi-task model while requiring only one-fifth of the trainable parameters. It achieves a better relative improvement over single-task fine-tuning while reducing the number of trainable parameters, and it surpasses current state-of-the-art methods for parameter-efficient multi-task learning.
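The abstract does not spell out how the gradient-based task similarity is computed. A common way to realize such a measure, shown here as a minimal illustrative sketch rather than the paper's actual method, is to take the cosine similarity between each task's flattened loss gradients with respect to the shared parameters; the function and variable names below are assumptions for illustration.

```python
import numpy as np

def task_similarity(task_grads):
    """Illustrative sketch: pairwise cosine similarity between per-task
    gradient vectors (each flattened over the shared parameters).

    task_grads: dict mapping task name -> 1-D numpy gradient vector.
    Returns (task_names, similarity_matrix).
    """
    names = list(task_grads)
    n = len(names)
    sim = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            gi, gj = task_grads[names[i]], task_grads[names[j]]
            # Cosine similarity: high values suggest tasks whose gradients
            # point in similar directions, i.e. candidates for a shared adapter.
            sim[i, j] = gi @ gj / (np.linalg.norm(gi) * np.linalg.norm(gj))
    return names, sim

# Hypothetical example with three dense-prediction tasks:
grads = {
    "segmentation": np.array([1.0, 0.5, 0.0]),
    "depth":        np.array([0.9, 0.6, 0.1]),
    "normals":      np.array([-0.2, 0.1, 1.0]),
}
names, sim = task_similarity(grads)
```

In this toy example, segmentation and depth gradients are nearly aligned (similarity close to 1) while normals diverges, so a grouping step would allocate the first two tasks to the same shared adapter module. In practice the gradients would be accumulated over mini-batches during a short profiling pass, keeping the overhead small.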
Similar Papers
Efficient Multi-Task Modeling through Automated Fusion of Trained Models
Machine Learning (CS)
Combines smart computer programs to do many jobs.
Task-Aware Parameter-Efficient Fine-Tuning of Large Pre-Trained Models at the Edge
Machine Learning (CS)
Makes smart computer programs run on small devices.
Task-Adapter++: Task-specific Adaptation with Order-aware Alignment for Few-shot Action Recognition
CV and Pattern Recognition
Teaches computers to recognize actions from few examples.