Investigating Task Arithmetic for Zero-Shot Information Retrieval
By: Marco Braga, Pranav Kasela, Alessandro Raganato and more
Potential Business Impact:
Combines AI knowledge for better search results.
Large Language Models (LLMs) have shown impressive zero-shot performance across a variety of Natural Language Processing tasks, including document re-ranking. However, their effectiveness degrades on unseen tasks and domains, largely due to shifts in vocabulary and word distributions. In this paper, we investigate Task Arithmetic, a technique that combines the weights of LLMs pre-trained on different tasks or domains via simple mathematical operations, such as addition or subtraction, to adapt retrieval models without requiring additional fine-tuning. Our method synthesizes diverse task and domain knowledge into a single model, enabling effective zero-shot adaptation in different retrieval contexts. Extensive experiments on publicly available scientific, biomedical, and multilingual datasets show that our method improves state-of-the-art re-ranking performance by up to 18% in NDCG@10 and 15% in P@10. In addition to these empirical gains, our analysis provides insights into the strengths and limitations of Task Arithmetic as a practical strategy for zero-shot learning and model adaptation. We make our code publicly available at https://github.com/DetectiveMB/Task-Arithmetic-for-ZS-IR.
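The weight addition and subtraction described in the abstract can be illustrated with a small sketch. This is not the authors' code (see the linked repository for that); it is a minimal, generic example of task arithmetic on toy weights. The names `task_vector`, `apply_task_vectors`, and the scaling factor `alpha` are assumptions made for illustration, and the weights are plain Python lists rather than real model checkpoints.

```python
from typing import Dict, List

# Hypothetical stand-in for a checkpoint: parameter name -> flattened values.
Weights = Dict[str, List[float]]


def task_vector(finetuned: Weights, base: Weights) -> Weights:
    """Element-wise difference between a fine-tuned model and its base.

    The result captures what fine-tuning changed relative to the base.
    """
    return {
        name: [f - b for f, b in zip(finetuned[name], base[name])]
        for name in base
    }


def apply_task_vectors(
    base: Weights, vectors: List[Weights], alpha: float = 1.0
) -> Weights:
    """Add scaled task vectors to a base model without any further training.

    alpha is an assumed scaling factor; a negative value subtracts the vector.
    """
    merged = {name: list(values) for name, values in base.items()}
    for vec in vectors:
        for name, delta in vec.items():
            merged[name] = [w + alpha * d for w, d in zip(merged[name], delta)]
    return merged


# Toy weights (illustrative values only, not taken from the paper).
pretrained = {"layer.weight": [0.0, 1.0, 2.0]}
domain_model = {"layer.weight": [0.5, 1.0, 2.5]}   # adapted to a domain
ranking_model = {"layer.weight": [0.0, 2.0, 2.0]}  # trained for re-ranking

# What the domain adaptation changed, then injected into the re-ranker.
domain_vec = task_vector(domain_model, pretrained)
merged = apply_task_vectors(ranking_model, [domain_vec], alpha=1.0)
removed = apply_task_vectors(ranking_model, [domain_vec], alpha=-1.0)
```

Here `merged` carries both the re-ranking weights and the domain shift, while `removed` shows the subtraction case. With real models the same arithmetic would be applied to every tensor of two checkpoints that share an architecture, and `alpha` would typically be tuned on held-out data.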
Similar Papers
Task Addition and Weight Disentanglement in Closed-Vocabulary Models
Machine Learning (CS)
Lets computers learn new skills without retraining.
Leveraging Parameter Space Symmetries for Reasoning Skill Transfer in LLMs
Computation and Language
Helps AI models share smart thinking skills better.
Leveraging Submodule Linearity Enhances Task Arithmetic Performance in LLMs
Machine Learning (CS)
Merges computer brains to do many jobs.