Autoregressive Ranking: Bridging the Gap Between Dual and Cross Encoders
By: Benjamin Rozonoyer, Chong You, Michael Boratko, and more
Potential Business Impact:
Makes computers find information better and faster.
Dual and cross encoders have long been mainstays of information retrieval (IR), but they are being challenged by the emergent capabilities of LLMs. An LLM-based approach we term pointwise generative ranking - generating the tokens of a single docID rather than an explicit list, so that a ranking can be recovered via beam search - combines efficiency and expressivity benefits while leveraging the in-context capabilities of causal Transformers. Although there is ample evidence that pretrained LLMs are well-suited for ranking, we find that the vast majority of LLM-based approaches rely on next-token prediction, a loss function that is fundamentally rank-agnostic (especially so under pointwise supervision). In this paper, we first prove that the expressivity of pointwise generative ranking with multi-token docIDs is superior to that of dual encoders. We then propose SToICaL - a Simple Token-Item Calibrated Loss - which can incorporate rank-aware supervision at both the item and token levels within the pointwise setup. We run a suite of experiments on ranking tasks derived from WordNet (Fellbaum, 1998) and ESCI (Reddy et al., arXiv:2206.06588). Two variants of SToICaL successfully suppress the probability of invalid docID generations and improve on common ranking metrics beyond top-1 retrieval.
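To make the pointwise generative ranking setup concrete, here is a minimal sketch (our illustration, not code from the paper): each candidate's multi-token docID is scored by its log-likelihood under a causal LM conditioned on the query, and candidates are ranked by that score. The names `lm`, `score_docids`, and `item_level_listwise_loss` are placeholders; `lm` is assumed to be a HuggingFace-style causal LM whose forward pass returns `.logits`, and the loss shown is a generic ListNet-style item-level objective, not necessarily the SToICaL formulation.

```python
import torch
import torch.nn.functional as F


def score_docids(lm, query_ids, docid_token_ids):
    """Score each multi-token docID by its log-likelihood given the query.

    lm              -- causal LM whose forward pass returns logits of shape (B, T, V)
    query_ids       -- LongTensor (T_q,) of query token ids
    docid_token_ids -- list of 1-D LongTensors, one per candidate docID
    """
    scores = []
    for docid in docid_token_ids:
        # Condition on the query, then read off the log-probability the LM
        # assigns to each docID token given everything before it.
        input_ids = torch.cat([query_ids, docid]).unsqueeze(0)   # (1, T)
        logits = lm(input_ids).logits                            # (1, T, V)
        log_probs = F.log_softmax(logits, dim=-1)
        # The token at position t is predicted from position t - 1.
        start = query_ids.size(0)
        docid_log_probs = log_probs[0, start - 1 : start - 1 + docid.size(0)]
        scores.append(docid_log_probs.gather(-1, docid.unsqueeze(-1)).sum())
    return torch.stack(scores)  # higher score = ranked earlier


def item_level_listwise_loss(scores, relevance):
    """Illustrative item-level rank-aware objective (ListNet-style softmax
    cross-entropy over candidates); shown only to contrast with plain
    next-token prediction, not as the SToICaL loss itself.

    scores    -- (N,) sequence log-likelihoods from score_docids
    relevance -- (N,) graded relevance labels for the same candidates
    """
    target = F.softmax(relevance.float(), dim=-1)
    return -(target * F.log_softmax(scores, dim=-1)).sum()
```

In this sketch, exhaustively scoring every candidate stands in for what a constrained beam search over a docID prefix trie would do more efficiently: both recover a ranking from the model's sequence probabilities rather than from a generated list.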
Similar Papers
MixLM: High-Throughput and Effective LLM Ranking via Text-Embedding Mix-Interaction
Information Retrieval
Makes search engines find things faster and better.
LLM as Explainable Re-Ranker for Recommendation System
Information Retrieval
Helps online stores show you better, clearer choices.
Selective LLM-Guided Regularization for Enhancing Recommendation Models
Information Retrieval
Guides computers to recommend better, especially for new things.