T-Retrievability: A Topic-Focused Approach to Measure Fair Document Exposure in Information Retrieval
By: Xuejun Chang, Zaiqiao Meng, Debasis Ganguly
Potential Business Impact:
Checks whether search systems unfairly hide some documents.
The retrievability of a document is a collection-based statistic that measures the expected (reciprocal) rank at which it is retrieved within a specific rank cut-off. Retrievability scores that are uniformly distributed across the documents of a collection indicate fair document exposure. While retrievability scores have been used to quantify the fairness of exposure for a collection, in our work we use the distribution of retrievability scores to measure the exposure bias of retrieval models. We hypothesise that an uneven distribution of retrievability scores across the entire collection may not accurately reflect exposure bias, and may instead indicate variations in topical relevance. As a solution, we propose a topic-focused, localised retrievability measure called T-Retrievability (topic-retrievability). It first computes retrievability scores over multiple groups of topically related documents and then aggregates these localised values to obtain collection-level statistics. Our analysis using the proposed T-Retrievability measure uncovers new insights into the exposure characteristics of various neural ranking models. The findings suggest that this localised measure provides a more nuanced understanding of exposure fairness and a more reliable approach for assessing document accessibility in IR systems.
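The contrast between a collection-wide and a topic-localised view of retrievability can be illustrated with a minimal sketch. The abstract does not specify the inequality statistic, how topical groups are formed, or how localised values are aggregated, so everything below is an assumption made for illustration: the Gini coefficient as the inequality statistic, topical groups given as precomputed sets of document ids, and an unweighted mean as the aggregation. The function names (`retrievability`, `gini`, `localised_gini`) are hypothetical and are not taken from the paper.

```python
def retrievability(rankings, docs, cutoff=10, reciprocal=False):
    """Accumulate a retrievability score per document.

    rankings: dict mapping a query id to a ranked list of document ids.
    A document earns 1 (or 1/rank if reciprocal=True) for every query
    that retrieves it within the rank cut-off.
    """
    scores = {d: 0.0 for d in docs}
    for ranked in rankings.values():
        for rank, doc in enumerate(ranked[:cutoff], start=1):
            if doc in scores:
                scores[doc] += 1.0 / rank if reciprocal else 1.0
    return scores


def gini(values):
    """Gini coefficient: 0 for a perfectly even distribution of scores."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def localised_gini(scores, topic_groups):
    """Assumed aggregation: unweighted mean of per-group Gini values.

    topic_groups: dict mapping a topic label to a set of document ids.
    """
    per_group = [
        gini([scores[d] for d in group if d in scores])
        for group in topic_groups.values()
    ]
    return sum(per_group) / len(per_group) if per_group else 0.0


# Toy usage: four documents, three queries, two assumed topical groups.
rankings = {"q1": ["a", "b", "c"], "q2": ["a", "b", "d"], "q3": ["a", "c", "b"]}
scores = retrievability(rankings, docs={"a", "b", "c", "d"}, cutoff=2)
global_inequality = gini(scores.values())
local_inequality = localised_gini(scores, {"t1": {"a", "b"}, "t2": {"c", "d"}})
```

In this toy setting the localised value differs from the collection-wide one because inequality is measured only among documents that share a topic, which is the intuition the abstract describes. The grouping and aggregation actually used in the paper may differ from this sketch.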
Similar Papers
Retentive Relevance: Capturing Long-Term User Value in Recommendation Systems
Information Retrieval
Keeps you on apps longer by showing better stuff.
Re3: Learning to Balance Relevance & Recency for Temporal Information Retrieval
Information Retrieval
Finds the newest, best answers to your questions.
A Semantically-Aware Relevance Measure for Content-Based Medical Image Retrieval Evaluation
CV and Pattern Recognition
Helps doctors find similar medical images faster.