Overview of the TREC 2022 deep learning track
By: Nick Craswell, Bhaskar Mitra, Emine Yilmaz and more
Potential Business Impact:
Finds better answers in huge amounts of text.
This is the fourth year of the TREC Deep Learning track. As in previous years, we leverage the MS MARCO datasets, which made hundreds of thousands of human-annotated training labels available for both the passage and document ranking tasks. This year we also use the refreshed passage and document collections released last year, which increased the size of the passage collection nearly 16 times and the size of the document collection nearly four times. Unlike previous years, in 2022 we mainly focused on constructing a more complete test collection for the passage retrieval task, which has been the primary focus of the track. The document ranking task was kept as a secondary task, with document-level labels inferred from the passage-level labels. Our analysis shows that, as in previous years, deep neural ranking models that employ large-scale pretraining continued to outperform traditional retrieval methods. Because we focused our judging resources on passage judging, we are more confident in the quality of this year's queries and judgments, with respect to our ability to distinguish between runs and to reuse the dataset in the future. We also see some surprises in the overall outcomes: some top-performing runs did not do dense retrieval, and runs that did single-stage dense retrieval were not as competitive this year as they were last year.
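The abstract does not say how document-level labels were inferred from passage-level labels. As a minimal sketch of one plausible scheme, the Python below assumes that each passage maps to a single source document and that a document inherits the highest label of any of its judged passages for a query. The function name `infer_document_labels`, the input layouts, and the max-aggregation rule are all illustrative assumptions, not the track's documented procedure.

```python
from collections import defaultdict


def infer_document_labels(passage_labels, passage_to_doc):
    """Aggregate passage-level relevance labels into document-level labels.

    passage_labels: dict mapping (query_id, passage_id) -> integer label.
    passage_to_doc: dict mapping passage_id -> doc_id (hypothetical mapping).

    Assumption: a document's label for a query is the maximum label among
    its judged passages. Passages with no known source document are skipped.
    """
    doc_labels = defaultdict(dict)
    for (query_id, passage_id), label in passage_labels.items():
        doc_id = passage_to_doc.get(passage_id)
        if doc_id is None:
            continue  # no source document known for this passage
        previous = doc_labels[query_id].get(doc_id, 0)
        doc_labels[query_id][doc_id] = max(previous, label)
    return {query_id: dict(docs) for query_id, docs in doc_labels.items()}
```

Taking the maximum reflects the idea that a document is at least as useful as its best passage; other aggregation rules are possible and may be what the track actually used.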
Similar Papers
Overview of the TREC 2021 deep learning track
Information Retrieval
Helps computers find information better and faster.
Overview of the TREC 2023 deep learning track
Information Retrieval
Lets computers find answers better using smart text.
Overview of the TREC 2024 NeuCLIR Track
Information Retrieval
Helps computers find information in different languages.