Revisiting Feedback Models for HyDE
By: Nour Jedidi, Jimmy Lin
Potential Business Impact:
Makes search engines find better answers using smart words.
Recent approaches that leverage large language models (LLMs) for pseudo-relevance feedback (PRF) have generally not utilized well-established feedback models like Rocchio and RM3 when expanding queries for sparse retrievers like BM25. Instead, they often opt for a simple string concatenation of the query and LLM-generated expansion content. But is this optimal? To answer this question, we revisit and systematically evaluate traditional feedback models in the context of HyDE, a popular method that enriches query representations with LLM-generated hypothetical answer documents. Our experiments show that HyDE's effectiveness can be substantially improved when leveraging feedback algorithms such as Rocchio to extract and weight expansion terms, providing a simple way to further enhance the accuracy of LLM-based PRF methods.
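The contrast the abstract draws, between simple string concatenation and a Rocchio-style weighted combination, can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the function name, parameters (`alpha`, `beta`, `top_k`), and raw term-frequency weighting are assumptions chosen for clarity, whereas a real system would feed the weighted terms into a sparse retriever like BM25.

```python
from collections import Counter

def rocchio_expand(query, hypothetical_docs, alpha=1.0, beta=0.75, top_k=10):
    """Blend the original query term vector with the centroid of
    LLM-generated hypothetical answer documents (relevant-only Rocchio).

    Returns the top_k (term, weight) pairs of the expanded query,
    instead of naively concatenating query and generated text.
    """
    # Bag-of-words vector for the original query.
    q_vec = Counter(query.lower().split())

    # Centroid of the pseudo-relevant (hypothetical) document vectors.
    centroid = Counter()
    for doc in hypothetical_docs:
        for term, count in Counter(doc.lower().split()).items():
            centroid[term] += count / len(hypothetical_docs)

    # Weighted combination: alpha * query + beta * centroid.
    combined = Counter()
    for term, w in q_vec.items():
        combined[term] += alpha * w
    for term, w in centroid.items():
        combined[term] += beta * w

    # Keep only the top-k weighted expansion terms.
    return combined.most_common(top_k)

# Hypothetical usage with made-up LLM outputs:
expanded = rocchio_expand(
    "what causes rain",
    ["rain forms when water vapor condenses",
     "water vapor condenses into rain"],
)
```

Terms appearing both in the query and in the generated documents (here, "rain") accumulate weight from both components, which is precisely the weighting signal that plain concatenation discards.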
Similar Papers
A Little More Like This: Text-to-Image Retrieval with Vision-Language Models Using Relevance Feedback
CV and Pattern Recognition
Improves image search by learning from results.
Pseudo Relevance Feedback is Enough to Close the Gap Between Small and Large Dense Retrieval Models
Information Retrieval
Makes small AI search better than big AI.
Never Come Up Empty: Adaptive HyDE Retrieval for Improving LLM Developer Support
Software Engineering
Makes computer helpers give better, true answers.