Revisiting Feedback Models for HyDE
Nour Jedidi, Jimmy Lin

TL;DR
This paper evaluates traditional feedback models such as Rocchio and RM3 within the HyDE framework and shows that using them to extract and weight expansion terms substantially improves LLM-based pseudo-relevance feedback for sparse retrievers like BM25.
Contribution
The paper systematically evaluates classical feedback algorithms within HyDE and shows that they can improve LLM-driven query expansion over simple string concatenation.
Findings
Rocchio improves HyDE's retrieval accuracy
Traditional feedback models outperform simple concatenation
Enhanced feedback methods lead to better query expansion
Abstract
Recent approaches that leverage large language models (LLMs) for pseudo-relevance feedback (PRF) have generally not utilized well-established feedback models like Rocchio and RM3 when expanding queries for sparse retrievers like BM25. Instead, they often opt for a simple string concatenation of the query and LLM-generated expansion content. But is this optimal? To answer this question, we revisit and systematically evaluate traditional feedback models in the context of HyDE, a popular method that enriches query representations with LLM-generated hypothetical answer documents. Our experiments show that HyDE's effectiveness can be substantially improved when leveraging feedback algorithms such as Rocchio to extract and weight expansion terms, providing a simple way to further enhance the accuracy of LLM-based PRF methods.
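To make the contrast in the abstract concrete, the following is a minimal sketch of the two expansion strategies: plain string concatenation versus a Rocchio-style weighted term vector built from LLM-generated hypothetical documents. It is an illustration under stated assumptions, not the authors' implementation. The function names, the length-normalized term-frequency weighting, and the values of `alpha`, `beta`, and `top_k` are all hypothetical choices, and the sketch omits the negative-feedback term of the full Rocchio formula as well as stopword removal.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split into alphanumeric tokens (no stemming or stopwords)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def concat_expansion(query: str, hypothetical_docs: list[str]) -> str:
    """Baseline: append the generated documents to the query as one string."""
    return " ".join([query] + hypothetical_docs)


def rocchio_expansion(
    query: str,
    hypothetical_docs: list[str],
    alpha: float = 1.0,   # assumed weight on the original query terms
    beta: float = 0.75,   # assumed weight on the feedback centroid
    top_k: int = 10,      # assumed number of expansion terms to keep
) -> dict[str, float]:
    """Rocchio-style expansion: alpha * query + beta * centroid(feedback docs).

    Each feedback document is a length-normalized term-frequency vector;
    only the top_k highest-weighted centroid terms are kept.
    """
    centroid: Counter = Counter()
    for doc in hypothetical_docs:
        tf = Counter(tokenize(doc))
        length = sum(tf.values())
        if length == 0:
            continue  # skip empty generations
        for term, count in tf.items():
            centroid[term] += count / length / len(hypothetical_docs)

    # Sort by weight (descending), breaking ties alphabetically for determinism.
    top_terms = sorted(centroid.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]

    weights = {term: alpha * count for term, count in Counter(tokenize(query)).items()}
    for term, weight in top_terms:
        weights[term] = weights.get(term, 0.0) + beta * weight
    return weights


if __name__ == "__main__":
    query = "what causes tides"
    docs = [
        "Tides are caused by the gravitational pull of the moon",
        "The moon and sun pull on the ocean causing tides",
    ]
    print(concat_expansion(query, docs))
    for term, weight in sorted(rocchio_expansion(query, docs, top_k=5).items()):
        print(f"{term}\t{weight:.3f}")
```

The design difference is that concatenation lets every generated token contribute to the query in proportion to how often it appears, while the Rocchio-style variant keeps the original query terms dominant and admits only a bounded number of weighted expansion terms. How the resulting weights are passed to a BM25 engine (for example, as per-term boosts) depends on the engine and is left out here.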
Taxonomy
Topics: Information Retrieval and Search Behavior · Topic Modeling · Expert finding and Q&A systems
