QueStER: Query Specification for Generative keyword-based Retrieval
Arthur Satouf, Yuxuan Zong, Habiboulaye Amadou-Boubacar, Pablo Piantanida, Benjamin Piwowarski

TL;DR
QueStER introduces a method that uses a lightweight language model to generate explicit keyword queries from user inputs, enhancing retrieval performance while maintaining efficiency and scalability.
Contribution
QueStER bridges generative retrieval and query reformulation by training a lightweight LLM to produce explicit keyword queries, which a standard BM25 retriever then executes, improving retrieval accuracy across domains.
Findings
Consistently improves over BM25 in both in-domain and out-of-domain evaluations
Maintains efficiency comparable to lexical index retrieval
Competitive with neural IR baselines
Abstract
Generative retrieval (GR) differs from the traditional index-then-retrieve pipeline by storing relevance in model parameters and generating retrieval cues directly from the query, but it can be brittle out of domain and expensive to scale. We introduce QueStER (QUEry SpecificaTion for gEnerative Keyword-Based Retrieval), which bridges GR and query reformulation by learning to generate explicit keyword-based search specifications. Given a user query, a lightweight LLM produces a keyword query that is executed by a standard retriever (BM25), combining the generalization benefits of generative query rewriting with the efficiency and scalability of lexical indexing. We train the rewriting policy with reinforcement learning techniques. Across in- and out-of-domain evaluations, QueStER consistently improves over BM25 and is competitive with neural IR baselines, while maintaining strong…
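The rewrite-then-retrieve flow described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration rather than the authors' implementation: `rewrite_to_keywords` is a hypothetical rule-based stand-in for the lightweight LLM (the `STOPWORDS` and `EXPANSIONS` tables are invented), the `BM25` class is a small in-memory scorer using the common Lucene-style IDF with default `k1` and `b` values, and the three documents are toy data. The reinforcement learning step is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


class BM25:
    """Tiny in-memory BM25 scorer (illustrative, not the paper's retriever setup)."""

    def __init__(self, docs, k1=1.2, b=0.75):
        self.k1, self.b = k1, b
        self.doc_tf = [Counter(tokenize(d)) for d in docs]
        self.doc_len = [sum(tf.values()) for tf in self.doc_tf]
        self.n_docs = len(docs)
        self.avg_len = sum(self.doc_len) / self.n_docs
        # Document frequency: number of documents containing each term.
        self.df = Counter()
        for tf in self.doc_tf:
            self.df.update(tf.keys())

    def idf(self, term):
        n = self.df.get(term, 0)
        return math.log(1.0 + (self.n_docs - n + 0.5) / (n + 0.5))

    def score(self, terms, i):
        tf = self.doc_tf[i]
        norm = self.k1 * (1 - self.b + self.b * self.doc_len[i] / self.avg_len)
        return sum(
            self.idf(t) * tf[t] * (self.k1 + 1) / (tf[t] + norm)
            for t in terms
            if t in tf
        )

    def search(self, query, k=3):
        """Return indices of up to k documents with a positive score, best first."""
        terms = tokenize(query)
        scored = [(self.score(terms, i), i) for i in range(self.n_docs)]
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [i for s, i in scored[:k] if s > 0]


# Hypothetical stand-in for the learned rewriting policy: a real system would
# call a language model here. These tables are invented for the example.
STOPWORDS = {"what", "are", "the", "of", "a", "is", "in", "how", "do"}
EXPANSIONS = {"heart attack": ["myocardial", "infarction"], "signs": ["symptoms"]}


def rewrite_to_keywords(query):
    tokens = tokenize(query)
    text = " ".join(tokens)
    keywords = [t for t in tokens if t not in STOPWORDS]
    for phrase, extra in EXPANSIONS.items():
        if phrase in text:
            keywords.extend(extra)
    return " ".join(keywords)


# Toy corpus (assumed data).
docs = [
    "bm25 lexical ranking function for search engines",
    "myocardial infarction symptoms include chest pain and breathlessness",
    "reinforcement learning trains a policy from reward signals",
]
index = BM25(docs)

user_query = "what are the signs of a heart attack"
keyword_query = rewrite_to_keywords(user_query)
raw_hits = index.search(user_query)
rewritten_hits = index.search(keyword_query)
```

On this toy corpus the raw question only matches a document through the stopword "a", while the keyword query reaches the medically relevant document through the added terms. This is the vocabulary-mismatch problem that a keyword rewrite is meant to address, while the retrieval step itself remains a plain lexical lookup.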
Taxonomy
Topics: Information Retrieval and Search Behavior · Topic Modeling · Image Retrieval and Classification Techniques
