TongSearch-QR: Reinforced Query Reasoning for Retrieval
Xubo Qin, Jun Bai, Jiaqi Li, Zixia Jia, Zilong Zheng

TL;DR
This paper introduces TongSearch QR, a family of small-scale language models that use reinforcement learning for reasoning-based query rewriting, achieving performance comparable to large models while being more practical for deployment.
Contribution
The work develops a semi-rule-based reward function and employs reinforcement learning to enable small models to perform reasoning-intensive query rewriting effectively.
Findings
TongSearch QR models outperform existing baselines on the BRIGHT benchmark.
Small models achieve reasoning performance comparable to large-scale models.
The approach offers a practical solution for reasoning-intensive retrieval tasks.
Abstract
Traditional information retrieval (IR) methods excel at textual and semantic matching but struggle in reasoning-intensive retrieval tasks that require multi-hop inference or complex semantic understanding between queries and documents. One promising solution is to explicitly rewrite or augment queries using large language models (LLMs) to elicit reasoning-relevant content prior to retrieval. However, the widespread use of large-scale language models like GPT-4 or LLaMA3-70B remains impractical due to their high inference cost and limited deployability in real-world systems. In this work, we introduce TongSearch QR (previously known as "TongSearch Reasoner"), a family of small-scale language models for query reasoning and rewriting in reasoning-intensive retrieval. With a novel semi-rule-based reward function, we employ reinforcement learning approaches enabling smaller language models,…
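The rewrite-then-retrieve idea described in the abstract can be illustrated with a toy sketch. This is not the TongSearch QR method: the `rewrite_query` function is a hypothetical stand-in for a language-model rewriter, the reasoning text is hard-coded, and the term-overlap scorer is a deliberately simple substitute for a real retriever. It only shows why adding reasoning-relevant content to a query can change which document ranks first.

```python
import re

def tokenize(text: str) -> list[str]:
    # Lowercase and keep alphanumeric runs only.
    return re.findall(r"[a-z0-9]+", text.lower())

def overlap_score(query: str, doc: str) -> int:
    # Number of distinct query terms that also appear in the document.
    return len(set(tokenize(query)) & set(tokenize(doc)))

def rewrite_query(query: str, reasoning: str) -> str:
    # Hypothetical stand-in for an LLM rewriter: append reasoning text to the query.
    return f"{query} {reasoning}"

def rank(query: str, docs: list[str]) -> list[str]:
    # Sort documents by descending term overlap with the query.
    return sorted(docs, key=lambda d: overlap_score(query, d), reverse=True)

docs = [
    "A gallery of yellow flowers and leaves in spring gardens.",
    "Nitrogen deficiency and overwatering cause chlorosis, the loss of chlorophyll in foliage.",
]
query = "Why are the leaves of my plant turning yellow?"
# Assumed output of a reasoning step; hard-coded here for illustration.
reasoning = "Likely causes include chlorosis from nitrogen deficiency or overwatering."

top_original = rank(query, docs)[0]
top_rewritten = rank(rewrite_query(query, reasoning), docs)[0]
print("original query  ->", top_original)
print("rewritten query ->", top_rewritten)
```

With the original query, the surface-level match (the flower gallery) ranks first; after the reasoning text is appended, the document that actually explains the cause shares more terms with the query and moves to the top.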
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
