Your Dense Retriever is Secretly an Expeditious Reasoner

Yichi Zhang; Jun Bai; Zhixin Cai; Shuhan Qin; Zhuofan Chen; Jinghua Guan; Wenge Rong

arXiv:2510.21727·cs.IR·October 29, 2025

4 Reviews

TL;DR

This paper introduces AdaQR, a hybrid framework that dynamically balances dense vector reasoning and LLM-based reasoning to improve retrieval efficiency and accuracy, reducing costs significantly while maintaining performance.

Contribution

The paper presents AdaQR, a novel adaptive framework that directs queries to either dense reasoning or LLM reasoning, enabling efficient and accurate retrieval.

Findings

01. Reduces reasoning cost by 28% on large-scale benchmarks.

02. Maintains or improves retrieval performance by 7%.

03. Enables controllable trade-off between efficiency and accuracy.

Abstract

Dense retrievers enhance retrieval by encoding queries and documents into continuous vectors, but they often struggle with reasoning-intensive queries. Although Large Language Models (LLMs) can reformulate queries to capture complex reasoning, applying them universally incurs significant computational cost. In this work, we propose Adaptive Query Reasoning (AdaQR), a hybrid query rewriting framework. Within this framework, a Reasoner Router dynamically directs each query to either fast dense reasoning or deep LLM reasoning. The dense reasoning is achieved by the Dense Reasoner, which performs LLM-style reasoning directly in the embedding space, enabling a controllable trade-off between efficiency and accuracy. Experiments on the large-scale retrieval benchmark BRIGHT show that AdaQR reduces reasoning cost by 28% while preserving, or even improving, retrieval performance by 7%.
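The routing idea in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the paper's implementation: the function name `route_query`, the cosine-similarity rule against a single `anchor` vector, and the `threshold` knob are hypothetical, and the embedder, Dense Reasoner, and LLM rewriter are passed in as plain callables.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def route_query(
    query: str,
    embed: Callable[[str], Vector],            # hypothetical text embedder
    dense_reasoner: Callable[[Vector], Vector],  # hypothetical embedding-space reasoner
    llm_rewrite: Callable[[str], str],         # hypothetical LLM query rewriter
    anchor: Sequence[float],                   # hypothetical routing anchor vector
    threshold: float,                          # hypothetical efficiency/accuracy knob
) -> Tuple[str, Vector]:
    """Return the chosen path and the embedding to use for retrieval."""
    q = embed(query)
    # Assumed rule: queries close to the anchor take the cheap dense path.
    if cosine(q, anchor) >= threshold:
        return "dense", dense_reasoner(q)
    # Otherwise pay for the LLM: rewrite the text, then embed the rewrite.
    return "llm", embed(llm_rewrite(query))
```

Under this assumed rule, lowering `threshold` sends more queries down the dense path (cheaper), while raising it sends more to the LLM (costlier), which is one way a controllable trade-off could be exposed.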

Peer Reviews

Decision·ICLR 2026 Conference Withdrawn Submission

Reviewer 01 · Rating 2 · Confidence 5

Strengths

1. Provides a trainable solution for reasoning routing, showcasing cost reduction on a variety of tasks.
2. Showcases that reasoned queries are not random transforms of the input query.

Weaknesses

1. Misleading title. It is not a single model but a multi-stage pipeline that requires: (1) pre-training a Dense Reasoner on an external corpus, (2) fine-tuning the Dense Reasoner on an in-domain dataset, (3) building an "oracle anchor" for the router from the same in-domain data, and (4) deploying and maintaining both the lightweight DR and the full, expensive LLM Reasoner in production. How is the dense retriever a secret reasoner if it has been explicitly trained to learn the mapping?
2. The

Reviewer 02 · Rating 6 · Confidence 4

Strengths

1. Training a dense reasoner to induce reasoned embeddings is a very cool idea. Pairing it with a router leverages its efficiency benefit while even improving performance.
2. It’s quite surprising to see that the dense reasoner can sometimes offer advantages over the LLM reasoner, which suggests that some queries should not be rewritten at all, something the dense reasoner can partly capture.
3. The evaluations and ablation studies are comprehensive with promising gains.
4. T

Weaknesses

1. The Dense Reasoner requires access to 70% of BRIGHT’s ground-truth queries and documents as training data. The Reasoner Router requires knowing the in-domain query embedding as the Oracle Anchor beforehand. Both prevent the framework from generalizing to unseen domains.
2. Some details are missing. For example, (a) it is unclear how the cost is calculated, (b) it is unclear how to ensure the pretraining corpus has no overlap with queries from BRIGHT. Providing more details about that would make t

Reviewer 03 · Rating 6 · Confidence 4

Strengths

1. The design of the Dense Reasoner is interesting. By directly learning to approximate the embeddings of LLM-rewritten queries, it provides a lightweight rewriting mechanism.
2. The Router appears capable of balancing the trade-off between the efficient but less accurate Dense Reasoner and the high-performance but costly LLM-based method.

Weaknesses

According to Figure 6, AdaQR is shown to outperform both the LLM-only and Dense Reasoner-only baselines. This is surprising, as the performance of a router-based method like AdaQR is typically bounded by its components. This result suggests that the LLM and Dense Reasoner are complementary, with the router sending queries the LLM fails on to a successful Dense Reasoner. I believe this somewhat contradicts the core design, where the Dense Reasoner is trained to mimic the LLM. The paper needs a dee

Reviewer 04 · Rating 2 · Confidence 4

Strengths

1. The idea is novel. Directly navigating the query embedding space to obtain a reasoned query embedding is interesting and new. The paper provides solid empirical results demonstrating advantages in both efficiency and performance.
2. The motivation is strong. The pilot study analyzes the mean resultant length between the original query embedding and the transformed query embedding across seven different reasoners and five different embedding models, and shows strong alignment. This offers empi
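For readers unfamiliar with the statistic mentioned in the pilot study, the mean resultant length of a set of direction vectors is the norm of the average of their unit vectors: it is 1 when all directions coincide and near 0 when they cancel out. The sketch below is a generic illustration of that definition. It assumes, without confirmation from the review, that the directions are the displacements from each original query embedding to its reasoned counterpart; the helper names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]


def unit(v: Sequence[float]) -> Vector:
    """Scale a vector to unit length; a zero vector has no direction."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("zero vector has no direction")
    return [x / norm for x in v]


def mean_resultant_length(directions: Sequence[Sequence[float]]) -> float:
    """Norm of the mean unit vector: 1.0 means perfectly aligned directions."""
    if not directions:
        raise ValueError("need at least one direction")
    units = [unit(d) for d in directions]
    dim = len(units[0])
    mean = [sum(u[i] for u in units) / len(units) for i in range(dim)]
    return math.sqrt(sum(m * m for m in mean))


def displacements(
    originals: Sequence[Sequence[float]],
    reasoned: Sequence[Sequence[float]],
) -> List[Vector]:
    """Assumed setup: direction = reasoned embedding minus original embedding."""
    return [[r - o for o, r in zip(ov, rv)] for ov, rv in zip(originals, reasoned)]
```

A value close to 1 over many queries would indicate that reasoning shifts embeddings in a consistent direction, which is the kind of alignment the reviewer describes.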

Weaknesses

1. Compared to an out-of-the-box retrieval framework, the method introduces new modules that require training for each LLM reasoner/embedding model combination.
2. The method adds computational overhead. Each retrieval now includes: embedding the query for the Reasoner Router to decide between LLM-enabled query rewriting and the Dense Reasoner; if routed to the dense path, (1) embed the query and (2) apply the Dense Reasoner to the query; if routed to the LLM path, (1) generate a reasoning chain
