One Refiner to Unlock Them All: Inference-Time Reasoning Elicitation via Reinforcement Query Refinement

Yixiao Zhou; Dongzhou Cheng; Zhiliang Wu; Yi Yang; Yu Cheng; Hehe Fan

arXiv:2604.25444·cs.CL·April 29, 2026

TL;DR

ReQueR is a reinforcement learning-based framework that refines user queries into explicit logical forms at inference time, improving reasoning capabilities of large language models across various tasks and architectures.

Contribution

ReQueR introduces an inference-time reasoning elicitation method that trains a query Refiner with reinforcement learning, improving reasoning without fine-tuning the target models.

Findings

01

ReQueR achieves 1.7%–7.2% accuracy gains across benchmarks.

02

It outperforms strong baselines by an average of 2.1%.

03

A single trained Refiner generalizes to unseen models.

Abstract

Large Language Models (LLMs) often fail to utilize their latent reasoning capabilities due to a distributional mismatch between ambiguous human inquiries and the structured logic required for machine activation. Existing alignment methods either incur prohibitive $O(N)$ costs by fine-tuning each model individually or rely on static prompts that fail to resolve query-level structural complexity. In this paper, we propose ReQueR (Reinforcement Query Refinement), a modular framework that treats reasoning elicitation as an inference-time alignment task. We train a specialized Refiner policy via Reinforcement Learning to rewrite raw queries into explicit logical decompositions, treating frozen LLMs as the environment. Rooted in the classical Zone of Proximal Development from educational psychology, we introduce the Adaptive Solver Hierarchy, a curriculum mechanism…
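The setup the abstract describes (a Refiner rewrites the query, a frozen LLM answers it, and the outcome serves as the learning signal) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `answer_with_refiner`, `refinement_reward`, `mean_reward`, and the toy refiner and solver are all hypothetical, and the binary exact-match reward is an assumption, since the abstract does not state the reward used.

```python
from typing import Callable, List, Tuple

# Hypothetical type aliases (not from the paper): a Refiner maps a raw query
# to a rewritten query; a Solver stands in for a frozen LLM mapping a query
# to an answer string.
Refiner = Callable[[str], str]
Solver = Callable[[str], str]


def answer_with_refiner(query: str, refiner: Refiner, solver: Solver) -> str:
    """Inference-time path: rewrite the query, then ask the frozen solver."""
    return solver(refiner(query))


def refinement_reward(query: str, gold: str, refiner: Refiner, solver: Solver) -> float:
    """Assumed reward: 1.0 if the solver's answer to the refined query
    exactly matches the gold answer, else 0.0. The solver is never updated."""
    prediction = answer_with_refiner(query, refiner, solver)
    return 1.0 if prediction.strip() == gold.strip() else 0.0


def mean_reward(batch: List[Tuple[str, str]], refiner: Refiner, solver: Solver) -> float:
    """Average reward over (query, gold answer) pairs, the kind of scalar
    signal an RL algorithm could use to update only the Refiner."""
    return sum(refinement_reward(q, g, refiner, solver) for q, g in batch) / len(batch)


# Toy stand-ins so the sketch runs without any model.
def toy_refiner(query: str) -> str:
    # Rewrites the query into an explicit step-by-step form.
    return f"Step 1: identify the quantities. Step 2: compute. Question: {query}"


def identity_refiner(query: str) -> str:
    # Baseline that passes the raw query through unchanged.
    return query


def toy_solver(query: str) -> str:
    # Pretend frozen model that only succeeds on decomposed queries.
    return "4" if query.startswith("Step 1:") else "unsure"


batch = [("What is 2 + 2?", "4"), ("What is 3 + 1?", "4")]
refined_score = mean_reward(batch, toy_refiner, toy_solver)
raw_score = mean_reward(batch, identity_refiner, toy_solver)
```

Because the solver is only called and never trained, the same Refiner can be placed in front of any solver with this interface, which is the modularity the abstract contrasts with per-model fine-tuning.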

Code & Models

Repositories

newera-xiao/ReQueR (GitHub)
