RECOR: Reasoning-focused Multi-turn Conversational Retrieval Benchmark
Mohammed Ali, Abdelrahman Abdallah, Amit Agarwal, Hitesh Laxmichand Patel, Adam Jatowt

TL;DR
RECOR introduces a new benchmark for reasoning-focused multi-turn conversational retrieval, emphasizing the importance of reasoning and dialogue history in improving retrieval accuracy across diverse domains.
Contribution
The paper presents a novel benchmark with a decomposition-and-verification framework for complex queries, and demonstrates the effectiveness of reasoning-enhanced models in conversational retrieval.
Findings
Combining conversation history with reasoning roughly doubles retrieval performance (nDCG@10 rises from .236 to .479).
Reasoning-specialized models outperform dense encoders significantly.
Implicit reasoning remains challenging when logical connections are unstated.
Abstract
Existing benchmarks treat multi-turn conversation and reasoning-intensive retrieval separately, yet real-world information seeking requires both. To bridge this gap, we present a benchmark for reasoning-based conversational information retrieval comprising 707 conversations (2,971 turns) across eleven domains. To ensure quality, our Decomposition-and-Verification framework transforms complex queries into fact-grounded multi-turn dialogues through multi-level validation, where atomic facts are verified against sources and explicit retrieval reasoning is generated for each turn. Comprehensive evaluation reveals that combining conversation history with reasoning doubles retrieval performance (nDCG@10: Baseline .236, History+Reasoning .479), while reasoning-specialized models substantially outperform dense encoders. Despite these gains, further analysis highlights that implicit…
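For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how nDCG@10 is commonly computed for a single query. It is not taken from the paper: the function names, the linear gain formulation, and the toy relevance labels are assumptions made for illustration, and the authors' evaluation code may differ (for example, by using an exponential gain).

```python
import math
from typing import Optional, Sequence


def dcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """Discounted cumulative gain over the top-k results (linear gain, assumed)."""
    # rank is 0-based, so the discount for the first result is log2(2) = 1.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(
    ranked_relevances: Sequence[float],
    all_relevances: Optional[Sequence[float]] = None,
    k: int = 10,
) -> float:
    """nDCG@k: DCG of the system ranking divided by DCG of the ideal ranking.

    ranked_relevances: relevance labels in the order the retriever returned them.
    all_relevances: labels of every judged document for the query; if omitted,
    the ideal ranking is built from ranked_relevances alone.
    """
    pool = all_relevances if all_relevances is not None else ranked_relevances
    ideal_dcg = dcg_at_k(sorted(pool, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents judged for this query
    return dcg_at_k(ranked_relevances, k) / ideal_dcg


# Hypothetical toy query: one relevant document, retrieved at rank 2.
toy_score = ndcg_at_k([0, 1, 0, 0])

# Relative gain implied by the two scores reported in the abstract.
relative_gain = 0.479 / 0.236  # about 2.03, consistent with "doubles"
```

A benchmark-level score such as the reported .236 or .479 is typically the mean of this per-query value over all evaluated turns, although the exact aggregation used by the authors is not stated in this summary.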
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Information Retrieval and Search Behavior
