Compute Allocation for Reasoning-Intensive Retrieval Agents

Sreeja Apparaju; Nilesh Gupta

arXiv:2603.14635·cs.IR·March 24, 2026

Open Access

TL;DR

This paper studies how to allocate compute across the query expansion and re-ranking stages of reasoning-intensive retrieval pipelines for long-horizon agents, and finds that extra compute pays off more in re-ranking than in query expansion.

Contribution

It provides an empirical analysis of compute allocation strategies in retrieval pipelines, highlighting the importance of re-ranking with stronger models and deeper candidate pools.

Findings

01

Re-ranking benefits significantly from stronger models (+7.5 NDCG@10).

02

Deeper candidate pools improve re-ranking performance (+21% when the re-ranking depth k grows from 10 to 100).

03

Query expansion shows diminishing returns beyond lightweight models (+1.1 NDCG@10).
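
The second finding concerns re-ranking depth: a re-ranker can only promote documents that are inside the candidate pool it is given. The toy sketch below illustrates that mechanism. It is not the paper's implementation; `rerank_top_k`, the document ids, and the `toy_score` function (a stand-in for an LLM re-ranker) are all hypothetical names chosen for illustration.

```python
def rerank_top_k(first_stage_ranking, rerank_score, k):
    """Re-order only the top-k first-stage candidates; the tail is left as-is."""
    head = first_stage_ranking[:k]
    tail = first_stage_ranking[k:]
    # sorted() is stable, so ties keep their first-stage order.
    reranked = sorted(head, key=rerank_score, reverse=True)
    return reranked + tail


# Hypothetical first-stage ranking: 100 documents, the relevant one at position 40.
ranking = [f"d{i}" for i in range(100)]
relevant = "d40"


def toy_score(doc_id):
    # Assumption: an idealized re-ranker that always recognizes the relevant document.
    return 1.0 if doc_id == relevant else 0.0


shallow = rerank_top_k(ranking, toy_score, k=10)
deep = rerank_top_k(ranking, toy_score, k=100)
print(shallow.index(relevant), deep.index(relevant))
```

With k = 10 the relevant document never reaches the re-ranker and stays where the first stage put it; with k = 100 it is promoted to the top. The cost side of the trade-off is that a deeper pool means more documents for the re-ranking model to score.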

Abstract

As agents operate over long horizons, their memory stores grow continuously, making retrieval critical to accessing relevant information. Many agent queries require reasoning-intensive retrieval, where the connection between query and relevant documents is implicit and requires inference to bridge. LLM-augmented pipelines address this through query expansion and candidate re-ranking, but introduce significant inference costs. We study computation allocation in reasoning-intensive retrieval pipelines using the BRIGHT benchmark and the Gemini 2.5 model family. We vary model capacity, inference-time thinking, and re-ranking depth across query expansion and re-ranking stages. We find that re-ranking benefits substantially from stronger models (+7.5 NDCG@10) and deeper candidate pools (+21% from $k = 10$ to $100$), while query expansion shows diminishing returns beyond lightweight models (+1.1…
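
The gains quoted above are measured in NDCG@10. For readers unfamiliar with the metric, here is a minimal sketch of one common formulation (linear gain with a log2 rank discount). The function names and the example relevance judgments are assumptions for illustration; this excerpt does not say which evaluation tooling or gain variant the authors used.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k graded relevance values."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(ranked_doc_ids, relevance_by_doc, k=10):
    """NDCG@k: DCG of the system ranking divided by DCG of the ideal ranking."""
    gains = [relevance_by_doc.get(doc_id, 0) for doc_id in ranked_doc_ids]
    ideal_gains = sorted(relevance_by_doc.values(), reverse=True)
    ideal_dcg = dcg_at_k(ideal_gains, k)
    if ideal_dcg == 0:
        return 0.0  # no relevant documents judged for this query
    return dcg_at_k(gains, k) / ideal_dcg


# Hypothetical judgments: two relevant documents for one query.
qrels = {"d1": 1, "d2": 1}
print(ndcg_at_k(["d1", "d2", "d3"], qrels))  # relevant documents ranked first
print(ndcg_at_k(["d3", "d1", "d2"], qrels))  # a non-relevant document ranked first
```

The first ranking scores a perfect 1.0; the second scores lower because both relevant documents are pushed down one position, which is the kind of ordering error a re-ranking stage is meant to correct.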

Taxonomy

Topics: Information Retrieval and Search Behavior · Multimodal Machine Learning Applications · Constraint Satisfaction and Optimization