Learning to Retrieve from Agent Trajectories

Yuqi Zhou, Sunhao Dai, Changle Qu, Liang Pang, Jun Xu, and Ji-Rong Wen

arXiv:2604.04949 · cs.IR · April 8, 2026

TL;DR

This paper introduces LRAT, a new training paradigm for retrieval models that leverages agent interaction data, improving retrieval effectiveness in agent-based search systems.

Contribution

The paper proposes training retrieval models directly from agent trajectories, addressing the mismatch between human-centric training data and the way agents issue queries and consume results.

Findings

01 LRAT improves evidence recall in various benchmarks.

02 Retrievers trained with LRAT enhance end-to-end task success.

03 The approach is effective across diverse agent architectures and scales.

Abstract

Information retrieval (IR) systems have traditionally been designed and trained for human users, with learning-to-rank methods relying heavily on large-scale human interaction logs such as clicks and dwell time. With the rapid emergence of large language model (LLM) powered search agents, however, retrieval is increasingly consumed by agents rather than human beings, and is embedded as a core component within multi-turn reasoning and action loops. In this setting, retrieval models trained under human-centric assumptions exhibit a fundamental mismatch with the way agents issue queries and consume results. In this work, we argue that retrieval models for agentic search should be trained directly from agent interaction data. We introduce learning to retrieve from agent trajectories as a new training paradigm, where supervision is derived from multi-step agent interactions. Through a…

Code & Models

Repositories

yuqi-zhou/LRAT
github

Models

Datasets

Yuqi-Zhou/LRAT-Train
dataset· 244 dl
