AgentSim: A Platform for Verifiable Agent-Trace Simulation

Saber Zerhoudi; Michael Granitzer; Jelena Mitrovic

arXiv:2604.26653·cs.IR·April 30, 2026

TL;DR

AgentSim is an open-source platform that generates verifiable, step-by-step reasoning traces for RAG agents over a document collection, supporting grounded evaluation and analysis of LLM-based agents.

Contribution

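The paper introduces AgentSim, a platform whose exploration policy and multi-model validation pipeline with human-in-the-loop review are meant to improve trace diversity and quality, and it releases the Agent-Trace Corpus (ATC), a large grounded reasoning corpus for IR benchmarks.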

Findings

01  Over 103,000 verifiable reasoning steps in the Agent-Trace Corpus
02  100% grounding rate on substantive answers in the corpus
03  Systematic behavioral differences in models' information seeking approaches

Abstract

Training trustworthy agentic LLMs requires data that shows the grounded reasoning process, not just the final answer. Existing datasets fall short: question-answering data is outcome-only, chain-of-thought data is not tied to specific documents, and web-agent datasets track interface actions rather than the core retrieval and synthesis steps of a RAG workflow. We introduce AgentSim, an open-source platform for simulating RAG agents. It generates verifiable, stepwise traces of agent reasoning over any document collection. AgentSim uses a policy to ensure the agent widely explores the document set. It combines a multi-model validation pipeline with an active human-in-the-loop process. This approach focuses human effort on difficult steps where models disagree. Using AgentSim, we construct and release the Agent-Trace Corpus (ATC), a large collection of grounded reasoning trajectories…
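
The idea of focusing human effort on steps where models disagree can be illustrated with a small sketch. This is not AgentSim's implementation: the `TraceStep` fields, the two toy validators (stand-ins for model-based judges), and the `min_agreement` threshold are all hypothetical names chosen for illustration. The sketch only shows the routing logic, where steps with unanimous votes are handled automatically and mixed votes are queued for a human.

```python
import re
from dataclasses import dataclass


@dataclass
class TraceStep:
    # Hypothetical shape of one reasoning step; not AgentSim's schema.
    step_id: str
    claim: str
    evidence: str


def strict_validator(step):
    # Toy stand-in for a model judge: the claim must appear verbatim.
    return step.claim.lower() in step.evidence.lower()


def lenient_validator(step):
    # Toy stand-in for a second judge: every claim word must occur in the evidence.
    claim_words = set(re.findall(r"\w+", step.claim.lower()))
    evidence_words = set(re.findall(r"\w+", step.evidence.lower()))
    return claim_words <= evidence_words


def route_steps(steps, validators, min_agreement=1.0):
    """Split steps into accepted, rejected, and needs-human lists.

    A step is accepted when at least `min_agreement` of the validators
    approve it, rejected when at least `min_agreement` of them reject it,
    and sent to a human otherwise (the validators disagree).
    """
    accepted, rejected, needs_human = [], [], []
    for step in steps:
        votes = [bool(validate(step)) for validate in validators]
        support = sum(votes) / len(votes)
        if support >= min_agreement:
            accepted.append(step)
        elif support <= 1 - min_agreement:
            rejected.append(step)
        else:
            needs_human.append(step)
    return accepted, rejected, needs_human


steps = [
    TraceStep("s1", "paris is the capital", "Paris is the capital of France."),
    TraceStep("s2", "berlin is in spain", "Berlin is the capital of Germany."),
    TraceStep("s3", "the capital of germany is berlin", "Berlin is the capital of Germany."),
]

accepted, rejected, needs_human = route_steps(
    steps, [strict_validator, lenient_validator]
)
print([s.step_id for s in needs_human])
```

In this toy run both validators approve `s1` and both reject `s2`, while they split on `s3`, so only `s3` reaches the human queue. Lowering `min_agreement` would trade review effort for a higher risk of accepting contested steps.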
