Simulating Environments with Reasoning Models for Agent Training

Yuetai Li; Huseyin A Inan; Xiang Yue; Wei-Ning Chen; Lukas Wutschitz; Janardhan Kulkarni; Radha Poovendran; Robert Sim; Saravan Rajmohan

arXiv:2511.01824·cs.AI·November 4, 2025

TL;DR

This paper introduces LLM-based simulation frameworks, Simia-SFT and Simia-RL, that enable scalable agent training without real environment data or APIs, improving robustness and reducing engineering effort.

Contribution

The paper presents novel frameworks for environment simulation using LLMs, allowing training without actual testbeds or environment implementations.

Findings

01

Fine-tuned open models improve consistently across multiple benchmarks and surpass GPT-4o on $\tau^2$-Bench.

02

Simia-RL achieves near state-of-the-art results on $\tau^2$-Bench.

03

Scalable agent training without environment engineering.

Abstract

LLM agents excel in compact environments requiring deep reasoning but remain brittle when operating in broader, more complex contexts that demand robustness across diverse tools and schemas. Building bespoke environments for training is heavy and brittle, and it limits progress. In this paper, we demonstrate that LLMs can simulate realistic environment feedback without access to actual testbed data or APIs. Inspired by this capability, we propose two frameworks: Simia-SFT, a pipeline that synthesizes SFT data by amplifying small seed sets into diverse trajectories in an environment-agnostic manner, and Simia-RL, a framework that enables RL training without real environment implementations through LLM-simulated feedback. Fine-tuning open models yields consistent improvements across multiple benchmarks, surpassing GPT-4o and approaching o4-mini on $\tau^{2}$-Bench. Together, Simia-SFT and…
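The idea of replacing a real environment with LLM-simulated feedback can be illustrated with a minimal sketch. This is not the authors' implementation: the names `SimulatedEnv`, `stub_simulator`, and `rollout` are hypothetical, and the stub function stands in for a call to a reasoning model, which is assumed here to receive the tool specification and the conversation history and return a plausible tool response.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical type: a simulator maps (tool spec, history) to a simulated
# environment response. In practice this would wrap an LLM call.
Simulator = Callable[[str, List[Dict[str, str]]], str]


@dataclass
class SimulatedEnv:
    """Environment whose feedback comes from a simulator, not a real API."""
    tool_spec: str
    simulator: Simulator
    history: List[Dict[str, str]] = field(default_factory=list)

    def step(self, action: str) -> str:
        # Record the agent's action, then ask the simulator for feedback.
        self.history.append({"role": "agent", "content": action})
        observation = self.simulator(self.tool_spec, self.history)
        self.history.append({"role": "environment", "content": observation})
        return observation


def stub_simulator(tool_spec: str, history: List[Dict[str, str]]) -> str:
    # Stand-in for a reasoning model (assumption): echoes the last action.
    return f"simulated response to: {history[-1]['content']}"


def rollout(env: SimulatedEnv, policy: Callable[[str], str],
            max_turns: int = 3) -> List[Dict[str, str]]:
    # Alternate between the agent policy and the simulated environment.
    observation = "task start"
    for _ in range(max_turns):
        action = policy(observation)
        observation = env.step(action)
    return env.history


env = SimulatedEnv("get_order(order_id: str)", stub_simulator)
trajectory = rollout(env, lambda obs: "get_order('42')", max_turns=2)
```

A trajectory collected this way could serve as synthetic training data or as the rollout inside an RL loop; how rewards are assigned is left out of the sketch because the text above does not specify it.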

Taxonomy

Topics: Multi-Agent Systems and Negotiation · Intelligent Tutoring Systems and Adaptive Learning · AI-based Problem Solving and Planning