Democratizing Tool Learning with Environments Fully Simulated by a Free 8B Language Model

Chenming Tang; Hsiu-Yuan Huang; Weijie Liu; Junqiang Zheng; Saiyong Yang; Yunfang Wu

arXiv:2604.17739·cs.LG·May 8, 2026

TL;DR

This paper introduces TRUSTEE, a cost-effective approach for training tool calling agents using fully simulated environments generated by small, open-source language models, promoting accessible reinforcement learning.

Contribution

The paper presents TRUSTEE, a method that uses free open-source LMs as small as 8B to simulate the entire training environment, lowering the resource barrier to tool learning.

Findings

01 TRUSTEE outperforms baselines requiring external resources in most cases.

02 Simulated environments with 8B LMs can effectively train tool calling agents.

03 The approach democratizes tool learning by reducing resource barriers.

Abstract

Reinforcement learning (RL) has become a prevalent paradigm for training tool calling agents, and it typically requires online interactive environments. Existing approaches either rely on training data with ground truth annotations or require advanced proprietary language models (LMs) to synthesize environments that remain fixed once created. In this work, we propose TRUSTEE, a cost-friendly method for training tool calling agents with dynamic environments fully simulated by free open-source LMs that can be as small as 8B. The simulation covers task generation, user simulation, tool simulation and trajectory evaluation, and is paired with an adaptive curriculum learning mechanism that controls task difficulty during training. Our empirical results show that TRUSTEE outperforms baselines that require extra external resources in most cases. These confirm that, with a sufficiently sophisticated design, even…
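The abstract mentions an adaptive curriculum that controls task difficulty during training but gives no details of the mechanism. Purely as an illustration of the general idea, and not as TRUSTEE's actual algorithm, the sketch below raises or lowers a difficulty level based on the agent's recent success rate. The class name `AdaptiveCurriculum`, the integer difficulty levels, the window size, and the 0.3/0.7 thresholds are all assumptions made for this example.

```python
from collections import deque


class AdaptiveCurriculum:
    """Toy difficulty controller (hypothetical; not taken from the paper).

    Tracks the most recent episode outcomes and moves the difficulty
    level up when the agent succeeds often, down when it fails often.
    """

    def __init__(self, min_level=1, max_level=5, window=4, low=0.3, high=0.7):
        self.level = min_level
        self.min_level = min_level
        self.max_level = max_level
        self.low = low      # assumed threshold: below this, tasks are too hard
        self.high = high    # assumed threshold: above this, tasks are too easy
        self.outcomes = deque(maxlen=window)

    def record(self, success: bool) -> int:
        """Record one episode outcome and return the level for the next task."""
        self.outcomes.append(1.0 if success else 0.0)
        # Only adjust once a full window of outcomes is available.
        if len(self.outcomes) == self.outcomes.maxlen:
            rate = sum(self.outcomes) / len(self.outcomes)
            if rate > self.high and self.level < self.max_level:
                self.level += 1
                self.outcomes.clear()  # restart statistics at the new level
            elif rate < self.low and self.level > self.min_level:
                self.level -= 1
                self.outcomes.clear()
        return self.level


if __name__ == "__main__":
    curriculum = AdaptiveCurriculum(window=4)
    for outcome in [True, True, True, True, False, False, False, False]:
        print(curriculum.record(outcome))
```

In a simulated-environment setup like the one the abstract describes, the returned level would presumably be passed to the task generator so that newly generated tasks match the agent's current ability; how TRUSTEE actually does this is not stated in the text above.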
