Democratizing Tool Learning with Environments Fully Simulated by a Free 8B Language Model
Chenming Tang, Hsiu-Yuan Huang, Weijie Liu, Junqiang Zheng, Saiyong Yang, Yunfang Wu

TL;DR
This paper introduces TRUSTEE, a cost-effective approach for training tool calling agents using fully simulated environments generated by small, open-source language models, promoting accessible reinforcement learning.
Contribution
The paper presents TRUSTEE, a novel method leveraging 8B open-source LMs for environment simulation, enabling democratized and resource-efficient tool learning.
Findings
TRUSTEE outperforms baselines requiring external resources in most cases.
Simulated environments with 8B LMs can effectively train tool calling agents.
The approach democratizes tool learning by reducing resource barriers.
Abstract
Reinforcement learning (RL) has become a prevalent paradigm for training tool calling agents, which typically requires online interactive environments. Existing approaches either rely on training data with ground truth annotations or require advanced proprietary language models (LMs) to synthesize environments that remain fixed once created. In this work, we propose TRUSTEE, a cost-friendly method for training tool calling agents with dynamic environments fully simulated by free open-source LMs that can be as small as 8B, including task generation, user simulation, tool simulation and trajectory evaluation, paired with an adaptive curriculum learning mechanism that controls task difficulty during training. Our empirical results show that TRUSTEE outperforms baselines that require extra external resources in most cases. These results confirm that, with a sufficiently sophisticated design, even…
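The abstract names four roles played by the simulator LM (task generation, user simulation, tool simulation, trajectory evaluation) plus an adaptive curriculum, but does not say how they are wired together. The sketch below is one possible reading, not the authors' implementation: the prompt tags, the `AdaptiveCurriculum` class, its thresholds, the five difficulty levels, and the stub simulator and agent are all hypothetical, and the LM calls are replaced by plain Python functions so the example runs without a model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class AdaptiveCurriculum:
    """Hypothetical difficulty controller (thresholds and levels are assumptions)."""
    level: int = 1
    min_level: int = 1
    max_level: int = 5
    low: float = 0.3   # below this success rate, make tasks easier
    high: float = 0.7  # above this success rate, make tasks harder

    def update(self, success_rate: float) -> int:
        """Move the difficulty level one step based on the recent success rate."""
        if success_rate > self.high:
            self.level = min(self.level + 1, self.max_level)
        elif success_rate < self.low:
            self.level = max(self.level - 1, self.min_level)
        return self.level


def run_episode(simulator: Callable[[str], str],
                agent: Callable[[str], str],
                level: int) -> float:
    """One rollout in which a single simulator plays every environment role."""
    task = simulator(f"[task_generation] difficulty={level}")
    user_msg = simulator(f"[user_simulation] task={task}")
    tool_call = agent(user_msg)
    tool_result = simulator(f"[tool_simulation] call={tool_call}")
    final_answer = agent(tool_result)
    verdict = simulator(f"[trajectory_evaluation] task={task} answer={final_answer}")
    # Binary reward: 1.0 if the simulated evaluator says the trajectory passes.
    return 1.0 if verdict.strip().lower().startswith("pass") else 0.0


def stub_simulator(prompt: str) -> str:
    """Stand-in for the 8B simulator LM; always passes the trajectory."""
    if prompt.startswith("[trajectory_evaluation]"):
        return "pass"
    return f"simulated<{prompt[:30]}>"


def stub_agent(observation: str) -> str:
    """Stand-in for the tool calling agent being trained."""
    return f"action<{observation[:30]}>"


curriculum = AdaptiveCurriculum()
rewards = [run_episode(stub_simulator, stub_agent, curriculum.level) for _ in range(4)]
new_level = curriculum.update(sum(rewards) / len(rewards))
```

In a real training loop the rewards would feed an RL update of the agent, and `stub_simulator` would be replaced by calls to the open-source LM; both are left out here because the abstract gives no details on either.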
