TL;DR
TRACE is an environment-specific, capability-targeted training system for LLM agents that automatically identifies the capabilities an agent lacks and trains on them, improving performance by +14.1 points on $\tau^2$-bench and yielding +7 perfect scores on ToolSandbox.
Contribution
The paper introduces TRACE, a novel end-to-end system that automatically identifies capability gaps and trains agents on synthetic environments to improve performance.
Findings
TRACE improves performance by +14.1 points on $\tau^2$-bench.
TRACE achieves +7 perfect scores on ToolSandbox.
TRACE scales more efficiently than baseline methods.
Abstract
Large Language Models (LLMs) deployed in agentic environments must exercise multiple capabilities across different task instances, where a capability is performing one or more actions in a trajectory that are necessary for successfully solving a subset of tasks in the environment. Many existing approaches either rely on synthetic training data that is not targeted to the model's actual capability deficits in the target environment or train directly on the target environment, where the model needs to implicitly learn the capabilities across tasks. We introduce TRACE (Turning Recurrent Agent failures into Capability-targeted training Environments), an end-to-end system for environment-specific agent self-improvement. TRACE contrasts successful and failed trajectories to automatically identify lacking capabilities, synthesizes a targeted training environment for each that rewards whether…
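The contrast step described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the `Trajectory` record, the capability labels, and the frequency-difference scoring are all assumptions made for illustration, and the abstract does not say how TRACE labels or scores capabilities. The sketch assumes each trajectory has already been annotated with the capabilities the agent exercised correctly, and it ranks a capability higher when it shows up in successful trajectories but is missing from failed ones.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Trajectory:
    """Hypothetical record; the capability labels are assumed to come from an upstream annotator."""
    capabilities: frozenset[str]  # capabilities exercised correctly in this run
    success: bool                 # whether the task was solved


def capability_gaps(trajectories: list[Trajectory], min_gap: float = 0.0) -> list[tuple[str, float]]:
    """Rank capabilities by (rate in successes) - (rate in failures).

    Illustrative heuristic only, not the method from the paper.
    """
    successes = [t for t in trajectories if t.success]
    failures = [t for t in trajectories if not t.success]
    if not successes or not failures:
        return []  # nothing to contrast against

    succ_counts = Counter(c for t in successes for c in t.capabilities)
    fail_counts = Counter(c for t in failures for c in t.capabilities)

    gaps = {}
    for cap in set(succ_counts) | set(fail_counts):
        gap = succ_counts[cap] / len(successes) - fail_counts[cap] / len(failures)
        if gap > min_gap:
            gaps[cap] = gap

    # Largest gap first; ties broken alphabetically for stable output.
    return sorted(gaps.items(), key=lambda kv: (-kv[1], kv[0]))


demo = [
    Trajectory(frozenset({"lookup_user", "confirm_before_write"}), True),
    Trajectory(frozenset({"lookup_user", "confirm_before_write"}), True),
    Trajectory(frozenset({"lookup_user"}), False),
    Trajectory(frozenset({"lookup_user"}), False),
]
print(capability_gaps(demo))
```

In this toy data `lookup_user` appears equally often in successes and failures, so only `confirm_before_write` is flagged as a candidate capability to target with a training environment.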
