ToolGrad: Efficient Tool-use Dataset Generation with Textual "Gradients"

Zhongyi Zhou; Kohei Uehara; Haoyu Zhang; Jingtao Zhou; Lin Gu; Ruofei Du; Zheng Xu; Tatsuya Harada

arXiv:2508.04086·cs.CL·May 4, 2026

TL;DR

ToolGrad is an agentic framework that generates complex tool-use datasets efficiently by first constructing valid tool-use chains and then synthesizing the matching user queries, which yields an almost 100% pass rate and better-performing trained models.

Contribution

The paper introduces an answer-first process for dataset generation, iteratively guided by textual "gradients", that produces more complex tool use at lower cost than query-first methods.

Findings

01

The ToolGrad-500 dataset is generated with more complex tool use, lower cost, and an almost 100% pass rate.

02

Models trained on ToolGrad outperform both models trained on expensive baseline datasets and proprietary LLMs.

03

Source code, dataset, and models are publicly available.

Abstract

Prior work synthesizes tool-use LLM datasets by first generating a user query, followed by complex tool-use annotations like depth-first search (DFS). This leads to inevitable annotation failures and low efficiency in data generation. We introduce ToolGrad, an agentic framework that inverts this paradigm. ToolGrad first constructs valid tool-use chains through an iterative process guided by textual "gradients", and then synthesizes corresponding user queries. This "answer-first" approach led to ToolGrad-500, a dataset generated with more complex tool use, lower cost, and almost 100% pass rate. Experiments show that ToolGrad models outperform those trained on expensive baseline datasets and proprietary LLMs. The ToolGrad source code, dataset, and models are available at https://github.com/zhongyi-zhou/toolgrad.
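The answer-first idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the tools, the `critique` function (a stand-in for an LLM that emits textual feedback), and the template-based `synthesize_query` are all hypothetical names introduced here for illustration. The sketch only shows the ordering the abstract describes: grow a chain of tool calls that actually execute, steered by textual feedback, and write the user query last.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ToolCall:
    name: str
    argument: str
    result: str


@dataclass
class Chain:
    calls: list[ToolCall] = field(default_factory=list)


# Hypothetical toy tools; a real setup would wrap actual APIs.
TOOLS: dict[str, Callable[[str], str]] = {
    "geocode": lambda city: f"coords({city})",
    "weather": lambda coords: f"forecast@{coords}",
    "summarize": lambda text: f"summary[{text}]",
}


def critique(chain: Chain) -> str:
    """Stand-in for an LLM critic: return textual feedback on the chain."""
    used = {call.name for call in chain.calls}
    for name in TOOLS:
        if name not in used:
            return f"extend the chain with '{name}'"
    return "stop"


def apply_feedback(chain: Chain, feedback: str, seed: str) -> Chain:
    """Apply the feedback; keep the new step only if the tool call succeeds."""
    name = feedback.split("'")[1]
    argument = chain.calls[-1].result if chain.calls else seed
    try:
        result = TOOLS[name](argument)
    except Exception:
        return chain  # reject a failing step so the chain stays valid
    return Chain(chain.calls + [ToolCall(name, argument, result)])


def build_chain(seed: str, max_steps: int = 5) -> Chain:
    """Iteratively grow a chain of executed tool calls."""
    chain = Chain()
    for _ in range(max_steps):
        feedback = critique(chain)
        if feedback == "stop":
            break
        chain = apply_feedback(chain, feedback, seed)
    return chain


def synthesize_query(chain: Chain, seed: str) -> str:
    """Write the user query after the answer (the chain) already exists."""
    steps = ", then ".join(call.name for call in chain.calls)
    return f"For {seed}: {steps}."


chain = build_chain("Tokyo")
query = synthesize_query(chain, "Tokyo")
print(query)
```

Because every step is executed before it is kept, each generated sample comes with a tool chain that is known to run, which is the property the abstract contrasts with query-first annotation.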

Code & Models

Repositories

zhongyi-zhou/toolgrad (GitHub): https://github.com/zhongyi-zhou/toolgrad
