Tool-Star: Empowering LLM-Brained Multi-Tool Reasoner via Reinforcement Learning

Guanting Dong; Yifei Chen; Xiaoxi Li; Jiajie Jin; Hongjin Qian; Yutao Zhu; Hangyu Mao; Guorui Zhou; Zhicheng Dou; Ji-Rong Wen

arXiv:2505.16410 · cs.CL · May 23, 2025 · 3 cites

TL;DR

Tool-Star is an RL-based framework that enables large language models to autonomously invoke multiple external tools during reasoning, improving multi-tool collaboration through systematic data synthesis and a two-stage training process.

Contribution

The paper introduces an RL framework, together with data synthesis and training strategies, to enhance multi-tool reasoning in LLMs. It targets two problems: the scarcity of tool-use data and the difficulty of getting multiple tools to work together.

Findings

01. Significant performance improvements on 10 reasoning benchmarks.
02. Effective multi-tool collaboration demonstrated in experiments.
03. Scalable data synthesis pipeline for tool-use trajectories.

Abstract

Recently, large language models (LLMs) have shown remarkable reasoning capabilities via large-scale reinforcement learning (RL). However, leveraging the RL algorithm to empower effective multi-tool collaborative reasoning in LLMs remains an open challenge. In this paper, we introduce Tool-Star, an RL-based framework designed to empower LLMs to autonomously invoke multiple external tools during stepwise reasoning. Tool-Star integrates six types of tools and incorporates systematic designs in both data synthesis and training. To address the scarcity of tool-use data, we propose a general tool-integrated reasoning data synthesis pipeline, which combines tool-integrated prompting with hint-based sampling to automatically and scalably generate tool-use trajectories. A subsequent quality normalization and difficulty-aware classification process filters out low-quality samples and organizes…

Code & Models

Repositories

dongguanting/tool-star
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Reinforcement Learning in Robotics