GTool: Graph Enhanced Tool Planning with Large Language Model

Wenjie Chen; Wenbin Li; Di Yao; Xuying Meng; Chang Gong; Jingping Bi

arXiv:2508.12725·cs.AI·August 19, 2025

Open Access · 3 Reviews

TL;DR

GTool enhances large language models' tool planning by constructing dependency-aware tool graphs and predicting missing dependencies, significantly improving performance in tool selection tasks.

Contribution

It introduces GTool, a novel method that leverages tool dependency graphs and missing dependency prediction to improve LLM-based tool planning under incomplete dependency information.

Findings

1. Achieves over 29.6% performance improvement over SOTA baselines.

2. Effectively constructs request-specific tool graphs for better tool selection.

3. Seamlessly integrates with various LLM backbones without retraining.

Abstract

Tool planning with large language models (LLMs), referring to selecting, organizing, and preparing the tools necessary to complete a user request, bridges the gap between natural language understanding and task execution. However, current works treat different tools as isolated components and fail to leverage the inherent dependencies of tools, leading to invalid planning results. Since tool dependencies are often incomplete, it becomes challenging for LLMs to accurately identify the appropriate tools required by a user request, especially when confronted with a large toolset. To solve this challenge, we propose GTool, which is the first work aiming to enhance the tool planning ability of LLMs under incomplete dependencies. GTool constructs a request-specific tool graph to select tools efficiently and generate the <graph token> which provides sufficient…

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 4 · Confidence 4

Strengths

1. The paper addresses a practical and important problem in tool planning: how to model and utilize dependencies among tools. The issue of accurate dependency localization is indeed a real challenge in real-world tool-using scenarios.

2. The experiments are extensive and cover multiple evaluation settings, demonstrating the authors' efforts to comprehensively assess the proposed approach.

Weaknesses

1. The construction of the tool graph appears problematic. The authors assume that the use of $v_{i(j+1)}$ depends on $v_{ij}$, which cannot be guaranteed. The order of tools in a trajectory does not necessarily imply dependency between them. This assumption weakens the validity of the proposed graph-based representation.

2. The method section is difficult to follow due to inconsistent and overloaded notations. In addition, the mixed use of symbols such as $\tau_i$, $t_{i1}$, $t_i \in \tau$, and $t_n \in T$ leads to unneces
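To make the concern in weakness 1 concrete, here is a minimal sketch of trajectory-based graph construction under the assumption the reviewer describes: every consecutive pair of tools in a trajectory becomes a directed edge. The function name, the tool names, and the edge-count representation are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
from collections import defaultdict


def build_tool_graph(trajectories):
    """Return {src: {dst: count}} with one edge per consecutive tool pair.

    Assumption (the one the reviewer questions): if tool B directly
    follows tool A in a trajectory, then B depends on A.
    """
    edges = defaultdict(lambda: defaultdict(int))
    for trajectory in trajectories:
        # Pair each tool with its immediate successor.
        for src, dst in zip(trajectory, trajectory[1:]):
            if src != dst:  # ignore repeated calls to the same tool
                edges[src][dst] += 1
    return {src: dict(dsts) for src, dsts in edges.items()}


# Hypothetical trajectories; tool names are made up for this example.
trajectories = [
    ["search_flights", "book_flight", "send_email"],
    ["get_weather", "search_flights", "book_flight"],
    ["send_email", "get_weather"],
]
graph = build_tool_graph(trajectories)
```

In this toy data, `search_flights -> book_flight` is supported by two trajectories and plausibly reflects a real dependency, while `send_email -> get_weather` arises only because the two calls happened to be adjacent. Ordering alone cannot tell these cases apart, which is the reviewer's point.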

Reviewer 02 · Rating 6 · Confidence 4

Strengths

1. The paper introduces an efficient method for the real-world problem of incomplete tool dependencies. It effectively uses a Graph Neural Network to learn the underlying tool structure and then injects this complex information into any frozen LLM using just a single, compact token.

2. The approach is validated by extensive experiments, showing it consistently outperforms existing methods. Critically, it does not require fine-tuning the LLM, which makes it highly practical, efficient, and easy t
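The idea in strength 1 of compressing a tool graph into one token can be illustrated with a toy sketch. It assumes one round of unweighted mean-aggregation message passing followed by a mean-pool readout, with the resulting vector prepended to the request's token embeddings. All names, vectors, and the two-dimensional embedding size are hypothetical; a real GNN encoder would use learned weights and a projection into the LLM's embedding space, and the paper's actual injection mechanism may differ.

```python
def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def message_pass(features, in_neighbors):
    """One round: each node becomes the mean of itself and its in-neighbors."""
    return {
        node: mean([vec] + [features[n] for n in in_neighbors.get(node, [])])
        for node, vec in features.items()
    }


def graph_token(features, in_neighbors):
    """Mean-pool the updated node vectors into a single graph-level vector."""
    updated = message_pass(features, in_neighbors)
    return mean(list(updated.values()))


# Hypothetical tool features and dependency edges (search -> book -> email).
features = {"search": [1.0, 0.0], "book": [0.0, 1.0], "email": [1.0, 1.0]}
in_neighbors = {"book": ["search"], "email": ["book"]}

token = graph_token(features, in_neighbors)

# Prepend the graph vector to the (made-up) request token embeddings.
request_embeddings = [[0.2, 0.4], [0.6, 0.8]]
llm_input = [token] + request_embeddings
```

Because only the input sequence changes, the LLM's own weights can stay frozen in a setup like this, which matches the reviewer's description of the approach as not requiring fine-tuning.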

Weaknesses

1. The method's performance depends on having good historical data to build the initial tool graph. The paper doesn't explore how well it works in 'cold-start' scenarios where such data is scarce or unavailable.

2. The training process, which involves predicting missing links in the graph, could become a bottleneck for extremely large-scale toolsets. The paper does not fully address how the method scales when the number of tools becomes massive.

3. The use of a single 'graph token' to represent
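The scaling concern in weakness 2 is easier to see with a generic link-prediction setup in hand. The sketch below assumes a standard recipe: hide a fraction of known edges as positive targets and sample an equal number of non-edges as negatives. The function name, the `hide_fraction` parameter, and the toy graph are hypothetical, and the paper's actual training procedure may differ.

```python
import random


def split_edges_for_link_prediction(edges, nodes, hide_fraction=0.2, seed=0):
    """Split edges into observed/hidden sets and sample negative non-edges."""
    rng = random.Random(seed)
    unique_edges = sorted(set(edges))
    rng.shuffle(unique_edges)

    # Hidden edges play the role of "missing dependencies" to be predicted.
    n_hidden = max(1, int(len(unique_edges) * hide_fraction))
    hidden, observed = unique_edges[:n_hidden], unique_edges[n_hidden:]

    # Enumerating all non-edges is quadratic in the number of tools.
    existing = set(unique_edges)
    candidates = [
        (u, v) for u in nodes for v in nodes
        if u != v and (u, v) not in existing
    ]
    negatives = rng.sample(candidates, min(len(hidden), len(candidates)))
    return observed, hidden, negatives


# Hypothetical toy graph with four tools and five dependency edges.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c"), ("b", "d")]
observed, hidden, negatives = split_edges_for_link_prediction(
    edges, nodes, hide_fraction=0.4
)
```

The candidate list grows with the square of the toolset size, so a naive version like this one becomes expensive for very large toolsets. Practical systems usually sample negatives without enumerating every pair.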

Reviewer 03 · Rating 6 · Confidence 4

Strengths

1. The paper proposes a genuinely novel and well-motivated idea: representing tool dependencies explicitly as a graph and integrating it with LLMs through a GNN encoder. This design feels natural yet surprisingly underexplored in prior work, and the authors manage to make it both elegant and technically grounded. I particularly appreciate how the paper bridges symbolic structure (graph reasoning) and LLM-based semantic planning in a coherent way.

2. One of the most impressive aspects is the rob

Weaknesses

1. The paper provides basic ablations (e.g., removing <graph token> or MDPL), but it would be interesting to see more analysis of what the GNN actually learns, for example, visualizing attention weights or node embeddings to illustrate that it truly captures tool relationships rather than acting as a generic feature compressor.

2. While the writing is overall clear, the training description could be elaborated, e.g., how the <graph token> is injected into the LLM embedding space in practice,

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies