GTool: Graph Enhanced Tool Planning with Large Language Model
Wenjie Chen, Wenbin Li, Di Yao, Xuying Meng, Chang Gong, Jingping Bi

TL;DR
GTool enhances large language models' tool planning by constructing dependency-aware tool graphs and predicting missing dependencies, significantly improving performance in tool selection tasks.
Contribution
It introduces GTool, a novel method that leverages tool dependency graphs and missing dependency prediction to improve LLM-based tool planning under incomplete dependency information.
Findings
Achieves over 29.6% performance improvement over SOTA baselines.
Effectively constructs request-specific tool graphs for better tool selection.
Seamlessly integrates with various LLM backbones without retraining.
Abstract
Tool planning with large language models (LLMs), referring to selecting, organizing, and preparing the tools necessary to complete a user request, bridges the gap between natural language understanding and task execution. However, current works treat different tools as isolated components and fail to leverage the inherent dependencies of tools, leading to invalid planning results. Since tool dependencies are often incomplete, it becomes challenging for LLMs to accurately identify the appropriate tools required by a user request, especially when confronted with a large toolset. To solve this challenge, we propose GTool, which is the first work aiming to enhance the tool planning ability of LLMs under incomplete dependencies. GTool constructs a request-specific tool graph to select tools efficiently and generate the <graph token> which provides sufficient…
Peer Reviews
Decision · ICLR 2026 Poster
1. The paper addresses a practical and important problem in tool planning: how to model and utilize dependencies among tools. The issue of accurate dependency localization is indeed a real challenge in real-world tool-using scenarios. 2. The experiments are extensive and cover multiple evaluation settings, demonstrating the authors' efforts to comprehensively assess the proposed approach.
1. The construction of the tool graph appears problematic. The authors assume that the use of $v_{i(j+1)}$ depends on $v_{ij}$, which cannot be guaranteed. The order of tools in a trajectory does not necessarily imply dependency between them. This assumption weakens the validity of the proposed graph-based representation. 2. The method section is difficult to follow due to inconsistent and overloaded notations. In addition, the mixed use of several similarly subscripted symbols (including multiple variants of $t$) leads to unneces
1. The paper introduces an efficient method for the real-world problem of incomplete tool dependencies. It effectively uses a Graph Neural Network to learn the underlying tool structure and then injects this complex information into any frozen LLM using just a single, compact token. 2. The approach is validated by extensive experiments, showing it consistently outperforms existing methods. Critically, it does not require fine-tuning the LLM, which makes it highly practical, efficient, and easy t
1. The method's performance depends on having good historical data to build the initial tool graph. The paper doesn't explore how well it works in 'cold-start' scenarios where such data is scarce or unavailable. 2. The training process, which involves predicting missing links in the graph, could become a bottleneck for extremely large-scale toolsets. The paper does not fully address how the method scales when the number of tools becomes massive. 3. The use of a single 'graph token' to represent
1. The paper proposes a genuinely novel and well-motivated idea: representing tool dependencies explicitly as a graph and integrating it with LLMs through a GNN encoder. This design feels natural yet surprisingly underexplored in prior work, and the authors manage to make it both elegant and technically grounded. I particularly appreciate how the paper bridges symbolic structure (graph reasoning) and LLM-based semantic planning in a coherent way. 2. One of the most impressive aspects is the rob
1. The paper provides basic ablations (e.g., removing <graph token> or MDPL), but it would be interesting to see more analysis of what the GNN actually learns, for example, visualizing attention weights or node embeddings to illustrate that it truly captures tool relationships rather than acting as a generic feature compressor. 2. While the writing is overall clear, the training description could be elaborated, e.g., how the <graph token> is injected into the LLM embedding space in practice,
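The graph-construction assumption questioned in the reviews above (each tool in a trajectory depends on the tool immediately before it) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `build_tool_graph`, the edge-count representation, and the example tool names are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_tool_graph(trajectories):
    """Count directed edges between consecutive tools in each trajectory.

    Assumption (hypothetical, for illustration only): an edge (a, b) is
    added whenever tool b is called directly after tool a. As one reviewer
    notes, call order does not guarantee a true dependency.
    """
    edges = defaultdict(int)
    for trajectory in trajectories:
        # Pair each tool with its immediate successor.
        for src, dst in zip(trajectory, trajectory[1:]):
            if src != dst:  # skip self-loops from repeated calls
                edges[(src, dst)] += 1
    return dict(edges)


# Hypothetical historical trajectories (ordered lists of tool names).
trajectories = [
    ["search_flights", "book_flight", "send_email"],
    ["search_flights", "book_flight"],
    ["get_weather", "send_email"],
]

graph = build_tool_graph(trajectories)
for (src, dst), count in sorted(graph.items()):
    print(f"{src} -> {dst} (seen {count}x)")
```

Keeping a count per edge, instead of a plain edge set, makes it possible to treat rarely observed orderings as weak evidence, which is one simple way to soften the order-implies-dependency assumption.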
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Semantic Web and Ontologies
