Advancing Tool-Augmented Large Language Models: Integrating Insights from Errors in Inference Trees
Sijia Chen, Yibo Wang, Yi-Feng Wu, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang, Lijun Zhang

TL;DR
This paper introduces TP-LLaMA, a novel framework that leverages both successful and failed inference trajectories in tool-augmented LLMs to improve reasoning performance and generalization across complex tasks.
Contribution
It proposes a preference learning-based inference trajectory optimization method that utilizes failed paths, enhancing the learning and reasoning capabilities of tool-augmented LLMs.
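This summary does not say which preference-learning objective is used. As an illustration only, one widely used objective for learning from (preferred, dispreferred) pairs is Direct Preference Optimization (DPO). The symbols below are assumptions introduced for this sketch: $x$ is the reasoning context so far, $y_w$ is a step taken from a successful path, $y_l$ is a step taken from a failed path, $\pi_\theta$ is the model being trained, $\pi_{\mathrm{ref}}$ is a frozen reference model (for example, the SFT model), and $\beta$ is a temperature hyperparameter.

```latex
\mathcal{L}_{\mathrm{DPO}}(\theta)
  = -\,\mathbb{E}_{(x,\,y_w,\,y_l)}
    \left[
      \log \sigma\!\left(
        \beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)}
        - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}
      \right)
    \right]
```

Minimizing this loss raises the likelihood of the preferred step relative to the dispreferred one, measured against the reference model, which is one way failed paths could contribute a training signal.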
Findings
TP-LLaMA outperforms baselines in most test scenarios.
It demonstrates improved generalization to unseen APIs.
It achieves higher reasoning efficiency.
Abstract
Tool-augmented large language models (LLMs) leverage tools, often in the form of APIs, to improve their reasoning capabilities on complex tasks. This enables them to act as intelligent agents interacting with the real world. The recently introduced ToolLLaMA model by Qin et al. [2023] utilizes the depth-first search-based decision tree (DFSDT) mechanism for multi-step reasoning with real-world APIs, effectively enhancing the performance of tool-augmented LLMs compared to traditional chain reasoning mechanisms. However, their approach only employs successful paths from decision trees (also called inference trees) for supervised fine-tuning (SFT), missing out on the potential learning opportunities from failed paths. Inspired by this, we propose an inference trajectory optimization framework based on preference learning to address this limitation. We first introduce a novel…
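The abstract is cut off before it describes how failed paths are turned into training data, so the following is only a minimal sketch of one plausible construction, not the authors' method. It assumes a hypothetical `Node` class for an inference tree and pairs each step that leads to a successful leaf with every sibling step that does not. All names (`Node`, `succeeds`, `preference_pairs`) and the toy actions are illustrative.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """One step in a hypothetical inference tree (e.g. a tool call)."""
    action: str
    solved: bool = False  # True only on a leaf that produced a valid final answer
    children: List["Node"] = field(default_factory=list)


def succeeds(node: Node) -> bool:
    """A node is 'successful' if it, or any descendant, solved the task."""
    return node.solved or any(succeeds(c) for c in node.children)


def preference_pairs(root: Node) -> List[Tuple[Tuple[str, ...], str, str]]:
    """Collect (history, preferred_step, dispreferred_step) triples.

    At every branching point along a successful path, each successful
    child is paired with each failed sibling under the same history.
    """
    pairs: List[Tuple[Tuple[str, ...], str, str]] = []

    def walk(node: Node, history: List[str]) -> None:
        good = [c for c in node.children if succeeds(c)]
        bad = [c for c in node.children if not succeeds(c)]
        for g in good:
            for b in bad:
                pairs.append((tuple(history), g.action, b.action))
            # Only descend along branches that eventually succeed.
            walk(g, history + [g.action])

    walk(root, [root.action])
    return pairs


# Toy tree: one failed first step, then a successful branch with one failed sibling.
tree = Node("query", children=[
    Node("call_api(bad_args)"),
    Node("call_api(city)", children=[
        Node("give_up"),
        Node("finish(answer)", solved=True),
    ]),
])

pairs = preference_pairs(tree)
for history, preferred, dispreferred in pairs:
    print(history, "|", preferred, ">", dispreferred)
```

Each triple can then serve as one (context, chosen, rejected) example for a preference-learning objective, whereas SFT on successful paths alone would discard the failed branches entirely.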
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
