Re-Initialization Token Learning for Tool-Augmented Large Language Models
Chenghao Li, Liu Liu, Baosheng Yu, Jiayan Qiu, Yibing Zhan

TL;DR
This paper introduces a novel re-initialization token learning method that aligns tool tokens with word embeddings, significantly improving tool integration and problem-solving in large language models across various tasks.
Contribution
The proposed method enhances tool token learning by aligning tool embeddings with word space, leading to better tool call accuracy and improved performance on complex reasoning tasks.
Findings
Improved tool call accuracy across tasks
Enhanced performance on numerical reasoning and QA
Outperforms recent baselines like CoT and ReAct
Abstract
Large language models have demonstrated exceptional performance, yet struggle with complex tasks such as numerical reasoning and plan generation. Integrating external tools, such as calculators and databases, into large language models (LLMs) is crucial for enhancing problem-solving capabilities. Current methods assign a unique token to each tool, enabling LLMs to call tools through token prediction, similar to word generation. However, this approach fails to account for the relationship between tool and word tokens, limiting adaptability within pre-trained LLMs. To address this issue, we propose a novel token learning method that aligns tool tokens with the existing word embedding space from the perspective of initialization, thereby enhancing model performance. We begin by constructing prior token embeddings for each tool based on the tool's name or description, which are used to…
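The initialization idea in the abstract, building a prior embedding for each tool from its name or description, could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `init_tool_embeddings`, the toy vocabulary, whitespace tokenization, and mean pooling are all assumptions made for the example, since the abstract does not specify how the prior is computed.

```python
import numpy as np

def init_tool_embeddings(word_emb, vocab, tool_texts):
    """Build one prior embedding per tool from its name/description.

    Assumption: the prior is the mean of the word embeddings of the
    tokens in the tool text. The source does not state the pooling rule.
    """
    rows = []
    for text in tool_texts:
        # Hypothetical whitespace tokenizer; a real LLM would use its own tokenizer.
        ids = [vocab[tok] for tok in text.lower().split() if tok in vocab]
        if ids:
            rows.append(word_emb[ids].mean(axis=0))
        else:
            # Fallback for texts with no known tokens: mean of all word embeddings.
            rows.append(word_emb.mean(axis=0))
    return np.stack(rows)

# Toy word embedding table standing in for a pre-trained LLM's embeddings.
rng = np.random.default_rng(0)
vocab = {"add": 0, "two": 1, "numbers": 2, "search": 3, "database": 4}
word_emb = rng.normal(size=(len(vocab), 8))

# Hypothetical tool descriptions.
tool_texts = ["add two numbers", "search database"]
tool_emb = init_tool_embeddings(word_emb, vocab, tool_texts)

# Tool tokens are appended to the vocabulary, so their rows sit in the
# same space as the word embeddings from the start of training.
extended_emb = np.concatenate([word_emb, tool_emb], axis=0)
```

Starting tool tokens from points inside the word embedding space, rather than from random vectors, is one way to read "aligning tool tokens with the existing word embedding space from the perspective of initialization".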
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
