TL-Training: A Task-Feature-Based Framework for Training Large Language Models in Tool Use

Junjie Ye; Yilong Wu; Sixian Li; Yuming Yang; Zhiheng Xi; Tao Gui; Qi Zhang; Xuanjing Huang; Peng Wang; Zhongchao Shi; Jianping Fan; Zhengyin Du

arXiv:2412.15495·cs.CL·August 27, 2025

TL;DR

This paper introduces TL-Training, a task-feature-based framework that improves large language models' tool use by dynamically adjusting token weights during training and adding an error-aware reward mechanism, achieving comparable or better tool-use performance with only 1,217 training points.

Contribution

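The paper presents TL-Training, a framework that improves LLM tool use in three ways: it mitigates the effects of suboptimal training data, prioritizes key tokens during SFT, and applies a reward mechanism tailored to the categories of tool-call errors. It outperforms existing methods while using minimal data.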

Findings

01. Achieves comparable or better tool-use performance with only 1,217 training points.
02. Enhances robustness in noisy environments.
03. Improves general task performance.

Abstract

Large language models (LLMs) achieve remarkable advancements by leveraging tools to interact with environments, a critical step toward generalized AI. However, the standard supervised fine-tuning (SFT) approach, which relies on large-scale datasets, often overlooks task-specific characteristics in tool use, leading to performance bottlenecks. To address this issue, we analyze three existing LLMs and uncover key insights: training data can inadvertently impede tool-use behavior, token importance is distributed unevenly, and errors in tool calls fall into a small set of categories. Building on these findings, we propose TL-Training, a task-feature-based framework that mitigates the effects of suboptimal training data, dynamically adjusts token weights to prioritize key tokens during SFT, and incorporates a robust reward mechanism tailored to error categories, optimized through…
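
The two training ideas named in the abstract, token weighting during SFT and a reward keyed to error categories, can be illustrated with a minimal sketch. This is not the authors' implementation: the weighting scheme, the error category names, and the penalty values below are all hypothetical and chosen only to show the general shape of such a setup.

```python
import math


def weighted_token_nll(token_logprobs, weights):
    """Weighted mean negative log-likelihood over one target sequence.

    token_logprobs: log-probability the model assigns to each target token.
    weights: one non-negative weight per token (hypothetical scheme; the
             paper's actual weighting rule is not reproduced here).
    """
    if len(token_logprobs) != len(weights):
        raise ValueError("need one weight per target token")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(w * lp for w, lp in zip(weights, token_logprobs))
    return -weighted_sum / total_weight


# Hypothetical error categories and penalties, for illustration only.
ERROR_PENALTIES = {
    "wrong_tool_name": -1.0,
    "missing_argument": -0.5,
    "bad_format": -0.5,
}


def tool_call_reward(errors):
    """Return +1.0 for an error-free tool call, else the summed penalties."""
    if not errors:
        return 1.0
    return sum(ERROR_PENALTIES[e] for e in errors)


# Three target tokens; the middle one is the one the model is least sure of.
logprobs = [math.log(0.9), math.log(0.5), math.log(0.8)]
uniform = weighted_token_nll(logprobs, [1.0, 1.0, 1.0])
focused = weighted_token_nll(logprobs, [1.0, 3.0, 1.0])
```

With uniform weights this reduces to the ordinary mean token loss. Raising the weight on the uncertain middle token makes that token dominate the loss, which is the effect a key-token weighting scheme aims for.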

Code & Models

Repositories

junjie-ye/tl-training (PyTorch, official)

Models

Junjie-Ye/TL-CodeLLaMA-2 (Hugging Face model; 23 downloads, 1 like)

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Sparse Evolutionary Training · Shrink and Fine-Tune