Trajectory Supervision for Continual Tool-Use Learning in LLMs

Vishnu Vardhan Reddy; Sagnik Chatterjee; Soumik Bhatta

arXiv:2605.09734·cs.SE·May 12, 2026

TL;DR

This single-seed pilot study tests whether keeping tool-use trajectories in the training prompt improves a language model's next-API-call accuracy. Trajectory context raises final exact full-call accuracy from 39.2% to 56.9%, at the cost of 25.1% more training tokens.

Contribution

The paper compares fine-tuning an LLM on prompts that keep the tool-use trajectory against prompts with the intermediate API request/response lines stripped, and reports higher API call accuracy when the trajectory is kept.

Findings

01

Trajectory context raises final exact full-call accuracy from 39.2% to 56.9%, a gain of 17.7 points.

02

Keeping trajectories in the prompt increases training tokens by 25.1%.

03

Keeping trajectories improves final API-name accuracy by 7.7 points.
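
The findings distinguish exact full-call accuracy from API-name accuracy. The sketch below illustrates how the two metrics can differ on the same predictions. It is not the authors' evaluation code: the bracketed call format `[ApiName(key='value')]`, the regular expression, and the helper names `api_name` and `score` are all assumptions made for illustration.

```python
import re

# Assumed call format: "[ApiName(key='value', ...)]" (hypothetical, for illustration).
CALL_RE = re.compile(r"\[?\s*(\w+)\s*\((.*)\)\s*\]?\s*$", re.DOTALL)


def api_name(call):
    """Return the API name of a call string, or None if it does not parse."""
    match = CALL_RE.match(call.strip())
    return match.group(1) if match else None


def score(predictions, references):
    """Compute exact full-call accuracy and API-name accuracy."""
    exact_hits = 0
    name_hits = 0
    for pred, gold in zip(predictions, references):
        # Exact match: the whole call must agree after collapsing whitespace.
        if " ".join(pred.split()) == " ".join(gold.split()):
            exact_hits += 1
        # Name match: only the API name must agree; arguments are ignored.
        pred_name = api_name(pred)
        if pred_name is not None and pred_name == api_name(gold):
            name_hits += 1
    total = len(references)
    return {"exact": exact_hits / total, "name": name_hits / total}


predictions = [
    "[GetWeather(city='Paris')]",   # fully correct
    "[GetWeather(city='Rome')]",    # right API, wrong argument
    "[BookHotel(city='Oslo')]",     # wrong API
]
references = [
    "[GetWeather(city='Paris')]",
    "[GetWeather(city='Milan')]",
    "[QueryStock(code='X')]",
]
result = score(predictions, references)
```

In this toy example the second prediction counts toward API-name accuracy but not toward exact full-call accuracy, so the name metric is never lower than the exact metric.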

Abstract

Most language-model training data shows final artifacts, not the process that produced them. We study a tractable version of this question in tool use: when a model learns a stream of new API domains, does keeping tool-use trajectories help compared with stripping the intermediate API trace? We fine-tune Llama 3.1 8B Instruct with QLoRA on API-Bank using four sequential domain blocks. Condition A strips previous API request/response lines from the prompt and trains the model to predict the next API call. Condition B keeps the trajectory context. In a single-seed pilot, full held-out generation evaluation shows that Condition B reaches 56.9% final exact full-call accuracy compared with 39.2% for Condition A. B also improves final API-name accuracy by 7.7 points. However, B uses 25.1% more training tokens, the run uses one seed, and the task is next-call prediction rather than full…
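
The difference between Condition A and Condition B can be sketched as a prompt-construction step. This is a minimal illustration under stated assumptions, not the paper's preprocessing code: the line prefixes `API-Request:` and `API-Response:`, the sample dialogue, and the function name `build_prompt` are hypothetical.

```python
# Assumed markers for trajectory lines in a dialogue (hypothetical prefixes).
API_PREFIXES = ("API-Request:", "API-Response:")


def build_prompt(history, keep_trajectory):
    """Build a next-call prediction prompt from prior dialogue lines.

    keep_trajectory=False mimics Condition A (earlier API lines stripped);
    keep_trajectory=True mimics Condition B (earlier API lines kept).
    """
    if keep_trajectory:
        kept = list(history)
    else:
        kept = [line for line in history if not line.startswith(API_PREFIXES)]
    # The prompt ends with an open request marker for the model to complete.
    return "\n".join(kept) + "\nAPI-Request:"


history = [
    "User: Book me a hotel in Oslo for Friday.",
    "API-Request: [SearchHotels(city='Oslo', date='Friday')]",
    "API-Response: {'hotels': ['Fjord Inn']}",
    "User: Great, reserve the first one.",
]

prompt_a = build_prompt(history, keep_trajectory=False)
prompt_b = build_prompt(history, keep_trajectory=True)
```

Because `prompt_b` retains the earlier request and response lines, it is longer than `prompt_a`, which is consistent in direction with the abstract's note that Condition B uses more training tokens.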
