STeCa: Step-level Trajectory Calibration for LLM Agent Learning

Hanlin Wang; Jian Wang; Chak Tou Leong; Wenjie Li

arXiv:2502.14276 · cs.LG · May 30, 2025

TL;DR

STeCa introduces a step-level trajectory calibration framework for LLM agents, improving long-horizon task performance by automatically constructing calibration trajectories through reflection and reinforcement learning.

Contribution

The paper presents a novel step-level calibration method for LLM agents, enabling automatic trajectory correction and improved robustness in complex tasks.

Findings

01. Outperforms existing methods in experiments

02. Enhances agent robustness in long-horizon tasks

03. Utilizes LLM-driven reflection for trajectory improvement

Abstract

Large language model (LLM)-based agents have shown promise in tackling complex tasks by interacting dynamically with the environment. Existing work primarily focuses on behavior cloning from expert demonstrations or preference learning through exploratory trajectory sampling. However, these methods often struggle to address long-horizon tasks, where suboptimal actions accumulate step by step, causing agents to deviate from correct task trajectories. To address this, we highlight the importance of timely calibration and the need to automatically construct calibration trajectories for training agents. We propose Step-Level Trajectory Calibration (STeCa), a novel framework for LLM agent learning. Specifically, STeCa identifies suboptimal actions through a step-level reward comparison during exploration. It constructs calibrated trajectories using LLM-driven reflection, enabling agents to…
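The step-level reward comparison mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `first_deviation_step`, the `margin` parameter, and the reward values are all hypothetical, and the sketch assumes per-step rewards are already available for both an explored trajectory and a reference (for example, expert) trajectory.

```python
from typing import Optional, Sequence


def first_deviation_step(
    explored_rewards: Sequence[float],
    reference_rewards: Sequence[float],
    margin: float = 0.0,
) -> Optional[int]:
    """Return the index of the first step whose explored reward falls below
    the reference reward by more than `margin`, or None if no step does.

    Hypothetical helper for illustration only; the paper's actual criterion
    may differ.
    """
    if len(explored_rewards) != len(reference_rewards):
        raise ValueError("reward sequences must have the same length")
    pairs = zip(explored_rewards, reference_rewards)
    for step, (explored, reference) in enumerate(pairs):
        # A step counts as suboptimal when it lags the reference by > margin.
        if reference - explored > margin:
            return step
    return None


# Made-up per-step rewards for a four-step task.
explored = [0.8, 0.7, 0.3, 0.2]
reference = [0.8, 0.75, 0.7, 0.9]
deviation = first_deviation_step(explored, reference, margin=0.1)
print(deviation)
```

In a setup like this, the returned index would mark the point where a calibration trajectory could begin, so that correction happens at the first deviation rather than after errors have accumulated.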

Code & Models

Repositories

WangHanLinHenry/STeCa (PyTorch, official)

Taxonomy

Topics: Advanced Data Processing Techniques · Anomaly Detection Techniques and Applications · Natural Language Processing Techniques