STeCa: Step-level Trajectory Calibration for LLM Agent Learning
Hanlin Wang, Jian Wang, Chak Tou Leong, Wenjie Li

TL;DR
STeCa introduces a step-level trajectory calibration framework for LLM agents, improving long-horizon task performance by automatically constructing calibration trajectories through reflection and reinforcement learning.
Contribution
The paper presents a novel step-level calibration method for LLM agents, enabling automatic trajectory correction and improved robustness in complex tasks.
Findings
Outperforms existing methods in experiments
Enhances agent robustness in long-horizon tasks
Uses LLM-driven reflection to construct calibrated trajectories
Abstract
Large language model (LLM)-based agents have shown promise in tackling complex tasks by interacting dynamically with the environment. Existing work primarily focuses on behavior cloning from expert demonstrations or preference learning through exploratory trajectory sampling. However, these methods often struggle to address long-horizon tasks, where suboptimal actions accumulate step by step, causing agents to deviate from correct task trajectories. To address this, we highlight the importance of timely calibration and the need to automatically construct calibration trajectories for training agents. We propose Step-Level Trajectory Calibration (STeCa), a novel framework for LLM agent learning. Specifically, STeCa identifies suboptimal actions through a step-level reward comparison during exploration. It constructs calibrated trajectories using LLM-driven reflection, enabling agents to…
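The step-level reward comparison and calibration-trajectory construction described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not the authors' implementation: the `Step` record, the `margin` threshold, the `reflect` callback (a stand-in for an LLM call), and the choice to splice the explored prefix, a reflection, and the expert continuation into one trajectory. How step-level rewards are estimated is not covered by the excerpt, so the rewards below are made-up numbers.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    action: str
    reward: float  # hypothetical step-level reward; its estimation is not specified here


def first_suboptimal_step(
    explored: List[Step], expert: List[Step], margin: float = 0.0
) -> Optional[int]:
    """Return the index of the first step whose explored reward falls below
    the expert reward by more than `margin`, or None if there is no such step."""
    for t, (explored_step, expert_step) in enumerate(zip(explored, expert)):
        if explored_step.reward < expert_step.reward - margin:
            return t
    return None


def build_calibration_trajectory(
    explored: List[Step],
    expert: List[Step],
    reflect: Callable[[Step, Step], str],
    margin: float = 0.0,
) -> Optional[List[str]]:
    """Assumed construction: explored actions up to and including the
    suboptimal step, then a reflection, then the expert actions from that step on."""
    t = first_suboptimal_step(explored, expert, margin)
    if t is None:
        return None  # nothing to calibrate
    prefix = [s.action for s in explored[: t + 1]]
    thought = reflect(explored[t], expert[t])
    suffix = [s.action for s in expert[t:]]
    return prefix + [thought] + suffix


# Toy data with invented actions and rewards.
explored = [Step("open fridge", 0.9), Step("take mug", 0.2), Step("go to sink", 0.1)]
expert = [Step("open fridge", 0.9), Step("take apple", 0.8), Step("close fridge", 0.9)]


def toy_reflect(bad: Step, good: Step) -> str:
    # Placeholder for an LLM-generated reflection.
    return f"Reflection: '{bad.action}' was suboptimal; '{good.action}' is better."


calibrated = build_calibration_trajectory(explored, expert, toy_reflect)
print(calibrated)
```

In this toy run the second step is flagged because its reward is lower than the expert's at the same position, and the resulting trajectory shows the agent recovering from that step rather than only imitating a clean expert rollout.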
Taxonomy
Topics: Advanced Data Processing Techniques · Anomaly Detection Techniques and Applications · Natural Language Processing Techniques
