SLEA-RL: Step-Level Experience Augmented Reinforcement Learning for Multi-Turn Agentic Training
Prince Zizhuang Wang, Shuli Jiang

TL;DR
SLEA-RL introduces a step-level experience retrieval framework for multi-turn LLM agents, enabling dynamic experience utilization at each decision point, leading to improved long-horizon task performance.
Contribution
The paper proposes a step-level experience retrieval method, supported by an evolving experience library and observation clustering, that extends multi-turn agent training beyond methods that retrieve experiences once and hold them fixed.
Findings
SLEA-RL outperforms baseline RL methods on multi-turn benchmarks.
Retrieving experiences at each step improves decision accuracy.
The experience library evolves through semantic analysis rather than gradient updates.
Abstract
Large Language Model (LLM) agents have shown strong results on multi-turn tool-use tasks, yet they operate in isolation during training, failing to leverage experiences accumulated across episodes. Existing experience-augmented methods address this by organizing trajectories into retrievable libraries, but they retrieve experiences only once based on the initial task description and hold them constant throughout the episode. In multi-turn settings where observations change at every step, this static retrieval becomes increasingly mismatched as episodes progress. We propose SLEA-RL (Step-Level Experience-Augmented Reinforcement Learning), a framework that retrieves relevant experiences at each decision step conditioned on the current observation. SLEA-RL operates through three components: (i) step-level observation clustering that groups structurally equivalent environmental states for…
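The contrast the abstract draws between one-time retrieval and step-level retrieval can be sketched in a few lines. This is a toy illustration only, not the authors' implementation: the bag-of-words `embed` function, the `ExperienceLibrary` class, the example observations, and the stored notes are all hypothetical stand-ins, and the observation clustering and library-evolution components are not modeled.

```python
from collections import Counter
from math import sqrt


def embed(text):
    """Toy bag-of-words vector; a stand-in for a real encoder (assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ExperienceLibrary:
    """Hypothetical library keyed by the observation an experience came from."""

    def __init__(self):
        self.entries = []  # list of (key_vector, note)

    def add(self, observation, note):
        self.entries.append((embed(observation), note))

    def retrieve(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.entries, key=lambda e: cosine(q, e[0]), reverse=True)
        return [note for _, note in ranked[:k]]


def static_retrieval(library, task, observations, k=1):
    """Retrieve once from the task description and reuse it at every step."""
    fixed = library.retrieve(task, k)
    return [fixed for _ in observations]


def step_level_retrieval(library, observations, k=1):
    """Retrieve again at each step, conditioned on the current observation."""
    return [library.retrieve(obs, k) for obs in observations]


library = ExperienceLibrary()
library.add("search results page listing products",
            "Filter by price before opening items.")
library.add("checkout form asking for shipping address",
            "Fill required fields first, then submit.")

task = "buy the cheapest listed products from search results"
observations = [
    "search results page listing products",
    "checkout form asking for shipping address",
]

static_out = static_retrieval(library, task, observations)
step_out = step_level_retrieval(library, observations)
```

In this toy episode the static variant keeps returning the search-page note even after the observation has changed to a checkout form, while the step-level variant switches to the checkout note at the second step.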
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
