Hierarchical Reinforcement Learning with Augmented Step-Level Transitions for LLM Agents

Shuai Zhen; Yanhua Yu; Ruopei Guo; Nan Cheng; Yang Deng

arXiv:2604.05808 · cs.AI · April 16, 2026

TL;DR

This paper introduces STEP-HRL, a hierarchical reinforcement learning framework for LLM agents. It improves efficiency and scalability by conditioning policies on single-step transitions, augmented with completed subtasks and local progress summaries, rather than on full interaction histories.

Contribution

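The paper presents an HRL approach whose high-level and low-level policies condition on augmented step-level transitions, with a local progress module that summarizes the interaction history within each subtask. The stated benefits are better performance and lower token usage.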

Findings

01. Outperforms baselines on the ScienceWorld and ALFWorld benchmarks.

02. Reduces token usage while maintaining high performance.

03. Improves generalization in hierarchical RL for LLM agents.

Abstract

Large language model (LLM) agents have demonstrated strong capabilities in complex interactive decision-making tasks. However, existing LLM agents typically rely on increasingly long interaction histories, resulting in high computational cost and limited scalability. In this paper, we propose STEP-HRL, a hierarchical reinforcement learning (HRL) framework that enables step-level learning by conditioning only on single-step transitions rather than full interaction histories. STEP-HRL structures tasks hierarchically, using completed subtasks to represent global progress on the overall task. By introducing a local progress module, it also iteratively and selectively summarizes the interaction history within each subtask to produce a compact summary of local progress. Together, these components yield augmented step-level transitions for both high-level and low-level policies. Experimental results…
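
To make the idea of an augmented step-level transition concrete, here is a minimal Python sketch. It is not the authors' implementation: the class names, field names, prompt layout, and the `update_local_progress` helper are all hypothetical, and the learned local progress module is replaced by a simple truncation so the example stays self-contained. It only illustrates how each policy's input can stay bounded in size when it holds a single step plus compact progress information instead of the full history.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LowLevelInput:
    """Hypothetical input to the low-level policy for one step."""
    subtask: str          # current subtask chosen by the high-level policy
    local_progress: str   # compact summary of progress within this subtask
    last_action: str      # previous action (the single-step transition)
    observation: str      # observation produced by that action


@dataclass
class HighLevelInput:
    """Hypothetical input to the high-level policy for one step."""
    task: str                      # overall task description
    completed_subtasks: List[str]  # global progress, as finished subtasks
    observation: str               # latest observation


def low_level_prompt(x: LowLevelInput) -> str:
    # The prompt contains one step plus a summary, never the full history.
    return (
        f"Subtask: {x.subtask}\n"
        f"Local progress: {x.local_progress}\n"
        f"Last action: {x.last_action}\n"
        f"Observation: {x.observation}\n"
        "Next action:"
    )


def high_level_prompt(x: HighLevelInput) -> str:
    done = "; ".join(x.completed_subtasks) or "none"
    return (
        f"Task: {x.task}\n"
        f"Completed subtasks: {done}\n"
        f"Observation: {x.observation}\n"
        "Next subtask:"
    )


def update_local_progress(summary: str, action: str, observation: str,
                          max_chars: int = 200) -> str:
    # Assumption: stand-in for the learned local progress module.
    # It appends the newest step and keeps only the last max_chars characters.
    step = f"{action} -> {observation}"
    merged = f"{summary} | {step}" if summary else step
    return merged[-max_chars:]
```

In this sketch the low-level prompt length is capped by `max_chars` plus the sizes of the current subtask, action, and observation, so it does not grow with the number of steps taken.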

Code & Models

Repositories

TonyStark042/STEP-HRL
github
