HiMAC: Hierarchical Macro-Micro Learning for Long-Horizon LLM Agents

Hongbo Jin, Rongpeng Zhu, Jiayu Ding, Guibo Luo, and Ge Li

arXiv:2603.00977 · cs.AI · May 6, 2026

TL;DR

HiMAC is a hierarchical reinforcement learning framework for LLM agents that decomposes long-horizon tasks into macro-level planning and micro-level execution. The authors report improved performance and sample efficiency on ALFWorld, WebShop, and Sokoban.

Contribution

The paper presents a hierarchical RL approach for LLM agents that combines a critic-free training paradigm with an iterative co-evolution strategy, aimed at better long-horizon decision-making.

Findings

01

HiMAC outperforms baselines on ALFWorld, WebShop, and Sokoban.

02

The authors report state-of-the-art performance and improved sample efficiency on these benchmarks.

03

Structured hierarchy is crucial for robust long-horizon agentic intelligence.

Abstract

Large language model (LLM) agents have recently demonstrated strong capabilities in interactive decision-making, yet they remain fundamentally limited in long-horizon tasks that require structured planning and reliable execution. Existing approaches predominantly rely on flat autoregressive policies, where high-level reasoning and low-level actions are generated within a single token sequence, leading to inefficient exploration and severe error propagation over extended trajectories. In this work, we propose HiMAC, a hierarchical agentic RL framework that explicitly decomposes long-horizon decision-making into macro-level planning and micro-level execution. HiMAC models reasoning as a structured blueprint generation process followed by goal-conditioned action execution, enabling robust long-horizon planning within LLM-based agents. To train this hierarchy efficiently, we introduce a…
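
The macro-micro split described in the abstract can be pictured with a small control loop. The sketch below is illustrative only and is not the authors' implementation: the names `MacroPolicy`, `MicroPolicy`, `ToyEnv`, and `run_episode`, the string-based sub-goals, and the per-sub-goal step budget are all assumptions made for this example. In a real agent both policies would be LLM calls, and the training procedure (which this excerpt does not spell out) is not shown.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical interfaces, assumed for illustration (not taken from the paper).
MacroPolicy = Callable[[str], List[str]]   # task -> ordered sub-goals (the "blueprint")
MicroPolicy = Callable[[str, str], str]    # (sub_goal, observation) -> action


@dataclass
class ToyEnv:
    """Stand-in environment: logs every action and treats a sub-goal as
    achieved when the action text contains the sub-goal text."""
    log: List[str] = field(default_factory=list)

    def observe(self) -> str:
        return f"step {len(self.log)}"

    def step(self, action: str, sub_goal: str) -> bool:
        self.log.append(action)
        return sub_goal in action


def run_episode(task: str, macro: MacroPolicy, micro: MicroPolicy,
                env: ToyEnv, max_steps_per_goal: int = 3) -> List[str]:
    """Plan once at the macro level, then execute each sub-goal at the micro level."""
    blueprint = macro(task)                 # macro-level planning
    completed: List[str] = []
    for sub_goal in blueprint:
        for _ in range(max_steps_per_goal):
            # Micro-level execution: the action is conditioned on the current sub-goal.
            action = micro(sub_goal, env.observe())
            if env.step(action, sub_goal):
                completed.append(sub_goal)
                break
        else:
            # Step budget exhausted without achieving the sub-goal: stop the episode.
            break
    return completed


def toy_macro(task: str) -> List[str]:
    """Toy planner: split the task description on the word 'then'."""
    return [part.strip() for part in task.split(" then ")]


def toy_micro(sub_goal: str, observation: str) -> str:
    """Toy executor: emit an action that names the sub-goal."""
    return f"do: {sub_goal}"


if __name__ == "__main__":
    env = ToyEnv()
    done = run_episode("find mug then heat mug then place mug", toy_macro, toy_micro, env)
    print(done)
    print(env.log)
```

Because the executor only ever sees one sub-goal at a time, a failed sub-goal ends the episode at that point rather than letting the error carry into later steps, which is one way to read the abstract's concern about error propagation in flat policies.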
