CoDA: A Context-Decoupled Hierarchical Agent with Reinforcement Learning

Xuanzhang Liu; Jianglun Feng; Zhuoran Zhuang; Junzhe Zhao; Maofei Que; Jieting Li; Dianlei Wang; Hao Tong; Ye Chen; Pan Li

arXiv:2512.12716·cs.CL·December 16, 2025

Open Access

TL;DR

CoDA introduces a hierarchical reinforcement learning framework that decouples planning and execution in LLM agents, effectively mitigating context overload and improving performance on complex multi-hop tasks.

Contribution

The paper proposes a hierarchical RL framework that combines context decoupling with a joint Planner-Executor training method (PECO), improving LLM agent robustness and performance in long-context scenarios.

Findings

01 Significant performance gains on multi-hop question-answering benchmarks.

02 Robustness in long-context scenarios where other models degrade.

03 Effective mitigation of context explosion through hierarchical design.

Abstract

Large Language Model (LLM) agents trained with reinforcement learning (RL) show great promise for solving complex, multi-step tasks. However, their performance is often crippled by "Context Explosion", where the accumulation of long text outputs overwhelms the model's context window and leads to reasoning failures. To address this, we introduce CoDA, a Context-Decoupled hierarchical Agent, a simple but effective reinforcement learning framework that decouples high-level planning from low-level execution. It employs a single, shared LLM backbone that learns to operate in two distinct, contextually isolated roles: a high-level Planner that decomposes tasks within a concise strategic context, and a low-level Executor that handles tool interactions in an ephemeral, isolated workspace. We train this unified agent end-to-end using PECO (Planner-Executor Co-Optimization), a reinforcement…
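
The context isolation described in the abstract can be illustrated with a small sketch. This is not the authors' implementation and does not cover PECO training or any reinforcement learning; it only shows, under assumed names, how a Planner might keep a concise strategic context while an Executor works in a throwaway workspace. `Planner`, `run_executor`, `solve`, `toy_llm`, `toy_tool`, and the `ANSWER:` / `SEARCH:` conventions are all hypothetical, and the shared LLM backbone is replaced by a deterministic stub.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-in for the single shared LLM backbone: (role, prompt) -> text.
LLM = Callable[[str, str], str]


@dataclass
class Planner:
    llm: LLM
    # Concise strategic context: one short line per completed subtask.
    context: List[str] = field(default_factory=list)

    def next_subtask(self, question: str) -> str:
        prompt = "\n".join([question, *self.context])
        return self.llm("planner", prompt)

    def record(self, subtask: str, summary: str) -> None:
        # Only the short summary is stored; raw tool output never enters this context.
        self.context.append(f"{subtask} -> {summary}")


def run_executor(llm: LLM, tool: Callable[[str], str], subtask: str) -> str:
    workspace = [subtask]            # ephemeral workspace, created fresh per subtask
    workspace.append(tool(subtask))  # long tool output stays local to this call
    return llm("executor", "\n".join(workspace))  # workspace is discarded on return


def solve(question: str, llm: LLM, tool: Callable[[str], str],
          max_steps: int = 4) -> Tuple[Optional[str], List[str]]:
    planner = Planner(llm)
    for _ in range(max_steps):
        subtask = planner.next_subtask(question)
        if subtask.startswith("ANSWER:"):  # assumed convention for a final answer
            return subtask[len("ANSWER:"):].strip(), planner.context
        planner.record(subtask, run_executor(llm, tool, subtask))
    return None, planner.context


# Toy stubs so the sketch runs without a model or a search tool.
def toy_llm(role: str, prompt: str) -> str:
    if role == "planner":
        # Answer once at least one subtask summary has been recorded.
        return "ANSWER: Paris" if "->" in prompt else "SEARCH: capital of France"
    return "Paris"  # the executor compresses its workspace into a short summary


def toy_tool(query: str) -> str:
    return "long retrieved document " * 200


answer, planner_context = solve("What is the capital of France?", toy_llm, toy_tool)
print(answer, planner_context)
```

In this sketch the Planner's context grows by one short line per subtask regardless of how long the tool output is, which is the property the abstract attributes to decoupling. How CoDA actually formats plans, summaries, and rewards is not stated in the text above.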

Taxonomy

Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Topic Modeling