Escaping the Context Bottleneck: Active Context Curation for LLM Agents via Reinforcement Learning

Xiaozhe Li, Tianyi Lyu, Yizhao Yang, Liang Shan, Siyi Yang, Ligao Zhang, Zhuoyi Huang, Qingwen Liu, and Yang Li

arXiv:2604.11462 · cs.AI · April 14, 2026

TL;DR

This paper introduces a reinforcement-learning framework in which a lightweight policy actively curates an LLM agent's context, pruning environmental noise while preserving critical information. On long-horizon tasks this raises success rates and lowers token consumption.

Contribution

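The paper proposes a decoupled architecture that pairs a lightweight, RL-trained context-management policy (ContextCurator) with a frozen foundation model (TaskExecutor), so that context curation is handled separately from task execution and the agent reasons more reliably over extended interactions.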

Findings

01. Improves the success rate of Gemini-3.0-flash on WebArena from 36.4% to 41.2%.

02. Reduces token consumption on WebArena by 8.8% (from 47.4K to 43.3K).

03. Achieves a 57.1% success rate on DeepSearch, outperforming previous methods.

Abstract

Large Language Models (LLMs) struggle with long-horizon tasks due to the "context bottleneck" and the "lost-in-the-middle" phenomenon, where accumulated noise from verbose environments degrades reasoning over multi-turn interactions. To address this issue, we introduce a symbiotic framework that decouples context management from task execution. Our architecture pairs a lightweight, specialized policy model, ContextCurator, with a powerful frozen foundation model, TaskExecutor. Trained via reinforcement learning, ContextCurator actively reduces information entropy in the working memory. It aggressively prunes environmental noise while preserving reasoning anchors, that is, sparse data points that are critical for future deductions. On WebArena, our framework improves the success rate of Gemini-3.0-flash from 36.4% to 41.2% while reducing token consumption by 8.8% (from 47.4K to 43.3K).…
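The decoupling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `WorkingMemory`, `keyword_curator`, `dummy_executor`, and `run_episode` are hypothetical, and a simple keyword filter stands in for the RL-trained ContextCurator purely to show where curation sits in the agent loop. The executor is likewise a placeholder for a frozen foundation model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class WorkingMemory:
    """Observations accumulated over a multi-turn episode."""
    entries: List[str] = field(default_factory=list)


def keyword_curator(memory: WorkingMemory, anchors: List[str]) -> WorkingMemory:
    """Hypothetical stand-in for a learned curation policy.

    Keeps only entries that mention at least one anchor keyword.
    The paper trains this component with reinforcement learning;
    the keyword rule here is an assumption made for illustration.
    """
    kept = [
        e for e in memory.entries
        if any(a.lower() in e.lower() for a in anchors)
    ]
    return WorkingMemory(kept)


def dummy_executor(memory: WorkingMemory) -> str:
    """Placeholder for the frozen task model: reports how much context it saw."""
    return f"act on {len(memory.entries)} entries"


def run_episode(
    observations: List[str],
    curator: Callable[[WorkingMemory, List[str]], WorkingMemory],
    executor: Callable[[WorkingMemory], str],
    anchors: List[str],
) -> Tuple[List[str], WorkingMemory]:
    """Curate the history, append the newest observation, then let the executor act."""
    memory = WorkingMemory()
    actions: List[str] = []
    for obs in observations:
        memory = curator(memory, anchors)  # prune the accumulated history
        memory.entries.append(obs)         # newest observation is passed through raw
        actions.append(executor(memory))   # executor only ever sees curated context
    return actions, memory


observations = [
    "banner ad: subscribe now",
    "order id 1234 shipped",
    "footer links about careers",
    "order total is 59 USD",
]
actions, final_memory = run_episode(
    observations, keyword_curator, dummy_executor, anchors=["order"]
)
```

In this toy run the two order-related observations survive as anchors while the banner and footer text are pruned on the following turn, so the executor's context stays short even as the episode grows.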
