Escaping the Context Bottleneck: Active Context Curation for LLM Agents via Reinforcement Learning
Xiaozhe Li, Tianyi Lyu, Yizhao Yang, Liang Shan, Siyi Yang, Ligao Zhang, Zhuoyi Huang, Qingwen Liu, and Yang Li

TL;DR
This paper introduces a reinforcement learning framework in which a dedicated policy actively curates an LLM agent's context, pruning environmental noise while preserving critical information, which raises success rates and lowers token usage on long-horizon tasks.
Contribution
It proposes a decoupled architecture that pairs a lightweight, RL-trained context-management policy (ContextCurator) with a frozen foundation model (TaskExecutor), improving the agent's reasoning over extended interactions.
Findings
Improves the success rate of Gemini-3.0-flash on WebArena from 36.4% to 41.2%.
Reduces token consumption on WebArena by 8.8% (from 47.4K to 43.3K).
Achieves 57.1% success rate on DeepSearch, outperforming previous methods.
Abstract
Large Language Models (LLMs) struggle with long-horizon tasks due to the "context bottleneck" and the "lost-in-the-middle" phenomenon, where accumulated noise from verbose environments degrades reasoning over multi-turn interactions. To address this issue, we introduce a symbiotic framework that decouples context management from task execution. Our architecture pairs a lightweight, specialized policy model, ContextCurator, with a powerful frozen foundation model, TaskExecutor. Trained via reinforcement learning, ContextCurator actively reduces information entropy in the working memory. It aggressively prunes environmental noise while preserving reasoning anchors, that is, sparse data points that are critical for future deductions. On WebArena, our framework improves the success rate of Gemini-3.0-flash from 36.4% to 41.2% while reducing token consumption by 8.8% (from 47.4K to 43.3K).…
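The decoupling described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the signatures, and especially the keyword rule inside `keyword_curator`, which merely stands in for the RL-trained ContextCurator policy. The stub executor stands in for the frozen TaskExecutor model. The sketch shows only the control flow: each raw observation passes through the curator before the executor sees the working memory.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical interfaces; the abstract does not specify the real ones.
Curator = Callable[[List[str], str], List[str]]  # (memory, observation) -> new memory
Executor = Callable[[List[str]], str]            # curated memory -> next action


def keyword_curator(
    memory: List[str],
    observation: str,
    anchors: Sequence[str] = ("price", "order id"),
) -> List[str]:
    """Toy stand-in for ContextCurator.

    Keeps only observation lines that mention an anchor keyword and drops
    the rest as noise. The real curator is a learned policy, not a rule.
    """
    kept = [
        line
        for line in observation.splitlines()
        if any(anchor in line.lower() for anchor in anchors)
    ]
    return memory + kept


def stub_executor(memory: List[str]) -> str:
    """Toy stand-in for the frozen TaskExecutor: reports what it was given."""
    return f"act on {len(memory)} lines"


def run_episode(
    curator: Curator, executor: Executor, observations: Sequence[str]
) -> Tuple[List[str], List[str]]:
    """Run the decoupled loop: curate the context first, then execute on it."""
    memory: List[str] = []
    actions: List[str] = []
    for observation in observations:
        memory = curator(memory, observation)  # context management step
        actions.append(executor(memory))       # task execution step
    return memory, actions


observations = [
    "<div>banner ad</div>\nOrder ID: 1234\n<footer>links</footer>",
    "nav menu\nPrice: $19.99\ncookie notice",
]
final_memory, actions = run_episode(keyword_curator, stub_executor, observations)
print(final_memory)
print(actions)
```

In this toy run the executor only ever sees the two retained lines, while banners, menus, and footers never enter the working memory. In the framework described above, the pruning decision would instead come from a policy trained with reinforcement learning, and the executor would be a frozen foundation model.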
