Active Context Compression: Autonomous Memory Management in LLM Agents
Nikhil Verma

TL;DR
This paper introduces Focus, an autonomous memory management system for LLM agents that selectively compresses interaction history to reduce computational costs while maintaining accuracy.
Contribution
It presents an agent-centric architecture, inspired by the exploration strategies of the slime mold Physarum polycephalum, in which the LLM agent itself decides when to compress its context.
Findings
Achieved 22.7% token reduction without accuracy loss
Performed an average of 6 autonomous compressions per task
Reached token savings of up to 57% on individual instances
Abstract
Large Language Model (LLM) agents struggle with long-horizon software engineering tasks due to "Context Bloat." As interaction history grows, computational costs explode, latency increases, and reasoning capabilities degrade due to distraction by irrelevant past errors. Existing solutions often rely on passive, external summarization mechanisms that the agent cannot control. This paper proposes Focus, an agent-centric architecture inspired by the biological exploration strategies of Physarum polycephalum (slime mold). The Focus Agent autonomously decides when to consolidate key learnings into a persistent "Knowledge" block and actively withdraws (prunes) the raw interaction history. Using an optimized scaffold matching industry best practices (persistent bash + string-replacement editor), we evaluated Focus on N=5 context-intensive instances from SWE-bench Lite using Claude Haiku 4.5.…
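The consolidate-then-prune step described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the class name `FocusContext`, its methods, and the `keep_last` parameter are hypothetical, chosen only to show how a persistent "Knowledge" block might replace raw interaction history. In the paper the agent itself decides when to compress; here that decision is simply modeled as an explicit call to `compress`.

```python
from dataclasses import dataclass, field


@dataclass
class FocusContext:
    """Hypothetical container: a persistent knowledge block plus raw history."""

    knowledge: list[str] = field(default_factory=list)
    history: list[str] = field(default_factory=list)

    def record(self, message: str) -> None:
        """Append one raw interaction (command, tool output, error) to history."""
        self.history.append(message)

    def compress(self, learnings: str, keep_last: int = 0) -> int:
        """Store the agent's summary and prune raw history.

        Returns the number of pruned messages. `keep_last` is an assumed
        knob for retaining the most recent messages; the paper does not
        specify one.
        """
        if keep_last < 0:
            raise ValueError("keep_last must be non-negative")
        self.knowledge.append(learnings)
        cut = max(len(self.history) - keep_last, 0)
        self.history = self.history[cut:]
        return cut

    def render(self) -> str:
        """Build the prompt context: knowledge first, then remaining history."""
        parts = ["## Knowledge", *self.knowledge, "## History", *self.history]
        return "\n".join(parts)


ctx = FocusContext()
for step in range(5):
    ctx.record(f"step {step}: raw tool output")

# The agent (modeled here as a direct call) consolidates and prunes.
pruned = ctx.compress("tests fail because of an off-by-one in the parser", keep_last=2)
print(pruned)        # number of raw messages withdrawn from context
print(ctx.render())  # knowledge block followed by the two retained messages
```

Exposing compression as an operation the agent invokes, rather than a trigger applied from outside, is what distinguishes this pattern from the passive, external summarization the abstract contrasts it with.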
Taxonomy
Topics: Slime Mold and Myxomycetes Research · Scientific Computing and Data Management · Multi-Agent Systems and Negotiation
