Neural Paging: Learning Context Management Policies for Turing-Complete Agents
Liang Chen, Qi Liu

TL;DR
Neural Paging introduces a hierarchical memory management system for LLMs that optimizes context window usage, enabling more efficient long-term reasoning with theoretical guarantees and learned policies.
Contribution
The paper proposes Neural Paging, a hierarchical architecture in which a differentiable Page Controller manages the context window, decoupling symbolic reasoning from information resource management with the aim of lower computational cost and more efficient long-term reasoning.
Findings
Theoretical analysis shows reduced asymptotic complexity from O(N^2) to O(N·K^2).
Validation confirms theoretical bounds and highlights potential for learned policies.
Identifies significant slack in theoretical guarantees, motivating further learning-based improvements.
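To make the first finding concrete, here is a short worked comparison. This summary does not define the symbols, so the sketch assumes N is the total length of the interaction (tokens or steps) and K is the bounded context window size; constant factors are ignored. Under those assumptions the O(N·K^2) bound is an improvement whenever K grows more slowly than the square root of N:

```latex
\[
N K^{2} < N^{2} \iff K^{2} < N \iff K < \sqrt{N}.
\]
\[
N = 10^{6},\; K = 10^{2}: \qquad
N^{2} = 10^{12}, \qquad
N K^{2} = 10^{6} \cdot 10^{4} = 10^{10}, \qquad
\frac{N^{2}}{N K^{2}} = \frac{N}{K^{2}} = 10^{2}.
\]
```

In this illustrative setting the bound is smaller by a factor of 100; the numbers are chosen for the example and are not results from the paper.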
Abstract
The proof that Large Language Models (LLMs) augmented with external read-write memory constitute a computationally universal system has established the theoretical foundation for general-purpose agents. However, existing implementations face a critical bottleneck: the finite and costly Context Window, which functions not as infinite memory but as a scarce semantic cache. In this work, we introduce Neural Paging, a hierarchical architecture that decouples symbolic reasoning from information resource management. We formulate the Context Paging Problem (CPP) and propose a lightweight, differentiable Page Controller designed to approximate "Semantic Belady's Optimality": retaining tokens with high future utility under explicit assumptions on access patterns. We provide theoretical analysis showing that, under bounded context window size, Neural Paging…
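The abstract's "Semantic Belady's Optimality" borrows from Belady's classic clairvoyant cache policy, which on a miss evicts the resident item whose next use lies farthest in the future. The sketch below illustrates only that classic policy over discrete page IDs, with LRU as a baseline that needs no future knowledge. It is not the paper's Page Controller, which is learned and works with estimated future utility; the function names `belady_faults` and `lru_faults` are hypothetical.

```python
from collections import OrderedDict
from typing import Hashable, Sequence


def belady_faults(trace: Sequence[Hashable], capacity: int) -> int:
    """Count misses for Belady's clairvoyant policy on a fully known trace."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    cache: set = set()
    faults = 0
    for i, page in enumerate(trace):
        if page in cache:
            continue  # hit: nothing to do
        faults += 1
        if len(cache) >= capacity:
            def next_use(p: Hashable) -> float:
                # Index of the next access to p after position i, or inf if none.
                for j in range(i + 1, len(trace)):
                    if trace[j] == p:
                        return j
                return float("inf")

            # Evict the resident page that is needed farthest in the future.
            cache.remove(max(cache, key=next_use))
        cache.add(page)
    return faults


def lru_faults(trace: Sequence[Hashable], capacity: int) -> int:
    """Count misses for least-recently-used eviction (no future knowledge)."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    cache: OrderedDict = OrderedDict()
    faults = 0
    for page in trace:
        if page in cache:
            cache.move_to_end(page)  # mark as most recently used
            continue
        faults += 1
        if len(cache) >= capacity:
            cache.popitem(last=False)  # drop the least recently used page
        cache[page] = True
    return faults


if __name__ == "__main__":
    # Textbook reference string with three cache slots.
    trace = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
    print("Belady:", belady_faults(trace, 3))
    print("LRU:   ", lru_faults(trace, 3))
```

The gap between the two miss counts is the kind of headroom a learned policy could target: a practical controller cannot see the future trace, so it would have to predict which items will be needed soon.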
Taxonomy
Topics: Machine Learning and Algorithms · Ferroelectric and Negative Capacitance Devices · Topic Modeling
