CSR: Infinite-Horizon Real-Time Policies with Massive Cached State Representations
Robin Karlsson, Go Suzui

TL;DR
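This paper introduces the Cached State Representation (CSR) framework and the Asynchronous State Reconciliation (ASR) algorithm, which let massive language models run in real-time robotics by cutting time-to-first-token latency (26-fold on a physical robot with a 120K-token context) while sustaining operation over infinite horizons.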
Contribution
The paper formalizes optimal task structures for real-time LLMs in robotics and proposes practical CSR and ASR methods to achieve low latency and infinite-horizon operation.
Findings
26-fold latency reduction on a physical robot with a 120K-token context
State-of-the-art recall (0.836) on an embodied AI benchmark
ASR maintains spike-free latency over multiple eviction cycles
Abstract
Deploying massive large language models (LLMs) as continuous cognitive engines for robotics is bottlenecked by the time-to-first-token (TTFT) latency required to process extensive state histories. Existing solutions like RAG or sliding windows compromise global context or incur prohibitive re-computation costs. We formalize the optimal task structure for minimizing latency and theoretically prove that prefix stability, incremental extensibility, and asynchronous state reconciliation are necessary conditions for real-time performance. Building on these proofs, we introduce the Cached State Representation (CSR) framework as the practical instantiation of these properties, ensuring optimal KV-cache reuse. To sustain these properties over infinite horizons, we further propose an Asynchronous State Reconciliation (ASR) algorithm that offloads state memory eviction to a parallel computational…
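To make the prefix-stability argument concrete, here is a minimal sketch of why an append-only context can reuse a KV cache while a sliding window cannot. It is a toy illustration under stated assumptions, not the paper's implementation: tokens are plain integers, the cache is assumed to be reusable only for the longest shared token prefix, and the helper names `reusable_prefix_len` and `tokens_to_prefill` are hypothetical.

```python
from typing import List


def reusable_prefix_len(cached: List[int], new: List[int]) -> int:
    """Length of the longest common token prefix (assumed unit of KV-cache reuse)."""
    n = 0
    for a, b in zip(cached, new):
        if a != b:
            break
        n += 1
    return n


def tokens_to_prefill(cached: List[int], new: List[int]) -> int:
    """Tokens that must be recomputed before the first output token."""
    return len(new) - reusable_prefix_len(cached, new)


# Hypothetical state history already in the cache, plus a small new observation.
history = list(range(1000))
update = [5000, 5001, 5002]

# Append-only context: the old history stays an exact prefix of the new prompt.
append_only = history + update

# Sliding window: dropping the oldest tokens shifts every position,
# so the cached prefix no longer matches.
sliding_window = (history + update)[len(update):]

append_cost = tokens_to_prefill(history, append_only)
sliding_cost = tokens_to_prefill(history, sliding_window)

print(append_cost)   # only the new tokens
print(sliding_cost)  # the whole window
```

In this toy setting the append-only prompt needs prefill for just the three new tokens, whereas the sliding window forces the full 1000-token window to be recomputed, which mirrors the re-computation cost the abstract attributes to sliding windows.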
