KEEP: A KV-Cache-Centric Memory Management System for Efficient Embodied Planning

Zebin Yang; Tong Xie; Baotong Lu; Shaoshan Liu; Bo Yu; Meng Li

arXiv:2602.23592 · cs.RO · March 18, 2026

TL;DR

KEEP is a KV-cache-centric memory management system that makes embodied planning with memory-augmented large language models more efficient by reducing KV cache recomputation and balancing memory loading. It reports a 2.68x speedup over text-based memory methods and a 4.13% success rate improvement over CacheBlend.

Contribution

The paper presents three algorithms for KV cache memory management, including Static-Dynamic Memory Construction and Multi-hop Memory Re-computation, that make KV cache reuse in embodied planning tasks both faster and more effective.

Findings

1. 2.68x speedup over text-based memory methods
2. 4.13% success rate improvement over CacheBlend
3. 1.90x reduction in time-to-first-token

Abstract

Memory-augmented Large Language Models (LLMs) have demonstrated remarkable capability for complex and long-horizon embodied planning. By keeping track of past experiences and environmental states, memory enables LLMs to maintain a global view, thereby avoiding repetitive exploration. However, existing approaches often store the memory as raw text, leading to excessively long prompts and high prefill latency. While it is possible to store and reuse the KV caches, the efficiency benefits are greatly undermined due to frequent KV cache updates. In this paper, we propose KEEP, a KV-cache-centric memory management system for efficient embodied planning. KEEP features three key innovations: (1) a Static-Dynamic Memory Construction algorithm that reduces KV cache recomputation through mixed-granularity memory groups; (2) a Multi-hop Memory Re-computation algorithm that dynamically identifies important…
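
The abstract's point that frequent KV cache updates undermine reuse can be illustrated with a toy calculation. The sketch below is not KEEP's algorithm. It assumes exact prefix-based KV reuse under causal attention, where editing one memory entry invalidates that entry and everything placed after it in the prompt. The function name `tokens_to_recompute`, the entry names, and the token counts are all hypothetical.

```python
from typing import Dict, List


def tokens_to_recompute(order: List[str], lengths: Dict[str, int], updated: str) -> int:
    """Count tokens whose KV entries must be recomputed after one update.

    Assumption: exact prefix reuse with causal attention, so the updated
    entry and every entry after it in the prompt lose their cached KV.
    """
    start = order.index(updated)
    return sum(lengths[name] for name in order[start:])


# Hypothetical memory entries and token counts (illustrative only).
lengths = {"room_layout": 400, "object_list": 300, "robot_state": 50}

# Frequently updated entry placed first: each update invalidates everything.
dynamic_first = ["robot_state", "room_layout", "object_list"]
# Frequently updated entry placed last: only its own tokens are recomputed.
dynamic_last = ["room_layout", "object_list", "robot_state"]

cost_first = tokens_to_recompute(dynamic_first, lengths, "robot_state")
cost_last = tokens_to_recompute(dynamic_last, lengths, "robot_state")
print(cost_first, cost_last)
```

In this toy setting, where a frequently changing entry sits in the prompt determines how much cached work survives each update, which gives some intuition for why separating static from dynamic memory could reduce recomputation.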

Taxonomy

Topics: AI-based Problem Solving and Planning · Multimodal Machine Learning Applications · Artificial Intelligence in Games