AgenticCache: Cache-Driven Asynchronous Planning for Embodied AI Agents

Hojoon Kim; Yuheng Wu; Thierry Tambe

arXiv:2604.24039·cs.LG·April 28, 2026

TL;DR

AgenticCache exploits plan locality in embodied AI tasks to reuse cached plans instead of calling the LLM at every step, cutting simulation latency by 65% and token usage by 50% while improving task success rate by 22% on average.

Contribution

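Introduces AgenticCache, a planning framework in which each agent queries a runtime cache of frequent plan transitions rather than calling the LLM at every step, while a background Cache Updater asynchronously calls the LLM to validate and refine the cached entries.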

Findings

01. 22% increase in task success rate on average
02. 65% reduction in simulation latency
03. 50% decrease in token usage

Abstract

Embodied AI agents increasingly rely on large language models (LLMs) for planning, yet per-step LLM calls impose severe latency and cost. In this paper, we show that embodied tasks exhibit strong plan locality, where the next plan is largely predictable from the current one. Building on this, we introduce AgenticCache, a planning framework that reuses cached plans to avoid per-step LLM calls. In AgenticCache, each agent queries a runtime cache of frequent plan transitions, while a background Cache Updater asynchronously calls the LLM to validate and refine cached entries. Across four multi-agent embodied benchmarks, AgenticCache improves task success rate by 22% on average across 12 configurations (4 benchmarks x 3 models), reduces simulation latency by 65%, and lowers token usage by 50%. Cache-based plan reuse thus offers a practical path to low-latency, low-cost embodied agents. Code…
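
The cache-plus-background-updater idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class and function names (`PlanTransitionCache`, `cache_updater`, `agent_step`, `slow_planner`), the frequency-count cache layout, and the behavior on a cache miss (a blocking planner call) are all assumptions made for illustration, and `slow_planner` is a hypothetical stand-in for an LLM call.

```python
import queue
import threading
from collections import Counter, defaultdict


class PlanTransitionCache:
    """Maps a current plan to counts of the plans observed to follow it (assumed layout)."""

    def __init__(self):
        self._counts = defaultdict(Counter)
        self._lock = threading.Lock()

    def record(self, current_plan, next_plan):
        with self._lock:
            self._counts[current_plan][next_plan] += 1

    def lookup(self, current_plan):
        """Return the most frequent successor plan, or None on a cache miss."""
        with self._lock:
            successors = self._counts.get(current_plan)
            if not successors:
                return None
            return successors.most_common(1)[0][0]


def slow_planner(current_plan):
    """Hypothetical stand-in for an LLM planning call."""
    table = {
        "goto_kitchen": "pick_cup",
        "pick_cup": "goto_table",
        "goto_table": "place_cup",
    }
    return table.get(current_plan, "idle")


def cache_updater(cache, requests, planner):
    """Background worker: re-plan queued entries and fold the result into the cache."""
    while True:
        current_plan = requests.get()
        if current_plan is None:  # sentinel: stop the worker
            break
        cache.record(current_plan, planner(current_plan))
        requests.task_done()


def agent_step(cache, requests, current_plan, planner):
    """Return (next_plan, was_cache_hit) without blocking on the planner when possible."""
    cached = cache.lookup(current_plan)
    if cached is not None:
        requests.put(current_plan)  # hit: let the updater re-check this entry later
        return cached, True
    next_plan = planner(current_plan)  # miss: blocking call (assumption)
    cache.record(current_plan, next_plan)
    return next_plan, False


def run_demo():
    cache = PlanTransitionCache()
    requests = queue.Queue()
    worker = threading.Thread(
        target=cache_updater, args=(cache, requests, slow_planner), daemon=True
    )
    worker.start()
    hits = []
    for _ in range(2):  # two episodes over the same plan sequence
        plan = "goto_kitchen"
        for _ in range(3):
            plan, hit = agent_step(cache, requests, plan, slow_planner)
            hits.append(hit)
    requests.join()     # wait for queued validations to finish
    requests.put(None)  # then stop the worker
    worker.join()
    return hits


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the first episode misses on every step and the second episode hits on every step, so the planner leaves the agent's critical path once a transition has been seen. How the actual Cache Updater validates or evicts entries is not specified in the abstract, so the count-based refinement here is only one possible reading.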

Code & Models

Repositories

hojoonleokim/MLSys26_AgenticCache (GitHub)
