TVCACHE: A Stateful Tool-Value Cache for Post-Training LLM Agents

Abhishek Vijaya Kumar; Bhaskar Kataria; Byungsoo Oh; Emaad Manzoor; Rachee Singh

arXiv:2602.10986 · cs.LG · February 12, 2026

TL;DR

TVCACHE is a novel stateful cache for post-training LLM agents that maintains sequence trees to ensure correct reuse of tool outputs, significantly reducing tool call times without harming reward performance.

Contribution

It introduces a tree-based, state-aware caching mechanism that guarantees correctness and efficiency in reusing tool outputs during LLM agent post-training.

Findings

01. Achieves up to 70% cache hit rate
02. Reduces median tool call time by up to 6.9×
03. No degradation in reward accumulation

Abstract

In RL post-training of LLM agents, calls to external tools take several seconds or even minutes, leaving allocated GPUs idle and inflating post-training time and cost. While many tool invocations repeat across parallel rollouts and could in principle be cached, naively caching their outputs for reuse is incorrect since tool outputs depend on the environment state induced by prior agent interactions. We present TVCACHE, a stateful tool-value cache for LLM agent post-training. TVCACHE maintains a tree of observed tool-call sequences and performs longest-prefix matching for cache lookups: a hit occurs only when the agent's full tool history matches a previously executed sequence, guaranteeing identical environment state. On three diverse workloads (terminal-based tasks, SQL generation, and video understanding), TVCACHE achieves cache hit rates of up to 70% and reduces median tool call…
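The lookup rule described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It is a minimal Python illustration that assumes each tool call can be encoded as a hashable `(tool, arguments)` pair and that tools are deterministic given the environment state. The names `Node`, `SequenceTreeCache`, `longest_prefix`, `lookup`, and `insert` are hypothetical. Whatever TVCACHE does after a partial prefix match is not described in the text above, so the sketch simply reports a miss.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple

# Hypothetical encoding of one tool call: (tool name, serialized arguments).
ToolCall = Tuple[str, str]


@dataclass
class Node:
    """One node per observed tool-call prefix."""
    output: Any = None
    cached: bool = False  # True only if this exact sequence was executed
    children: Dict[ToolCall, "Node"] = field(default_factory=dict)


class SequenceTreeCache:
    """Illustrative tree of tool-call sequences (assumed design, not the paper's code)."""

    def __init__(self) -> None:
        self.root = Node()

    def longest_prefix(self, history: List[ToolCall]) -> Tuple[Node, int]:
        """Walk the tree along `history`; return the deepest node and matched depth."""
        node, depth = self.root, 0
        for call in history:
            child = node.children.get(call)
            if child is None:
                break
            node, depth = child, depth + 1
        return node, depth

    def lookup(self, history: List[ToolCall]) -> Tuple[bool, Any]:
        """Hit only when the full history matches a previously executed sequence."""
        node, depth = self.longest_prefix(history)
        if history and depth == len(history) and node.cached:
            return True, node.output
        return False, None

    def insert(self, history: List[ToolCall], output: Any) -> None:
        """Record the output of the last call in `history`, keyed by the whole sequence."""
        node = self.root
        for call in history:
            node = node.children.setdefault(call, Node())
        node.output, node.cached = output, True


# Two rollouts end with the same call but reach it through different histories.
cache = SequenceTreeCache()
rollout_a = [("bash", "echo 1 > f"), ("bash", "cat f")]
rollout_b = [("bash", "echo 2 > f"), ("bash", "cat f")]

cache.insert(rollout_a[:1], "")   # output of the first call
cache.insert(rollout_a, "1")      # output of `cat f` after writing 1

hit_same, out_same = cache.lookup(rollout_a)   # same full history
hit_diff, out_diff = cache.lookup(rollout_b)   # same last call, different state
_, shared_depth = cache.longest_prefix([("bash", "echo 1 > f"), ("bash", "ls")])
```

Keying on the whole sequence rather than on the last call alone is what separates `rollout_a` from `rollout_b`: both end in `cat f`, but the file contents differ, so only the first lookup is a hit.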

Taxonomy

Topics: Mobile Agent-Based Network Management · Artificial Intelligence in Games · Web Data Mining and Analysis