SemanticALLI: Caching Reasoning, Not Just Responses, in Agentic Systems
Varun Chillara, Dylan Kline, Christopher Alvares, Evan Wooten, Huan Yang, Shlok Khetan, Cade Bauer, Tré Guillory, Tanishka Shah, Yashodhara Dhariwal, Volodymyr Pavlov, George Popstefanov

TL;DR
SemanticALLI introduces a pipeline-aware caching architecture that significantly improves reuse of intermediate reasoning steps in agentic AI systems, reducing redundant computations and latency.
Contribution
SemanticALLI decomposes generation into stages that emit structured intermediate representations (IRs) and treats those IRs as first-class, cacheable artifacts, enabling higher cache hit rates and efficiency in AI pipelines.
Findings
Achieved an 83.10% cache hit rate with structured IRs
Eliminated 4,023 LLM calls and reduced median latency to 2.66 ms
Outperformed baseline monolithic caching, which caps at a 38.7% hit rate
Abstract
Agentic AI pipelines suffer from a hidden inefficiency: they frequently reconstruct identical intermediate logic, such as metric normalization or chart scaffolding, even when the user's natural language phrasing is entirely novel. Conventional boundary caching fails to capture this inefficiency because it treats inference as a monolithic black box. We introduce SemanticALLI, a pipeline-aware architecture within Alli (PMG's marketing intelligence platform), designed to operationalize redundant reasoning. By decomposing generation into Analytic Intent Resolution (AIR) and Visualization Synthesis (VS), SemanticALLI elevates structured intermediate representations (IRs) to first-class, cacheable artifacts. The impact of caching within the agentic loop is substantial. In our evaluation, baseline monolithic caching caps at a 38.7% hit rate due to linguistic variance. In contrast, our…
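The idea of caching a structured IR rather than the raw prompt can be illustrated with a small sketch. This is not the authors' implementation: `IRCache`, `canonical_key`, `resolve_intent`, and `synthesize_chart` are hypothetical names, the IR fields are invented for illustration, and a keyword matcher stands in for the LLM-backed Analytic Intent Resolution step. It only shows why differently phrased queries that resolve to the same IR can share one cached synthesis result.

```python
import hashlib
import json


def canonical_key(ir: dict) -> str:
    """Hash an IR with sorted keys so equivalent IRs map to the same key."""
    payload = json.dumps(ir, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class IRCache:
    """In-memory cache keyed on the canonical form of a structured IR."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, ir, compute):
        key = canonical_key(ir)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        value = compute(ir)
        self._store[key] = value
        return value


def resolve_intent(query: str) -> dict:
    """Hypothetical stand-in for AIR; a real system would call an LLM here."""
    text = query.lower()
    metric = "spend" if ("spend" in text or "cost" in text) else "clicks"
    return {"metric": metric, "group_by": "week", "chart": "line"}


calls = {"synthesis": 0}  # counts how often the expensive step actually runs


def synthesize_chart(ir: dict) -> str:
    """Hypothetical stand-in for Visualization Synthesis (the cached step)."""
    calls["synthesis"] += 1
    return f"{ir['chart']} chart of {ir['metric']} by {ir['group_by']}"


cache = IRCache()
queries = [
    "Show weekly spend",
    "How did our cost trend each week?",  # new phrasing, same IR as above
    "Weekly clicks please",
]
specs = [cache.get_or_compute(resolve_intent(q), synthesize_chart) for q in queries]
print(cache.hits, cache.misses, calls["synthesis"])
```

A cache keyed on the raw query strings would miss on all three inputs, since no two are textually identical; keying on the IR lets the second query reuse the first query's result.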
Taxonomy
Topics: Multimodal Machine Learning Applications · Artificial Intelligence in Games · Explainable Artificial Intelligence (XAI)
