Bit-Flip Vulnerability of Shared KV-Cache Blocks in LLM Serving Systems

Yuji Yamamoto; Satoshi Matsuura

arXiv:2604.17249·cs.CR·April 21, 2026

TL;DR

This paper investigates a previously unexamined vulnerability: adversarial bit flips in the shared KV-cache blocks of LLM serving systems. It characterizes three properties of the resulting corruption (silent divergence, selective propagation, and persistent accumulation of damage) and proposes a checksum-based detection countermeasure.

Contribution

It identifies a bit-flip vulnerability in shared KV-cache blocks of LLM serving systems and introduces a checksum-based method that detects single-bit corruption with negligible overhead.

Findings

01

Silent divergence affects 13 of 16 BF16 bit positions.

02

Only requests sharing the targeted prefix are affected.

03

The proposed checksum detects single-bit corruption with negligible overhead.
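The idea behind finding 03 can be illustrated with a minimal sketch. This summary does not say which checksum the paper uses or where it is stored, so the use of CRC32 and the `ChecksummedBlock` class below are assumptions made purely for illustration; they are not the authors' implementation or part of vLLM.

```python
import zlib


class ChecksummedBlock:
    """Hypothetical cache block that stores a CRC32 next to its payload.

    Assumption: CRC32 stands in for whatever checksum the paper proposes.
    """

    def __init__(self, data: bytes) -> None:
        self.data = bytearray(data)
        # Checksum is computed once, when the block is first written.
        self.checksum = zlib.crc32(self.data)

    def verify(self) -> bool:
        """Recompute the checksum and compare it with the stored value."""
        return zlib.crc32(self.data) == self.checksum


# A 64-byte stand-in for a KV-cache block.
block = ChecksummedBlock(bytes(range(64)))
clean_ok = block.verify()

# Simulate a single-bit fault in byte 10.
block.data[10] ^= 0x04
corrupted_ok = block.verify()
```

CRC32 is guaranteed to detect any single-bit error, so `verify()` fails after the flip. A serving system would presumably run such a check before a shared block is reused by a new request, but that placement is also an assumption here.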

Abstract

Rowhammer on GPU DRAM has enabled adversarial bit flips in model weights; shared KV-cache blocks in LLM serving systems present an analogous but previously unexamined target. In vLLM's Prefix Caching, these blocks exist as a single physical copy without integrity protection. Using software fault injection under ideal bit targeting, we characterize worst-case severity and identify three properties: (1) Silent divergence: 13 of 16 BF16 bit positions produce coherent but altered outputs, indistinguishable from legitimate responses without a clean baseline. (2) Selective propagation: only requests sharing the targeted prefix are affected. (3) Persistent accumulation: no temporal decay occurs, so cumulative damage grows linearly with subsequent requests. Together, these constitute a threat profile distinct from weight corruption: silent divergence and selective propagation enable…
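To make the notion of a BF16 bit position concrete, the sketch below flips one bit of a single BF16 value using only the Python standard library. It is not the paper's fault-injection tooling: the helper names are hypothetical, and the float32-to-BF16 conversion truncates rather than rounds, which is a simplification. BF16 has 1 sign bit (position 15), 8 exponent bits (positions 14 to 7), and 7 mantissa bits (positions 6 to 0).

```python
import math
import struct


def float_to_bf16_bits(x: float) -> int:
    """Return the 16-bit BF16 pattern of x (upper half of its float32 encoding).

    Assumption: truncation instead of round-to-nearest, for simplicity.
    """
    (bits32,) = struct.unpack("<I", struct.pack("<f", x))
    return bits32 >> 16


def bf16_bits_to_float(bits16: int) -> float:
    """Decode a 16-bit BF16 pattern back into a Python float."""
    (x,) = struct.unpack("<f", struct.pack("<I", (bits16 & 0xFFFF) << 16))
    return x


def flip_bit(x: float, position: int) -> float:
    """Flip one bit (0 = mantissa LSB, 15 = sign) of x's BF16 encoding."""
    if not 0 <= position < 16:
        raise ValueError("BF16 has bit positions 0..15")
    return bf16_bits_to_float(float_to_bf16_bits(x) ^ (1 << position))


sign_flip = flip_bit(1.0, 15)      # sign bit
mantissa_flip = flip_bit(1.0, 0)   # least significant mantissa bit
exponent_flip = flip_bit(1.0, 14)  # most significant exponent bit
```

For the value 1.0, flipping the sign bit gives -1.0, flipping the lowest mantissa bit gives 1.0078125, and flipping the top exponent bit gives infinity. The spread from a small perturbation to a non-finite value shows why the effect of a flip depends on its position; which positions the paper classifies as silent is not stated in the visible part of the abstract.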
