From Similarity to Vulnerability: Key Collision Attack on LLM Semantic Caching
Zhixiang Zhang, Zesen Liu, Yuchong Xie, Quanfeng Huang, Dongdong She

TL;DR
This paper reveals that semantic caching in large language models is vulnerable to key collision attacks due to an inherent trade-off between cache efficiency and security, demonstrating practical attack methods and potential mitigation strategies.
Contribution
It introduces CacheAttack, the first systematic framework for black-box collision attacks on semantic cache keys in LLMs, highlighting security vulnerabilities and proposing defenses.
Findings
CacheAttack achieves an 86% hit rate in hijacking LLM responses.
Collision attacks can induce malicious behaviors in LLM agents.
Vulnerabilities transfer across different embedding models.
Abstract
Semantic caching has emerged as a pivotal technique for scaling LLM applications, widely adopted by major providers including AWS and Microsoft. By utilizing semantic embedding vectors as cache keys, this mechanism effectively minimizes latency and redundant computation for semantically similar queries. In this work, we conceptualize semantic cache keys as a form of fuzzy hashes. We demonstrate that the locality required to maximize cache hit rates fundamentally conflicts with the cryptographic avalanche effect necessary for collision resistance. Our conceptual analysis formalizes this inherent trade-off between performance (locality) and security (collision resilience), revealing that semantic caching is naturally vulnerable to key collision attacks. While prior research has focused on side-channel and privacy risks, we present the first systematic study of integrity risks arising…
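The locality-versus-avalanche tension described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the bag-of-words `embed` function stands in for a real embedding model, the `SemanticCache` class and its `threshold` parameter are hypothetical, and the example shows only why similarity-based keys admit collisions, not the CacheAttack method itself.

```python
import hashlib
import math
from collections import Counter


def embed(text):
    # Toy stand-in for an embedding model (assumption): bag-of-words counts.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    # Hypothetical cache: a lookup hits when similarity >= threshold.
    def __init__(self, threshold=0.8):
        self.threshold = threshold
        self.entries = []  # list of (embedding, cached response)

    def put(self, query, response):
        self.entries.append((embed(query), response))

    def get(self, query):
        q = embed(query)
        best_score, best_response = 0.0, None
        for key, response in self.entries:
            score = cosine(q, key)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


def exact_key(text):
    # Cryptographic key: any change to the input yields an unrelated digest.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


cache = SemanticCache(threshold=0.8)
# One user (possibly an attacker) populates the cache entry.
cache.put("how do I reset my password", "response planted by the first user")

# A different query that lands close enough in embedding space gets the
# planted response: locality is what makes the cache useful, and exploitable.
print(cache.get("how do I reset my password please"))

# An unrelated query misses the cache.
print(cache.get("what is the weather today"))

# With exact hashing the two similar queries never share a key.
print(exact_key("how do I reset my password")
      == exact_key("how do I reset my password please"))
```

In this sketch, lowering `threshold` raises the hit rate but widens the set of distinct queries that map to the same entry, while an exact SHA-256 key removes collisions at the cost of never matching paraphrases.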
Taxonomy
Topics: Security and Verification in Computing · Distributed systems and fault tolerance · Caching and Content Delivery
