# Enhancing adversarial resilience in semantic caching for secure retrieval augmented generation systems

**Authors:** Mohanad Afiffy, Mohamed Waleed Fakhr, Fahima A. Maghraby

PMC · DOI: 10.1038/s41598-026-36721-w · 2026-02-11

## TL;DR

This paper introduces SAFE-CACHE, a semantic caching method for retrieval-augmented generation (RAG) systems that lowers the success rate of adversarial queries designed to trigger incorrect cache hits.

## Contribution

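SAFE-CACHE matches incoming queries against cluster centroids instead of individual cached query embeddings (the approach used by GPTCache), which strengthens semantic validation and makes the cache harder to exploit with adversarial queries.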

## Key findings

- SAFE-CACHE lowers the adversarial attack success rate to 14.27%, compared with 52.77% for GPTCache.
- The authors describe this as up to a 72% improvement in adversarial resistance, achieved with cluster-centroid-based caching.
- Unsupervised clustering of historical query–answer pairs, combined with statistical detection of noisy clusters, supports stronger semantic validation than matching against single cached queries.
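
The summary does not state how the 72% figure was derived. One plausible reading, offered here only as an assumption, is a relative reduction in attack success rate. The short check below uses the two reported rates; the function name `relative_reduction` is illustrative and does not come from the paper.

```python
def relative_reduction(baseline_pct: float, new_pct: float) -> float:
    """Relative drop from baseline_pct to new_pct, expressed in percent."""
    return (baseline_pct - new_pct) / baseline_pct * 100.0


# Attack success rates reported in the key findings.
gptcache_asr = 52.77
safe_cache_asr = 14.27

# Absolute drop in percentage points, and relative drop in percent.
absolute_drop = gptcache_asr - safe_cache_asr
relative_drop = relative_reduction(gptcache_asr, safe_cache_asr)

print(f"{absolute_drop:.2f} points, {relative_drop:.2f}% relative")
# prints: 38.50 points, 72.96% relative
```

Under this reading, the reported rates give a relative reduction of roughly 73%, which is in line with the "up to 72%" stated by the authors; the exact figure may depend on the evaluation setting they refer to.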

## Abstract

Large Language Models (LLMs) combined with Retrieval-Augmented Generation (RAG) frameworks greatly improve natural language processing performance, but they incur substantial computational overhead because many similar queries are processed repeatedly. To mitigate this, semantic caching has been introduced to store past responses and reuse them for semantically similar inputs, thereby reducing computation costs. Yet, semantic caching mechanisms that depend only on semantic similarity are vulnerable to adversarial exploitation: carefully engineered malicious queries with minor lexical variations can trigger incorrect cache hits, undermining both the reliability and the security of the system. This paper examines security vulnerabilities in semantic proximity caching systems such as GPTCache, a widely used open-source semantic cache that exemplifies these issues, and introduces a new approach called SAFE-CACHE, which is built to withstand adversarial attacks. SAFE-CACHE adopts a cluster-centroid-based caching strategy that is fundamentally distinct from GPTCache’s single-query embedding method. It uses unsupervised clustering of historical query–answer pairs, statistical detection of noisy clusters, bi-encoder–based refinement, and conditional cluster enrichment driven by a fine-tuned lightweight LLM to infer the underlying intent of cached queries. During runtime, incoming queries are compared to cluster centroids instead of individual cached entries, enabling stronger semantic validation and improved resilience against adversarial behavior. Our experimental evaluations demonstrate that SAFE-CACHE dramatically reduces adversarial attack success rates from 52.77% to 14.27% compared to GPTCache, representing up to 72% improvement in adversarial resistance.
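
The runtime step described in the abstract, comparing an incoming query to cluster centroids instead of individual cached entries, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name `CentroidCache`, the cosine-similarity measure, the 0.95 threshold, and the toy 2-D vectors standing in for real embeddings. The sketch omits the noisy-cluster detection, bi-encoder refinement, and LLM-driven enrichment stages, and it is not the authors' implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


class CentroidCache:
    """Toy cache that stores one centroid and one answer per cluster."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold  # hypothetical similarity cutoff
        self.clusters = []          # list of (centroid, answer) pairs

    def add_cluster(self, member_embeddings, answer):
        # Members are assumed to be already grouped by an offline clustering step.
        self.clusters.append((centroid(member_embeddings), answer))

    def lookup(self, query_embedding):
        # Find the most similar centroid; return its answer only above threshold.
        best_score, best_answer = -1.0, None
        for center, answer in self.clusters:
            score = cosine(query_embedding, center)
            if score > best_score:
                best_score, best_answer = score, answer
        return best_answer if best_score >= self.threshold else None


cache = CentroidCache(threshold=0.95)
cache.add_cluster([[1.0, 0.0], [0.9, 0.1]], "answer A")
cache.add_cluster([[0.0, 1.0], [0.1, 0.9]], "answer B")

hit = cache.lookup([0.95, 0.05])  # close to cluster A's centroid
miss = cache.lookup([0.7, 0.7])   # between clusters, below the threshold
print(hit, miss)
```

In this toy setup a query must be close to the average of a whole cluster to produce a hit, so a query that merely resembles one cached entry is less likely to be served a cached answer. Whether that intuition matches the mechanism evaluated in the paper should be checked against the full text.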

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12894985/full.md

---
Source: https://tomesphere.com/paper/PMC12894985