Efficient Privacy-Preserving Retrieval Augmented Generation with Distance-Preserving Encryption
Huanyi Ye, Jiale Guo, Ziyao Liu, Kwok-Yan Lam

TL;DR
This paper introduces ppRAG, a novel privacy-preserving retrieval system for LLMs that uses distance-preserving encryption to protect user data in untrusted cloud environments while maintaining efficiency and accuracy.
Contribution
The paper presents CAPRISE, a new encryption scheme that preserves relative distances for secure similarity computation, and combines it with differential privacy to enhance privacy in retrieval-augmented generation.
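The property named above, that similarity can be ranked on protected vectors because relative distances survive the transformation, can be illustrated with a toy sketch. This is not CAPRISE and offers no real security: it assumes a hypothetical secret key made of a coordinate permutation, sign flips, and a positive scale factor, which multiplies every Euclidean distance by the same constant and therefore keeps nearest-neighbor rankings intact. The differential privacy component is not modeled, and the names `keygen`, `transform`, and `nearest` are illustrative only.

```python
import math
import random


def keygen(dim, seed):
    """Toy secret key: a coordinate permutation, sign flips, and a positive scale."""
    rng = random.Random(seed)
    perm = list(range(dim))
    rng.shuffle(perm)
    signs = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
    scale = rng.uniform(0.5, 2.0)
    return perm, signs, scale


def transform(vec, key):
    """Apply the signed permutation, then scale (assumption: NOT the paper's scheme)."""
    perm, signs, scale = key
    return [scale * signs[i] * vec[perm[i]] for i in range(len(vec))]


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query, docs):
    """Index of the document vector closest to the query."""
    return min(range(len(docs)), key=lambda i: dist(query, docs[i]))


key = keygen(dim=4, seed=7)
docs = [
    [0.1, 0.9, 0.0, 0.2],
    [0.8, 0.1, 0.3, 0.0],
    [0.4, 0.4, 0.4, 0.4],
]
query = [0.7, 0.2, 0.2, 0.1]

# Ranking on plaintext embeddings.
plain_best = nearest(query, docs)

# Ranking on transformed embeddings, as an untrusted store would compute it.
enc_docs = [transform(d, key) for d in docs]
enc_query = transform(query, key)
enc_best = nearest(enc_query, enc_docs)

print(plain_best, enc_best)
```

Because a signed permutation is an orthogonal map, every transformed distance equals the original distance times the secret scale, so both rankings pick the same document. A practical scheme also has to resist the vector analysis and query analysis attacks listed in the findings, which this sketch does not attempt.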
Findings
Achieves high retrieval accuracy with strong privacy guarantees.
Maintains efficient processing throughput suitable for resource-constrained users.
Effectively defends against vector-to-text, vector analysis, and query analysis attacks.
Abstract
Retrieval-augmented generation (RAG) has emerged as a key technique for enhancing the response quality of large language models (LLMs) without high computational cost. In traditional architectures, RAG services are provided by a single entity that hosts the dataset within a trusted local environment. However, individuals and small organizations often lack the resources to maintain data storage servers, leading them to rely on outsourced cloud storage. This dependence on untrusted third-party services introduces privacy risks. Embedding-based retrieval mechanisms, commonly used in RAG systems, are vulnerable to privacy leakage such as vector-to-text reconstruction attacks and structural leakage via vector analysis. Several privacy-preserving RAG techniques have been proposed, but most existing approaches rely on partially homomorphic encryption, which incurs substantial computational overhead. To address these challenges, we propose an efficient…
Taxonomy
Topics: Cryptography and Data Security · Cloud Data Security Solutions · Privacy-Preserving Technologies in Data
