PRAG: End-to-End Privacy-Preserving Retrieval-Augmented Generation

Zhijun Li; Minghui Xu; Huayi Qi; Wenxuan Yu; Tingchuang Zhang; Qiao Zhang; GuangYong Shang; Zhen Ma; Xiuzhen Cheng

arXiv:2604.26525 · cs.CR · May 1, 2026

TL;DR

PRAG is a privacy-preserving retrieval-augmented generation (RAG) system that keeps both documents and queries confidential end to end while retaining the retrieval quality and scalability of cloud-hosted RAG.

Contribution

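PRAG introduces a dual-mode architecture, with a non-interactive mode (PRAG-I) and an interactive, client-assisted mode (PRAG-II), together with Operation-Error Estimation (OEE), a mechanism intended to keep semantic ordering robust under homomorphic computation.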

Findings

01 PRAG achieves 72.45% to 74.45% recall on large datasets.

02 PRAG maintains practical retrieval latency.

03 PRAG demonstrates resilience against graph reconstruction attacks.

Abstract

Retrieval-Augmented Generation (RAG) is essential for enhancing Large Language Models (LLMs) with external knowledge, but its reliance on cloud environments exposes sensitive data to privacy risks. Existing privacy-preserving solutions often sacrifice retrieval quality because of noise injection, or provide only partial encryption. We propose PRAG, a privacy-preserving RAG system that achieves end-to-end confidentiality for both documents and queries without sacrificing the scalability of cloud-hosted RAG. PRAG features a dual-mode architecture: the non-interactive PRAG-I uses homomorphic-friendly approximations for low-latency retrieval, while the interactive PRAG-II leverages client assistance to match the accuracy of non-private RAG. To ensure robust semantic ordering, we introduce Operation-Error Estimation (OEE), a mechanism that stabilizes ranking against homomorphic…
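
To make the phrase "homomorphic-friendly approximations" concrete, here is a minimal plaintext sketch in Python of one well-known idea from the homomorphic encryption literature: replacing a comparison with an iterated low-degree polynomial that uses only additions and multiplications. This is an illustration under stated assumptions, not PRAG's actual method. The function names (`approx_sign`, `approx_greater`, `rank_scores`), the polynomial, and the iteration count are all hypothetical choices made for this example, and no encryption is performed.

```python
# Illustrative only: a plaintext simulation of a polynomial comparison,
# the kind of operation that can be evaluated with additions and
# multiplications alone. Not taken from the PRAG paper.

def approx_sign(x, iterations=6):
    """Approximate sign(x) for x in [-1, 1] by iterating f(x) = (3x - x^3) / 2."""
    for _ in range(iterations):
        x = (3.0 * x - x ** 3) / 2.0
    return x


def approx_greater(a, b, iterations=6):
    """Return a value near 1 if a > b and near 0 if a < b, for a, b in [0, 1]."""
    return (approx_sign(a - b, iterations) + 1.0) / 2.0


def rank_scores(scores, iterations=6):
    """Rank indices from highest to lowest score using approximate pairwise wins."""
    wins = []
    for i, s in enumerate(scores):
        total = 0.0
        for j, t in enumerate(scores):
            if j != i:
                total += approx_greater(s, t, iterations)
        wins.append(total)
    return sorted(range(len(scores)), key=lambda i: -wins[i])


if __name__ == "__main__":
    similarities = [0.2, 0.9, 0.5]          # hypothetical similarity scores
    print(rank_scores(similarities))         # indices ordered best-first
    print(approx_greater(0.9, 0.2))          # well-separated pair
    print(approx_greater(0.505, 0.5))        # near-tie: result stays close to 0.5
```

With well-separated scores the approximate comparison is nearly exact, but for near-ties the output stays close to 0.5 unless more iterations (and hence more multiplicative depth) are spent. In an encrypted setting, scheme noise would be added on top of this approximation error. That is the general kind of ranking instability the abstract alludes to; how OEE addresses it is not described in the text shown here.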
