Private-RAG: Answering Multiple Queries with LLMs while Keeping Your Data Private

Ruihan Wu; Erchi Wang; Zhiyuan Zhang; Yu-Xiang Wang

arXiv:2511.07637·cs.LG·November 12, 2025


Open Access · 3 Reviews

TL;DR

This paper introduces two differentially private RAG algorithms for the multi-query setting, so that an LLM can answer many queries over a sensitive corpus while retaining meaningful utility.

Contribution

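It proposes MURAG and MURAG-ADA, which extend differential privacy guarantees for RAG from the single-query setting to multiple queries. MURAG uses an individual privacy filter so that accumulated privacy loss depends on how often each document is retrieved rather than on the total number of queries. MURAG-ADA additionally releases query-specific thresholds privately, which allows more precise selection of relevant documents.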

Findings

01 Scalable to hundreds of queries within a practical privacy budget

02 Maintain meaningful utility while ensuring privacy

03 Effective across multiple LLMs and datasets

Abstract

Retrieval-augmented generation (RAG) enhances large language models (LLMs) by retrieving documents from an external corpus at inference time. When this corpus contains sensitive information, however, unprotected RAG systems are at risk of leaking private information. Prior work has introduced differential privacy (DP) guarantees for RAG, but only in single-query settings, which fall short of realistic usage. In this paper, we study the more practical multi-query setting and propose two DP-RAG algorithms. The first, MURAG, leverages an individual privacy filter so that the accumulated privacy loss only depends on how frequently each document is retrieved rather than the total number of queries. The second, MURAG-ADA, further improves utility by privately releasing query-specific thresholds, enabling more precise selection of relevant documents. Our experiments across multiple LLMs and…
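To make the per-document accounting idea concrete, here is a minimal Python sketch. It is not the authors' MURAG implementation: the names PerDocumentFilter and answer_queries, the fixed cost_per_use, and the fixed relevance threshold are all hypothetical, and no noise is added, so the sketch on its own provides no differential privacy guarantee. It only illustrates how budget can be charged to the documents a query actually uses, rather than to every document on every query.

```python
from dataclasses import dataclass, field


@dataclass
class PerDocumentFilter:
    """Toy per-document budget tracker (hypothetical, not the paper's filter)."""
    budget: float        # assumed total budget each document may spend
    cost_per_use: float  # assumed charge each time a document is used
    spent: dict = field(default_factory=dict)

    def active(self, doc_id):
        # A document stays usable only if one more charge fits in its budget.
        return self.spent.get(doc_id, 0.0) + self.cost_per_use <= self.budget

    def charge(self, doc_id):
        self.spent[doc_id] = self.spent.get(doc_id, 0.0) + self.cost_per_use


def answer_queries(scores_per_query, threshold, filt):
    """For each query, use documents that pass the threshold and still have budget."""
    used = []
    for scores in scores_per_query:
        selected = [d for d, s in scores.items()
                    if s >= threshold and filt.active(d)]
        for d in selected:
            filt.charge(d)  # only documents actually used are charged
        used.append(sorted(selected))
    return used


# Made-up relevance scores for three queries over two documents.
scores = [
    {"a": 0.9, "b": 0.1},
    {"a": 0.8, "b": 0.2},
    {"a": 0.95, "b": 0.7},
]
filt = PerDocumentFilter(budget=1.0, cost_per_use=0.5)
used = answer_queries(scores, threshold=0.5, filt=filt)
print(used)
print(filt.spent)
```

In this toy run, document "a" is used by the first two queries and then retired, while document "b" is charged only once even though three queries were answered. A real mechanism would also have to account for the fact that the selection step itself depends on the private data.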

Peer Reviews

Decision·Submitted to ICLR 2026

Reviewer 01 · Rating 6 · Confidence 1

Strengths

1. Practical and Timely Problem: The paper tackles a real-world limitation of current DP-RAG systems (scalability to multiple queries), making DP-RAG viable for production deployments in sensitive domains like healthcare or legal services.
2. Comprehensive Evaluation: The experiments span diverse datasets, LLMs, query correlation regimes, and include both utility metrics and robustness against a strong multi-query membership inference attack (Interrogation Attack). The results consistently validat…

Weaknesses

I must candidly note that I am not deeply familiar with the technical nuances of differential privacy (DP), particularly advanced topics such as Rényi DP filters and individual privacy accounting. Consequently, I may have missed potential weaknesses in the paper’s privacy analysis, algorithmic design, or theoretical claims. I recommend that the reviewers with stronger expertise in DP carefully examine the correctness and novelty of the privacy guarantees and the implementation details of the pro…

Reviewer 02 · Rating 6 · Confidence 4

Strengths

1. The paper addresses a relevant and timely problem: how to make RAG systems private when handling many user queries.
2. The idea of tracking privacy at the document level rather than per query is simple but effective, and it clearly reduces the cost of privacy accounting.
3. The two proposed variants are well motivated for different query patterns.
4. The experiments cover a good range of datasets and models, showing solid performance under realistic privacy budgets.

Weaknesses

1. Practical limitation: The proposed method is well-motivated and theoretically sound. However, in realistic deployments there are often scenarios where a single query (or a group of related queries) has high semantic overlap with a large portion of the corpus. In such settings, the thresholding mechanism would struggle to filter out many documents, so most entries would still be “touched” and thus consume privacy budget. This situation can substantially reduce the efficiency advantage of per-doc…

Reviewer 03 · Rating 2 · Confidence 4

Strengths

1. The paper addresses the relevant problem of privacy protection in multi-query RAG systems.
2. The experimental evaluation covers multiple LLMs and includes privacy attack assessments.

Weaknesses

1. The evaluation focuses primarily on privacy budget consumption while providing insufficient analysis of system complexity costs introduced by per-document budget management. Maintaining individual states for large numbers of documents may introduce significant computational overhead and storage requirements.
2. The technical contribution is mainly at the engineering design level with relatively limited core algorithmic innovation. While per-document budget management and threshold screening a…


Taxonomy

Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Topic Modeling