E-MIA: Exam-Style Black-Box Membership Inference Attacks against RAG Systems
Zelin Guan, Shengda Zhuo, Zeyan Li, Jinchun He, Wangjie Qiu, Zhiming Zheng, Shuqiang Huang

TL;DR
E-MIA introduces a novel black-box membership inference attack on RAG systems by converting verifiable document evidence into an exam format, enhancing detection accuracy while maintaining stealth.
Contribution
The paper presents E-MIA, a new method that uses gradable exam questions based on document evidence to improve membership inference in RAG systems.
Findings
E-MIA outperforms existing methods in member/non-member separation.
E-MIA maintains stealthiness with natural, inconspicuous queries.
Exam length and question composition influence attack effectiveness.
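The exam-based decision summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `ExamQuestion`, `exam_score`, and `infer_membership`, the substring-based grading rule, and the 0.5 threshold are all assumptions made for illustration. The `ask` callable stands in for a black-box query to the target RAG system.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class ExamQuestion:
    """Hypothetical container: one gradable question built from a document detail."""
    prompt: str
    answer: str  # the verifiable detail taken from the candidate document


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so grading ignores trivial formatting."""
    return " ".join(text.lower().split())


def exam_score(questions: Sequence[ExamQuestion], ask: Callable[[str], str]) -> float:
    """Fraction of questions whose expected answer appears in the system's reply.

    Substring matching is an assumed grading rule, chosen only for simplicity.
    """
    if not questions:
        raise ValueError("the exam needs at least one question")
    correct = sum(normalize(q.answer) in normalize(ask(q.prompt)) for q in questions)
    return correct / len(questions)


def infer_membership(
    questions: Sequence[ExamQuestion],
    ask: Callable[[str], str],
    threshold: float = 0.5,  # assumed value; in practice it would be calibrated
) -> bool:
    """Predict 'member' when the exam score reaches the threshold."""
    return exam_score(questions, ask) >= threshold
```

Under these assumptions, a longer exam gives a finer-grained score (steps of 1/n for n questions), which is one plausible reason exam length could affect how cleanly members and non-members separate.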
Abstract
Retrieval-Augmented Generation (RAG) equips large language models (LLMs) with external evidence by retrieving documents at inference time, but it also turns the retrieval corpus into a sensitive asset. Under a black-box setting, an adversary given a candidate document can infer whether it has been ingested into the RAG knowledge base (i.e., document-level membership inference) solely from query-response interactions, thereby leaking corpus coverage and the existence of sensitive topics. Existing RAG MIA methods either rely on soft signals such as semantic similarity, which often yield overlapping member/non-member score distributions and unstable thresholds, or employ explicit confirmation probes whose intent is conspicuous and thus prone to refusal and detection. We propose E-MIA, which converts verifiable hard evidence in the target document (e.g., fine-grained details, proper…
