Retrieval-Confused Generation is a Good Defender for Privacy Violation Attack of Large Language Models
Wanli Peng, Xin Chen, Hang Fu, XinYu He, Xue Yiming, Juan Wen

TL;DR
This paper introduces a retrieval-confused generation method that effectively defends large language models against privacy violation attacks by generating misleading responses through a novel retrieval strategy, improving privacy protection without high inference costs.
Contribution
The paper proposes a novel retrieval-confused generation framework that covertly defends against privacy attacks by rewriting queries and retrieving irrelevant data, enhancing privacy without costly inference or exposing defense strategies.
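The "retrieve irrelevant data" step can be illustrated with a toy sketch. This is not the paper's implementation: the bag-of-words similarity, the least-similar selection rule, the prompt template, and every function name below are assumptions made purely for illustration. The query rewriting, which the paper performs with an LLM paraphrasing prompt, is assumed to have already happened.

```python
import math
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercased token counts; a stand-in for a real text embedding (assumption)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_least_relevant(query: str, pool: list[str]) -> str:
    """Hypothetical retrieval rule: pick the pool entry LEAST similar to the query."""
    q = bag_of_words(query)
    return min(pool, key=lambda doc: cosine(q, bag_of_words(doc)))


def build_confused_prompt(rewritten_comments: str, pool: list[str]) -> str:
    """Pair the (already rewritten) comments with a decoy context.

    The template is illustrative only; the paper's prompt is not reproduced here.
    """
    decoy = retrieve_least_relevant(rewritten_comments, pool)
    return f"Context: {decoy}\nComments: {rewritten_comments}"


pool = [
    "I take the tram past the lake in Zurich every morning",
    "My surf board needs wax before the beach trip in Sydney",
]
query = "the tram in Zurich was late again this morning"
decoy = retrieve_least_relevant(query, pool)
prompt = build_confused_prompt(query, pool)
```

In this toy run the Sydney sentence is selected as the decoy because it shares the fewest tokens with the query. A real system would presumably use dense embeddings and a much larger retrieval database.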
Findings
Effective in defending against privacy violation attacks
Outperforms existing anonymization methods in experiments
Works across multiple datasets and large language models
Abstract
Recent advances in large language models (LLMs) have had a profound impact on society and have also raised new security concerns. In particular, because of the remarkable inference ability of LLMs, the privacy violation attack (PVA) revealed by Staab et al. poses a serious threat to personal privacy. Existing defenses mainly use LLMs to anonymize the input query, which is costly at inference time and does not achieve satisfactory defense performance. Directly rejecting the PVA query may seem like an effective defense, but it exposes the defense to the attacker and thereby promotes the evolution of PVA. In this paper, we propose a novel defense paradigm based on retrieval-confused generation (RCG) of LLMs, which can efficiently and covertly defend against PVA. We first design a paraphrasing prompt to induce the LLM to rewrite the "user comments" of the attack query to construct a…
