PIG: Privacy Jailbreak Attack on LLMs via Gradient-based Iterative In-Context Optimization
Yidan Wang, Yanan Cao, Yubing Ren, Fang Fang, Zheng Lin, Binxing Fang

TL;DR
This paper introduces PIG, a gradient-based iterative framework that extracts sensitive personally identifiable information (PII) from LLMs, revealing significant privacy risks and surpassing existing jailbreak methods in privacy leakage scenarios.
Contribution
PIG is a novel framework that combines in-context learning and gradient strategies to improve privacy breach effectiveness in LLMs, addressing limitations of current jailbreak techniques.
Findings
PIG outperforms baseline jailbreak methods in extracting PII.
Experiments show PIG achieves state-of-the-art results across multiple LLMs.
Significant privacy risks are demonstrated in current LLMs.
Abstract
Large Language Models (LLMs) excel in various domains but pose inherent privacy risks. Existing methods to evaluate privacy leakage in LLMs often use memorized prefixes or simple instructions to extract data, both of which well-aligned models can easily block. Meanwhile, jailbreak attacks bypass LLM safety mechanisms to generate harmful content, but their role in privacy scenarios remains underexplored. In this paper, we examine the effectiveness of jailbreak attacks in extracting sensitive information, bridging privacy leakage and jailbreak attacks in LLMs. Moreover, we propose PIG, a novel framework targeting Personally Identifiable Information (PII) and addressing the limitations of current jailbreak methods. Specifically, PIG identifies PII entities and their types in privacy queries, uses in-context learning to build a privacy context, and iteratively updates it with three…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Big Data and Digital Economy
