ContextLeak: Auditing Leakage in Private In-Context Learning Methods
Jacob Choi, Shuying Cao, Xingjian Dong, Amin Banayeeanzade, Wang Bill Zhu, Robin Jia, Sai Praneeth Karimireddy

TL;DR
ContextLeak is a framework for empirically measuring worst-case information leakage in private in-context learning, revealing privacy risks and limitations of current methods.
Contribution
It introduces the first empirical framework to detect leakage in private ICL methods using canary insertion and targeted queries.
Findings
ContextLeak reliably detects leakage across various methods.
Leakage increases with the theoretical privacy budget.
Existing methods often leak sensitive information or degrade utility.
Abstract
In-Context Learning (ICL) has become a standard technique for adapting Large Language Models (LLMs) to specialized tasks by supplying task-specific exemplars within the prompt. However, when these exemplars contain sensitive information, reliable privacy-preserving mechanisms are essential to prevent unintended leakage through model outputs. Many privacy-preserving methods have been proposed to protect against information leakage in this context, but far less work has addressed how to audit these methods. We introduce ContextLeak, the first framework to empirically measure the worst-case information leakage in ICL. ContextLeak uses canary insertion: it embeds uniquely identifiable tokens in the sensitive dataset and crafts targeted queries to detect their presence. We apply ContextLeak across a range of private ICL techniques, including both heuristic prompt-based defenses and…
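The canary-insertion idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: `toy_icl_model`, its `leak_prob` parameter, the canary format, and the query wording are all hypothetical stand-ins chosen for illustration. The sketch plants a unique token in the exemplar set, sends a targeted query, and compares how often the token appears in the output with and without the canary present.

```python
import random
import secrets


def make_canary():
    """Return a unique token that is very unlikely to occur by chance."""
    return "canary-" + secrets.token_hex(8)


def insert_canary(exemplars, canary):
    """Return a copy of the exemplars with one extra record carrying the canary."""
    return list(exemplars) + ["The secret code is " + canary + "."]


def targeted_query():
    """A query crafted to elicit the canary if it is in the context."""
    return "What is the secret code?"


def toy_icl_model(exemplars, query, leak_prob, rng):
    """Hypothetical stand-in for a private ICL method (not a real defense).

    With probability leak_prob it repeats a matching exemplar verbatim;
    otherwise it refuses. The query is ignored in this toy version.
    """
    for ex in exemplars:
        if "secret code" in ex and rng.random() < leak_prob:
            return ex
    return "I cannot share that."


def audit(exemplars, leak_prob, trials=1000, seed=0):
    """Estimate canary detection rates with and without the canary present.

    Returns (tpr, fpr): the fraction of trials in which the canary string
    appears in the output when it was inserted, and when it was not.
    """
    rng = random.Random(seed)
    canary = make_canary()
    with_canary = insert_canary(exemplars, canary)
    query = targeted_query()

    hits_in = sum(
        canary in toy_icl_model(with_canary, query, leak_prob, rng)
        for _ in range(trials)
    )
    hits_out = sum(
        canary in toy_icl_model(exemplars, query, leak_prob, rng)
        for _ in range(trials)
    )
    return hits_in / trials, hits_out / trials


if __name__ == "__main__":
    data = ["Review: great movie. Label: positive"]
    print(audit(data, leak_prob=0.3))
```

A large gap between the two rates indicates that the output reveals whether the canary was in the sensitive data. A real audit would replace `toy_icl_model` with the private ICL method under test.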
