Membership Inference Attacks Against In-Context Learning

Rui Wen; Zheng Li; Michael Backes; Yang Zhang

arXiv:2409.01380 · cs.CR · September 4, 2024 · 2 citations

TL;DR

This paper introduces the first membership inference attack tailored to In-Context Learning (ICL) in large language models. The attack relies only on generated text, reaches up to a 95% accuracy advantage against LLaMA, and is accompanied by defenses to mitigate the privacy risk.

Contribution

The paper presents four attack strategies for ICL under different constrained scenarios plus a hybrid attack, evaluates them on four popular large language models, and explores defense mechanisms to improve privacy protection.

Findings

01 Attacks achieve up to a 95% accuracy advantage in membership inference (reported against LLaMA).

02 The hybrid attack outperforms the individual strategies in most cases.

03 Combining multiple defenses significantly reduces privacy leakage.

Abstract

Adapting Large Language Models (LLMs) to specific tasks introduces concerns about computational efficiency, prompting an exploration of efficient methods such as In-Context Learning (ICL). However, the vulnerability of ICL to privacy attacks under realistic assumptions remains largely unexplored. In this work, we present the first membership inference attack tailored for ICL, relying solely on generated texts without their associated probabilities. We propose four attack strategies tailored to various constrained scenarios and conduct extensive experiments on four popular large language models. Empirical results show that our attacks can accurately determine membership status in most cases, e.g., 95% accuracy advantage against LLaMA, indicating that the associated risks are much higher than those shown by existing probability-based attacks. Additionally, we propose a hybrid attack that…
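The abstract reports an "accuracy advantage" without defining it on this page. As a rough illustration only, the sketch below uses one common convention from the membership inference literature, where advantage is the true positive rate minus the false positive rate; on a balanced evaluation set this equals 2 × (accuracy − 0.5). The function names and the toy labels are assumptions made for this example and are not taken from the paper.

```python
def accuracy(predictions, labels):
    """Fraction of membership guesses that match the ground truth."""
    correct = sum(int(bool(p) == bool(y)) for p, y in zip(predictions, labels))
    return correct / len(labels)


def membership_advantage(predictions, labels):
    """Advantage as TPR - FPR (an assumed convention, not the paper's stated one).

    predictions: 1 if the attack guesses "member", else 0
    labels:      1 if the sample really was in the prompt, else 0
    """
    member_preds = [p for p, y in zip(predictions, labels) if y]
    non_member_preds = [p for p, y in zip(predictions, labels) if not y]
    if not member_preds or not non_member_preds:
        raise ValueError("need at least one member and one non-member")
    tpr = sum(member_preds) / len(member_preds)
    fpr = sum(non_member_preds) / len(non_member_preds)
    return tpr - fpr


# Toy, hand-made data: 4 members and 4 non-members (balanced).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
preds = [1, 1, 1, 0, 0, 0, 0, 1]

adv = membership_advantage(preds, labels)
acc = accuracy(preds, labels)
print(f"accuracy={acc:.2f} advantage={adv:.2f}")
```

Under this convention, a random-guess attack on a balanced set has advantage 0 and a perfect attack has advantage 1, so a 95% advantage would correspond to roughly 97.5% accuracy on balanced data.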

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Machine Learning in Healthcare

Methods: LLaMA