Data Poisoning for In-context Learning

Pengfei He; Han Xu; Yue Xing; Hui Liu; Makoto Yamada; Jiliang Tang

arXiv:2402.02160 · cs.CR · June 3, 2025 · 1 citation

TL;DR

This paper investigates the vulnerability of in-context learning in large language models to data poisoning attacks, introducing ICLPoison, a framework that demonstrates significant performance degradation through strategic text perturbations.

Contribution

The paper presents ICLPoison, a novel attack framework exploiting ICL mechanisms with discrete text perturbations, revealing critical security vulnerabilities in LLMs.

Findings

01. ICL performance drops significantly under attack

02. Demonstrated effectiveness on GPT-4 and other models

03. Highlights need for defense mechanisms against data poisoning

Abstract

In the domain of large language models (LLMs), in-context learning (ICL) has been recognized for its ability to adapt to new tasks by relying on examples rather than retraining or fine-tuning. This paper examines ICL's susceptibility to data poisoning attacks, an area not yet fully explored. We ask whether ICL is vulnerable to adversaries who can manipulate example data to degrade model performance. To address this, we introduce ICLPoison, a specialized attack framework designed to exploit the learning mechanisms of ICL. Our approach employs discrete text perturbations to strategically influence the hidden states of LLMs during the ICL process. We outline three representative strategies to implement attacks under our framework, each evaluated across a variety of models and tasks. Our comprehensive tests, including…
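The general idea of a discrete text perturbation that shifts a hidden representation can be illustrated with a toy sketch. This is not the ICLPoison implementation: the "hidden state" here is a hypothetical stand-in (the mean of hash-derived token vectors, not a real LLM layer), and the names `toy_hidden_state`, `greedy_perturb`, and the candidate table are assumptions made only for illustration. The sketch greedily swaps tokens in a demonstration so that its representation moves as far as possible from the clean one, within a fixed edit budget.

```python
import hashlib
import math

DIM = 16  # size of the toy representation (arbitrary choice)


def token_vector(token):
    """Deterministic pseudo-embedding for a token (stand-in for a real model)."""
    digest = hashlib.sha256(token.lower().encode("utf-8")).digest()
    return [(b / 255.0) * 2.0 - 1.0 for b in digest[:DIM]]


def toy_hidden_state(text):
    """Mean of token vectors; a hypothetical proxy for an LLM hidden state."""
    tokens = text.split()
    if not tokens:
        return [0.0] * DIM
    vectors = [token_vector(t) for t in tokens]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def distance(a, b):
    """Euclidean distance between two representations."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def greedy_perturb(text, candidates, budget):
    """Swap up to `budget` tokens, each time picking the single replacement
    that moves the toy hidden state farthest from the clean one."""
    clean = toy_hidden_state(text)
    tokens = text.split()
    changed = set()  # positions already edited; each is edited at most once
    for _ in range(budget):
        current = distance(toy_hidden_state(" ".join(tokens)), clean)
        best = None  # (distance, position, replacement)
        for i, tok in enumerate(tokens):
            if i in changed:
                continue
            for cand in candidates.get(tok.lower(), []):
                trial = tokens[:i] + [cand] + tokens[i + 1:]
                d = distance(toy_hidden_state(" ".join(trial)), clean)
                if d > current and (best is None or d > best[0]):
                    best = (d, i, cand)
        if best is None:  # no remaining swap increases the distance
            break
        _, i, cand = best
        tokens[i] = cand
        changed.add(i)
    return " ".join(tokens)


# Hypothetical demonstration text and replacement table.
demo = "the movie was great and fun"
candidates = {"great": ["good", "fine"], "fun": ["enjoyable"]}
poisoned = greedy_perturb(demo, candidates, budget=1)
print(poisoned)
```

In a real setting the proxy would be replaced by hidden states read from an actual model, and the candidate set and objective would follow whatever the attack specifies; the greedy loop above only shows the shape of a budgeted discrete search.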

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Digital Media Forensic Detection