Data Defenses Against Large Language Models

William Agnew; Harry H. Jiang; Cella Sum; Maarten Sap; Sauvik Das

arXiv:2410.13138·cs.CL·October 18, 2024

Data Defenses Against Large Language Models

William Agnew, Harry H. Jiang, Cella Sum, Maarten Sap, Sauvik Das

PDF

Open Access 1 Repo

TL;DR

This paper introduces 'data defenses', a novel method for data owners to generate adversarial prompts that prevent large language models from accurately inferring sensitive or copyrighted information, thereby empowering data sovereignty.

Contribution

The paper develops a new technique to automatically create adversarial prompt injections that block LLM inference on protected data, addressing ethical and security concerns.

Findings

01

Data defenses significantly reduce LLM inference accuracy.

02

The method is effective against commercial and open-source LLMs.

03

Data defenses are cheap, fast, and resistant to countermeasures.

Abstract

Large language models excel at performing inference over text to extract information, summarize information, or generate additional text. These inference capabilities are implicated in a variety of ethical harms spanning surveillance, labor displacement, and IP/copyright theft. While many policy, legal, and technical mitigations have been proposed to counteract these harms, these mitigations typically require cooperation from institutions that move slower than technical advances (i.e., governments) or that have few incentives to act to counteract these harms (i.e., the corporations that create and profit from these LLMs). In this paper, we define and build "data defenses" -- a novel strategy that directly empowers data owners to block LLMs from performing inference on their data. We create data defenses by developing a method to automatically generate adversarial prompt injections that,…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

wagnew3/llmdatadefenses
noneOfficial

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsTopic Modeling · Natural Language Processing Techniques