Extracting Memorized Training Data via Decomposition

Ellen Su, Anu Vellore, Amy Chang, Raffaele Mura, Blaine Nelson, Paul Kassianik, Amin Karbasi

arXiv:2409.12367·cs.LG·October 3, 2024

TL;DR

This paper presents a simple, query-based decomposition method that extracts training data from large language models without modifying them, exposing potential security and privacy vulnerabilities.

Contribution

The paper introduces a novel, generalizable technique that uses instruction decomposition to extract training data from LLMs, and it highlights the security risks this creates.

Findings

01. Successfully extracted verbatim sentences from news articles

02. Revealed that LLMs can reproduce source training data

03. Method does not require fine-tuning or model modification

Abstract

The widespread use of Large Language Models (LLMs) in society creates new information security challenges for developers, organizations, and end-users alike. LLMs are trained on large volumes of data, and their susceptibility to revealing the exact contents of the source training datasets poses security and safety risks. Although current alignment procedures restrict common risky behaviors, they do not completely prevent LLMs from leaking data. Prior work demonstrated that LLMs may be tricked into divulging training data by using out-of-distribution queries or adversarial techniques. In this paper, we demonstrate a simple, query-based decompositional method to extract news articles from two frontier LLMs. We use instruction decomposition techniques to incrementally extract fragments of training data. Out of 3723 New York Times articles, we extract at least one verbatim sentence from 73…
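
The abstract reports results as "at least one verbatim sentence" recovered per article. The sketch below illustrates one way such a check could be computed. It is not the authors' evaluation code: the function names, the naive regex sentence splitter, the whitespace normalization, and the `min_words` threshold are all assumptions made for illustration.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter (assumption): break after ., !, or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def normalize(sentence: str) -> str:
    """Collapse runs of whitespace so line wrapping does not affect matching."""
    return " ".join(sentence.split())


def verbatim_sentences(article: str, model_output: str, min_words: int = 5) -> list[str]:
    """Return sentences from model_output that appear exactly in article.

    min_words is a hypothetical threshold that ignores very short sentences,
    which could match by coincidence.
    """
    article_sentences = {normalize(s) for s in split_sentences(article)}
    matches = []
    for sentence in split_sentences(model_output):
        candidate = normalize(sentence)
        if len(candidate.split()) >= min_words and candidate in article_sentences:
            matches.append(candidate)
    return matches


def has_verbatim_sentence(article: str, model_output: str) -> bool:
    """True when at least one sentence of the article is reproduced exactly."""
    return bool(verbatim_sentences(article, model_output))
```

Under this definition, an article counts as extracted as soon as `has_verbatim_sentence` returns `True` for any model response associated with it. A real evaluation would likely need a more robust sentence tokenizer and a decision on how to treat punctuation and casing differences.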

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning