Uncovering Latent Memories: Assessing Data Leakage and Memorization Patterns in Frontier AI Models

Sunny Duan; Mikail Khona; Abhiram Iyer; Rylan Schaeffer; Ila R Fiete

arXiv:2406.14549 · cs.CV · July 26, 2024

Open Access

TL;DR

This paper investigates how large AI models memorize and leak sensitive data, revealing the phenomenon of latent memorization that can occur without repeated exposure, and proposes a diagnostic method to detect such hidden memorization.

Contribution

It introduces the concept of latent memorization in AI models, demonstrating its occurrence during training and developing a diagnostic test to detect hidden data leakage.

Findings

01. Memorization probability scales logarithmically with data frequency.
02. Latent memorization can occur without repeated data exposure.
03. A diagnostic test effectively uncovers hidden memorized sequences.
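As a rough illustration of what the first finding means in practice, the sketch below fits a relation of the form p ≈ a + b·ln(n), where n is the number of times a sequence appears in the training data and p is the probability that it is memorized. The function name `fit_log_scaling`, the coefficients, and the input values are all assumptions made for illustration; the inputs are generated from an exact logarithmic law and are not measurements from the paper.

```python
import math


def fit_log_scaling(counts, probs):
    """Least-squares fit of probs ~= a + b * ln(counts); returns (a, b)."""
    xs = [math.log(n) for n in counts]
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(probs) / m
    # Closed-form simple linear regression on the log-transformed counts.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, probs))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Placeholder inputs (hypothetical, NOT results from the paper):
# repetition counts and probabilities drawn from an exact log law.
counts = [1, 2, 4, 8, 16, 32]
probs = [0.05 + 0.1 * math.log(n) for n in counts]

a, b = fit_log_scaling(counts, probs)
print(f"intercept a = {a:.3f}, slope b = {b:.3f}")
```

Under such a relation, each doubling of the repetition count adds a fixed increment of b·ln(2) to the memorization probability, which is what distinguishes logarithmic scaling from linear growth.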

Abstract

Frontier AI systems are making transformative impacts across society, but such benefits are not without costs: models trained on web-scale datasets containing personal and private data raise profound concerns about data privacy and security. Language models are trained on extensive corpora that include potentially sensitive or proprietary information, and the risk of data leakage, where the model's response reveals pieces of such information, remains inadequately understood. Prior work has investigated which factors drive memorization and has identified sequence complexity and the number of repetitions as the main drivers. Here, we focus on the evolution of memorization over training. We begin by reproducing findings that the probability of memorizing a sequence scales logarithmically with the number of times it is present in the data. We next show that sequences which are apparently…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Data Quality and Management
