Predicting memorization within Large Language Models fine-tuned for classification

Jérémie Dentan; Davide Buscaldi; Aymen Shabou; Sonia Vanier

arXiv:2409.18858 · cs.CR · July 16, 2025

TL;DR

This paper introduces a low-cost method for detecting, from the early stages of training, which training samples will be memorized by a large language model fine-tuned for classification, with the aim of mitigating the privacy risk that memorization poses.

Contribution

The paper presents a new approach for detecting memorized samples a priori in LLMs fine-tuned for classification. The approach is supported by new theoretical results and is adaptable to other classification settings.

Findings

01. Effective early-stage detection of memorized samples

02. Method requires low computational resources

03. Supports systematic identification of vulnerable data

Abstract

Large Language Models have received significant attention due to their ability to solve a wide range of complex tasks. However, these models memorize a significant proportion of their training data, which poses a serious threat when that data is disclosed at inference time. To mitigate this unintended memorization, it is crucial to understand which elements are memorized and why. This area of research is largely unexplored, and most existing works provide only a posteriori explanations. To address this gap, we propose a new approach to detect memorized samples a priori in LLMs fine-tuned for classification tasks. This method is effective from the early stages of training and readily adaptable to other classification settings, such as training vision models from scratch. Our method is supported by new theoretical results and requires a low computational budget. We achieve strong empirical results, paving…

Taxonomy

Topics: Natural Language Processing Techniques