Memorization in Fine-Tuned Large Language Models
Danil Savine

TL;DR
This paper explores how fine-tuning large language models, especially in sensitive domains like medicine, affects their tendency to memorize training data, highlighting the impact of different fine-tuning techniques on privacy risks.
Contribution
It provides a detailed analysis of factors influencing memorization in fine-tuned LLMs, including the roles of specific weight matrices and fine-tuning parameters, with practical insights for privacy-aware model adaptation.
Findings
Value and Output matrices contribute more to memorization than Query and Key matrices.
Lower perplexity correlates with increased memorization.
Higher LoRA ranks increase memorization with diminishing returns.
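To make concrete what "rank" and the choice of adapted matrices control, the sketch below counts the trainable parameters that LoRA adds. It relies only on the standard LoRA factorization, in which a weight update for a d_out x d_in matrix is written as B·A with A of shape (rank x d_in) and B of shape (d_out x rank). The hidden size of 4096, the 32 layers, and the function names are hypothetical values chosen for illustration; they are not taken from the paper.

```python
from typing import Sequence


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def total_lora_params(hidden: int, rank: int, adapted: Sequence[str], n_layers: int) -> int:
    """Total adapter parameters, assuming every adapted projection is hidden x hidden."""
    per_matrix = lora_trainable_params(hidden, hidden, rank)
    return per_matrix * len(adapted) * n_layers


if __name__ == "__main__":
    # Hypothetical model shape (assumption, not from the paper): hidden=4096, 32 layers.
    for rank in (4, 8, 16, 32):
        n = total_lora_params(4096, rank, ("v", "o"), 32)
        print(f"rank={rank:>2}  adapted=V,O  trainable params={n}")
```

Adapter capacity grows linearly with rank in this count, whereas the finding above reports that memorization grows with diminishing returns.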
Abstract
This study investigates the mechanisms and factors influencing memorization in fine-tuned large language models (LLMs), with a focus on the medical domain due to its privacy-sensitive nature. We examine how different aspects of the fine-tuning process affect a model's propensity to memorize training data, using the PHEE dataset of pharmacovigilance events. Our research employs two main approaches: a membership inference attack to detect memorized data, and a generation task with prompted prefixes to assess verbatim reproduction. We analyze the impact of adapting different weight matrices in the transformer architecture, the relationship between perplexity and memorization, and the effect of increasing the rank in low-rank adaptation (LoRA) fine-tuning. Key findings include: (1) Value and Output matrices contribute more significantly to memorization compared to Query and Key…
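The two evaluation approaches named in the abstract can be illustrated with a minimal sketch. It assumes per-token log-probabilities are already available from a model, uses a simple perplexity threshold as a stand-in membership inference rule, and measures verbatim reproduction as the number of leading tokens a generated continuation shares with the true continuation. The abstract does not specify the paper's actual attack or scoring, so the function names, the threshold rule, and the matching metric here are all assumptions for illustration.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity = exp of the negative mean natural-log probability per token."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def is_predicted_member(token_logprobs: Sequence[float], threshold: float) -> bool:
    """Hypothetical attack rule: flag a sample as 'seen in training' if its perplexity is low."""
    return perplexity(token_logprobs) < threshold


def verbatim_match_length(generated: Sequence[str], reference: Sequence[str]) -> int:
    """Count leading tokens of the generated continuation that match the reference exactly."""
    n = 0
    for g, r in zip(generated, reference):
        if g != r:
            break
        n += 1
    return n


if __name__ == "__main__":
    # Toy log-probabilities (made up): each token assigned probability 0.5.
    logprobs = [math.log(0.5)] * 4
    print(perplexity(logprobs))
    print(is_predicted_member(logprobs, threshold=3.0))
    # Toy token sequences (made up) for the prefix-prompted generation check.
    print(verbatim_match_length("a b c d".split(), "a b x d".split()))
```

In practice the threshold would need to be calibrated on samples known to be outside the training set; the value used above is arbitrary.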
