Exploring Memorization in Fine-tuned Language Models
Shenglai Zeng, Yaxin Li, Jie Ren, Yiding Liu, Han Xu, Pengfei He, Yue Xing, Shuaiqiang Wang, Jiliang Tang, Dawei Yin

TL;DR
This paper investigates how fine-tuning affects memorization in large language models, revealing task-dependent disparities and linking memorization to attention score distributions, with implications for privacy risks.
Contribution
It provides the first comprehensive analysis of memorization during fine-tuning, highlighting task disparities and theoretical explanations for memorization behaviors in LMs.
Findings
Memorization varies significantly across different fine-tuning tasks.
A strong correlation exists between memorization and attention score distribution.
Task disparity in memorization can be explained by sparse coding theory.
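This summary does not say how the paper quantifies memorization or attention score distribution. As a rough illustration of the two quantities in the findings above, the sketch below assumes that an output counts as memorized when it shares at least one n-gram with its fine-tuning document, and that the spread of an attention distribution is measured by Shannon entropy. Both definitions, the function names, and the default `n=5` are illustrative assumptions, not the authors' method.

```python
import math


def ngram_set(tokens, n):
    """Return the set of all contiguous n-grams in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def is_memorized(output_tokens, training_tokens, n=5):
    """Assumed criterion: the output shares at least one n-gram with the training text."""
    return bool(ngram_set(output_tokens, n) & ngram_set(training_tokens, n))


def memorization_ratio(outputs, training_docs, n=5):
    """Fraction of (output, training document) pairs flagged by is_memorized."""
    flagged = sum(is_memorized(o, t, n) for o, t in zip(outputs, training_docs))
    return flagged / len(outputs)


def attention_entropy(weights):
    """Shannon entropy (in nats) of non-negative attention weights, normalized to sum to 1.

    Low entropy means attention is concentrated on a few positions;
    high entropy means it is spread across many positions.
    """
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log(p) for p in probs)


# Toy usage with hand-made tokens (not data from the paper).
outputs = [["a", "b", "c", "d", "e", "f"], ["x", "y", "z"]]
training = [["q", "a", "b", "c", "d", "e"], ["x", "y", "w"]]
ratio = memorization_ratio(outputs, training, n=5)
uniform = attention_entropy([0.25, 0.25, 0.25, 0.25])
peaked = attention_entropy([1.0, 0.0, 0.0, 0.0])
```

Under these assumptions, a per-task comparison would compute `memorization_ratio` for each fine-tuned model and set it against the average `attention_entropy` of that model's attention rows. A length-aware variant would be needed to address the response-length concern raised in the reviews.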
Abstract
Large language models (LLMs) have shown great capabilities in various tasks but also exhibited memorization of training data, raising tremendous privacy and copyright concerns. While prior works have studied memorization during pre-training, the exploration of memorization during fine-tuning is rather limited. Compared to pre-training, fine-tuning typically involves more sensitive data and diverse objectives, thus may bring distinct privacy risks and unique memorization behaviors. In this work, we conduct the first comprehensive analysis to explore language models' (LMs) memorization during fine-tuning across tasks. Our studies with open-sourced and our own fine-tuned LMs across various tasks indicate that memorization presents a strong disparity among different fine-tuning tasks. We provide an intuitive explanation of this task disparity via sparse coding theory and unveil a strong…
Peer Reviews
Decision: ICLR 2024 Conference Withdrawn Submission
The main strength of the paper lies in its simplistic approach to studying memorization in fine-tuned language models. The authors take insights from existing memorization works on pre-trained models to conduct their analysis. The differences between different tasks showcase different mechanisms that models employ for each of them. Furthermore, the short theory on the sparse coding model helps formalize a reader's intuition for the observations. Overall, I believe the paper is a positive contrib
I have a couple of questions about the experimental setup and the observations that the authors draw from their results. (a) What does x% memorization mean in the experiments? It would be great to demonstrate perfect and no memorization baselines to get a sense of the numbers. (b) How do the authors measure Idea memorization? Furthermore, how do they differentiate Idea memorization from summarization (which is the task of a summarization-tuned model)? (c) How much do hyperparameters during
1. I like the idea of simplifying the concept of memorization by linking it to the extent of information required for a given task, which aligns with the sparse coding model. 2. The paper offers a systematic analysis that spans various tasks, providing valuable insights into memorization patterns across these tasks and comparing memorization across different language models. 3. The examination of attention scores and the presentation of encoded attention maps explains the initial claim well. 4.
1. Need for a more rigorous analysis of task specificity: The authors should consider potential confounding factors, especially the influence of the pre-training data used for the model, as this could impact memorization ratios. For instance, in the final section, the authors show that the T5 base model already has a very high memorization ratio of 110 on the Multi-News summarization task, while the fine-tuned T5 model only reaches 222, about twice as much. However, the en
- Understanding the memorization behavior of LMs is an important problem with ramifications for privacy and copyright considerations. Since fine-tuning plays a key role in creating usable LMs, the paper tackles an important open problem by investigating how this step impacts memorization. - Differences in the fine-tuning task can strongly influence model behavior, so investigating how different fine-tuning objectives influence memorization is important.
- The paper has severe soundness issues, making the key findings rather unreliable. 1. The paper claims that memorization varies based on the fine-tuning task, with summarization and dialog tasks exhibiting higher memorization than QA, translation, and sentiment classification. It is very likely, however, that this result is confounded by the response lengths required for the different tasks. According to Figure 2, models generate on the order of > 120 output tokens for summarization and dia
Taxonomy
Topics: Speech and dialogue systems · Topic Modeling · Natural Language Processing Techniques
