Impact of Fine-Tuning Methods on Memorization in Large Language Models
Jie Hou, Chuxiong Wu, Lannan Luo, Qiang Zeng

TL;DR
This paper investigates how different fine-tuning methods for large language models affect memorization and privacy risks, finding prompt-based fine-tuning to be more privacy-preserving than parameter-based approaches.
Contribution
It categorizes fine-tuning methods and evaluates their memorization risks using membership inference attacks, highlighting the privacy advantages of prompt-based techniques.
Findings
Prompt-based fine-tuning is less vulnerable to membership inference attacks (MIAs) than parameter-based fine-tuning.
Memorization remains low in prompt-based methods across model scales.
Parameter-based fine-tuning is more prone to private data leakage.
Abstract
As the capabilities of pre-trained large language models (LLMs) continue to advance, the "pre-train and fine-tune" paradigm has become increasingly mainstream, leading to the development of various fine-tuning methods. However, the privacy risks arising from memorization during fine-tuning have received relatively little attention. To address this gap, we categorize popular fine-tuning approaches and assess their impact on memorization through the lens of membership inference attacks (MIAs). Our results show that, compared to parameter-based fine-tuning, prompt-based fine-tuning achieves competitive performance while exhibiting lower vulnerability to MIAs. Furthermore, prompt-based methods maintain low memorization regardless of model scale. These findings suggest that parameter-based fine-tuning is more prone to leaking private information, whereas prompt-based fine-tuning serves as a…
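To make "vulnerability to MIAs" concrete, here is a minimal sketch of a generic loss-threshold membership inference attack. This excerpt does not say which attacks the authors ran, so this is an illustrative assumption rather than their method. The function names (`mia_scores`, `attack_auc`) and the sample loss values are hypothetical. The idea is that a fine-tuned model tends to assign lower loss to examples it was trained on. An attack AUC near 0.5 means members and non-members are hard to tell apart, and an AUC near 1.0 indicates strong memorization.

```python
from typing import List, Sequence


def mia_scores(losses: Sequence[float]) -> List[float]:
    """Turn per-example losses into membership scores.

    Assumption: lower loss means the example is more likely a training
    member, so the score is simply the negated loss.
    """
    return [-loss for loss in losses]


def attack_auc(member_losses: Sequence[float],
               nonmember_losses: Sequence[float]) -> float:
    """AUC of the loss-threshold attack.

    Computed as the fraction of (member, non-member) pairs in which the
    member receives the higher membership score; ties count as half.
    """
    members = mia_scores(member_losses)
    nonmembers = mia_scores(nonmember_losses)
    if not members or not nonmembers:
        raise ValueError("both loss lists must be non-empty")

    wins = 0.0
    for m in members:
        for n in nonmembers:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(members) * len(nonmembers))


if __name__ == "__main__":
    # Hypothetical per-example losses from a fine-tuned model.
    train_losses = [0.2, 0.3, 0.5]    # examples seen during fine-tuning
    heldout_losses = [0.4, 0.9, 1.1]  # examples never seen
    print(f"attack AUC: {attack_auc(train_losses, heldout_losses):.3f}")
```

Comparing this AUC across models fine-tuned with different methods on the same data is one simple way to rank them by memorization; a lower AUC suggests less leakage about training membership.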
Taxonomy
Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Machine Learning and Data Classification
