From Teacher to Student: Tracking Memorization Through Model Distillation
Simardeep Singh

TL;DR
This paper investigates how knowledge distillation from large language models into smaller ones affects memorization of fine-tuning data, showing that distillation reduces memorization risk while also lowering computational cost and model size.
Contribution
The paper analyzes memorization under knowledge distillation and finds that distilled student models memorize less of the fine-tuning data than models trained with standard fine-tuning.
Findings
Distillation reduces memorization of training data.
Smaller models retain task performance with less memorization.
Distillation lowers privacy and security risks associated with memorization.
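To make "memorization" concrete, one common way to quantify it is a prefix-continuation check: prompt the model with the start of a training example and see whether it reproduces the rest verbatim. The sketch below is only an illustration under that assumption; the summary does not state which metric the paper uses, and `memorization_rate`, `generate`, and `prefix_len` are hypothetical names introduced here.

```python
from typing import Callable, Sequence


def memorization_rate(
    examples: Sequence[Sequence[int]],
    generate: Callable[[Sequence[int], int], Sequence[int]],
    prefix_len: int = 50,
) -> float:
    """Fraction of training examples whose suffix is reproduced verbatim.

    `examples` are token-id sequences from the fine-tuning set.
    `generate(prefix, n)` is a hypothetical stand-in for the model: it
    returns the n tokens the model would (greedily) emit after `prefix`.
    """
    memorized = 0
    scored = 0
    for tokens in examples:
        # Skip examples too short to split into a prefix and a suffix.
        if len(tokens) <= prefix_len:
            continue
        prefix, suffix = tokens[:prefix_len], tokens[prefix_len:]
        continuation = generate(prefix, len(suffix))
        scored += 1
        # Count the example as memorized only on an exact match.
        if list(continuation) == list(suffix):
            memorized += 1
    return memorized / scored if scored else 0.0
```

Running the same function on a fine-tuned teacher and on its distilled student, over the same examples, would give two rates that can be compared directly.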
Abstract
Large language models (LLMs) are known to memorize parts of their training data, raising important concerns around privacy and security. While previous research has focused on studying memorization in pre-trained models, much less is known about how knowledge distillation (KD) affects memorization. In this study, we explore how different KD methods influence the memorization of fine-tuned task data when a large teacher model is distilled into smaller student variants. This study demonstrates that distilling a larger teacher model, fine-tuned on a dataset, into a smaller variant not only lowers computational costs and model size but also significantly reduces the memorization risks compared to standard fine-tuning approaches.
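For readers unfamiliar with KD, the classic soft-target formulation trains the student to match the teacher's temperature-softened output distribution. The sketch below shows that loss for a single prediction in plain Python. It is an assumption for illustration: the abstract mentions "different KD methods" without naming them, so this is not necessarily any of the objectives studied in the paper, and the function names and default temperature are hypothetical.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float], temperature: float) -> List[float]:
    """Numerically stable log-softmax of logits divided by `temperature`."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_z = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_z for x in scaled]


def kd_loss(
    teacher_logits: Sequence[float],
    student_logits: Sequence[float],
    temperature: float = 2.0,
) -> float:
    """Soft-target distillation loss: T^2 * KL(teacher || student).

    Both distributions are softened with the same temperature. The T^2
    factor is the usual rescaling that keeps gradient magnitudes
    comparable across temperatures.
    """
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return temperature ** 2 * kl
```

The loss is zero when the student's softened distribution equals the teacher's and grows as they diverge. In practice it is often combined with the ordinary cross-entropy on ground-truth labels.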
Taxonomy
Topics: Topic Modeling · Machine Learning and Data Classification · Explainable Artificial Intelligence (XAI)
