Memorization Inheritance in Sequence-Level Knowledge Distillation for Neural Machine Translation
Verna Dankers, Vikas Raunak

TL;DR
This paper investigates how sequence-level knowledge distillation in neural machine translation causes student models to inherit memorization and hallucination tendencies from teacher models, proposing an intervention to mitigate these issues.
Contribution
The paper quantifies how much memorization students inherit from teachers under SeqKD for NMT and introduces Adaptive-SeqKD, a modification of SeqKD that reduces memorization and hallucinations in student models.
Findings
Students memorize more than baseline models despite not seeing original data.
SeqKD amplifies hallucination rates and memorization of low-quality data.
Adaptive-SeqKD reduces memorization and hallucination in student models.
Abstract
In this work, we explore how instance-level memorization in the teacher Neural Machine Translation (NMT) model gets inherited by the student model in sequence-level knowledge distillation (SeqKD). We find that despite not directly seeing the original training data, students memorize more than baseline models (models of the same size, trained on the original data) -- 3.4% for exact matches and 57% for extractive memorization -- and show increased hallucination rates. Further, under this SeqKD setting, we also characterize how students behave on specific training data subgroups, such as subgroups with low quality and specific counterfactual memorization (CM) scores, and find that students exhibit amplified denoising on low-quality subgroups. Finally, we propose a modification to SeqKD named Adaptive-SeqKD, which intervenes in SeqKD to reduce memorization and hallucinations. Overall, we…
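To make the setup in the abstract concrete, here is a minimal sketch of two pieces it relies on: building a SeqKD corpus (the student is trained on teacher translations instead of the original targets) and measuring an exact-match memorization rate against the original targets. The function names, the `translate` callables, and the toy data are all hypothetical illustrations, not the authors' code, and whitespace-stripped string equality is an assumed matching criterion.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def build_seqkd_corpus(pairs: List[Pair],
                       teacher_translate: Callable[[str], str]) -> List[Pair]:
    """Replace each original target with the teacher's translation.

    In SeqKD the student is trained on this corpus, so it never sees
    the original targets directly.
    """
    return [(src, teacher_translate(src)) for src, _ in pairs]


def exact_match_rate(pairs: List[Pair],
                     translate: Callable[[str], str]) -> float:
    """Fraction of pairs whose original target the model reproduces exactly.

    Assumption: matching is string equality after stripping whitespace.
    """
    if not pairs:
        return 0.0
    hits = sum(translate(src).strip() == tgt.strip() for src, tgt in pairs)
    return hits / len(pairs)


# Toy stand-in for a teacher model (hypothetical data).
_teacher_table = {"guten Morgen": "good morning", "danke": "thanks"}


def toy_teacher(src: str) -> str:
    return _teacher_table.get(src, "")


original = [("guten Morgen", "good morning"), ("danke", "thank you")]
seqkd_corpus = build_seqkd_corpus(original, toy_teacher)
teacher_rate = exact_match_rate(original, toy_teacher)
```

Running `exact_match_rate` with the student's and the baseline's translation functions on the same original pairs would give the kind of comparison the abstract reports. Extractive memorization and hallucination rates need additional machinery that is not sketched here.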
Taxonomy
Topics: Natural Language Processing Techniques
