Domain adapted machine translation: What does catastrophic forgetting forget and why?
Danielle Saunders, Steve DeNeefe

TL;DR
This paper investigates what causes catastrophic forgetting when neural machine translation models are adapted to a new domain. It shows that the amount and type of forgetting are linked to the target vocabulary coverage of the in-domain data, a finding that can inform adaptation strategies.
Contribution
It provides a first investigation of what is forgotten, and why, during NMT domain adaptation, focusing on the target vocabulary coverage of the in-domain data.
Findings
The amount and type of forgetting are linked to the target vocabulary coverage of the in-domain data.
Forgetting depends on the in-domain data used for adaptation, not only on the fine-tuning procedure.
These findings can inform better NMT domain adaptation.
Abstract
Neural Machine Translation (NMT) models can be specialized by domain adaptation, often involving fine-tuning on a dataset of interest. This process risks catastrophic forgetting: rapid loss of generic translation quality. Forgetting has been widely observed, with many mitigation methods proposed. However, the causes of forgetting and the relationship between forgetting and adaptation data are under-explored. This paper takes a novel approach to understanding catastrophic forgetting during NMT adaptation by investigating the impact of the data. We provide a first investigation of what is forgotten, and why. We examine the relationship between forgetting and the in-domain data, and show that the amount and type of forgetting is linked to that data's target vocabulary coverage. Our findings pave the way toward better informed NMT domain adaptation.
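To make "target vocabulary coverage" concrete, the sketch below measures how much of a generic target-side vocabulary also appears in the target side of an in-domain dataset. This is an illustration only: the abstract does not give the paper's exact definition, so the whitespace tokenization, the lowercasing, and the two measures (by word type and by token frequency) are all assumptions, and the function names `target_vocab` and `vocab_coverage` are hypothetical.

```python
from collections import Counter


def target_vocab(sentences):
    """Count lowercased, whitespace-separated tokens in target-side sentences.

    Assumption: a real NMT setup would more likely count subword units.
    """
    counts = Counter()
    for sentence in sentences:
        counts.update(sentence.lower().split())
    return counts


def vocab_coverage(generic_targets, in_domain_targets):
    """Return (type_coverage, token_coverage) of the generic target vocabulary.

    type_coverage:  fraction of distinct generic words seen in the in-domain data.
    token_coverage: the same fraction, weighted by generic word frequency.
    Both are illustrative measures, not necessarily the paper's own.
    """
    generic = target_vocab(generic_targets)
    in_domain = target_vocab(in_domain_targets)
    if not generic:
        return 0.0, 0.0
    covered = set(generic) & set(in_domain)
    type_coverage = len(covered) / len(generic)
    token_coverage = sum(generic[w] for w in covered) / sum(generic.values())
    return type_coverage, token_coverage


# Toy data, invented for illustration.
generic_targets = ["the cat sat on the mat", "a dog ran"]
in_domain_targets = ["the patient sat", "a dose"]
type_cov, token_cov = vocab_coverage(generic_targets, in_domain_targets)
```

On this toy data, 3 of the 8 distinct generic words ("the", "sat", "a") occur in the in-domain targets, and they account for 4 of the 9 generic tokens. Under this reading, in-domain data with low coverage leaves most of the generic target vocabulary unseen during fine-tuning.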
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
