Finding Sparse Structures for Domain Specific Neural Machine Translation
Jianze Liang, Chengqi Zhao, Mingxuan Wang, Xipeng Qiu, Lei Li

TL;DR
This paper introduces Prune-Tune, a novel method for domain-specific neural machine translation that learns sparse, disjoint sub-networks during fine-tuning, improving domain adaptation without degrading general performance.
Contribution
Prune-Tune is a new gradual pruning approach that creates multiple domain-specific sub-networks within a single model for effective multi-domain adaptation.
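To make "gradual pruning" concrete, here is a minimal NumPy sketch of one common way to do it: sparsity is ramped up over several steps, and at each step the smallest-magnitude weights are zeroed. The magnitude criterion, the cubic schedule, and the names `sparsity_at` and `magnitude_mask` are assumptions chosen for illustration; this summary does not specify the paper's actual criterion or schedule.

```python
import numpy as np

def sparsity_at(step, total_steps, final_sparsity):
    """Cubic ramp from 0 to final_sparsity (an assumed schedule, for illustration)."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return final_sparsity * (1.0 - (1.0 - frac) ** 3)

def magnitude_mask(weights, sparsity):
    """Boolean mask that drops the `sparsity` fraction of smallest-magnitude weights."""
    n_prune = int(round(sparsity * weights.size))
    mask = np.ones(weights.size, dtype=bool)
    if n_prune > 0:
        order = np.argsort(np.abs(weights), axis=None)  # ascending magnitude
        mask[order[:n_prune]] = False
    return mask.reshape(weights.shape)

rng = np.random.default_rng(0)
weights = rng.normal(size=(8, 8))  # stand-in for one weight matrix
total_steps = 10

for step in range(1, total_steps + 1):
    s = sparsity_at(step, total_steps, final_sparsity=0.5)
    mask = magnitude_mask(weights, s)
    weights = weights * mask  # pruned weights are set to zero
    # A real run would take training steps on the surviving weights here.
```

Because already-zeroed weights have the smallest magnitude, they stay pruned as the target sparsity grows, so the kept set only shrinks from step to step.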
Findings
Outperforms several strong baselines on target-domain test sets
Maintains general domain performance in single and multi-domain settings
Learns multiple disjoint sub-networks for different domains
Abstract
Neural machine translation often adopts the fine-tuning approach to adapt to specific domains. However, unrestricted fine-tuning can easily degrade performance on the general domain and over-fit to the target domain. To mitigate the issue, we propose Prune-Tune, a novel domain adaptation method via gradual pruning. It learns tiny domain-specific sub-networks during fine-tuning on new domains. Prune-Tune alleviates the over-fitting and the degradation problem without model modification. Furthermore, Prune-Tune is able to sequentially learn a single network with multiple disjoint domain-specific sub-networks for multiple domains. Experimental results show that Prune-Tune outperforms several strong competitors on the target-domain test set without sacrificing quality on the general domain in both single and multi-domain settings. The source code and data are available at…
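The idea of sequentially learning disjoint domain-specific sub-networks inside one model can be sketched with boolean masks. The sketch below is an illustration under stated assumptions, not the authors' implementation: the helper `learn_domain_subnetwork` is hypothetical, random arrays stand in for real gradients, and keeping the largest-magnitude free weights is an assumed selection rule. What it does show is the invariant the abstract describes: parameters owned by the general network or by an earlier domain are never updated, so later domains cannot degrade them.

```python
import numpy as np

def learn_domain_subnetwork(weights, free, grad, keep_fraction, lr=0.1):
    """Hypothetical helper: update only free parameters, then claim the
    largest-magnitude ones as this domain's sub-network."""
    # Frozen (non-free) parameters receive a zero update.
    weights = weights - np.where(free, lr * grad, 0.0)
    idx = np.flatnonzero(free)
    n_keep = int(keep_fraction * idx.size)
    magnitudes = np.abs(weights).ravel()[idx]
    keep = idx[np.argsort(magnitudes)[idx.size - n_keep:]]  # n_keep largest
    domain_mask = np.zeros(weights.size, dtype=bool)
    domain_mask[keep] = True
    domain_mask = domain_mask.reshape(weights.shape)
    still_free = free & ~domain_mask
    # Unclaimed parameters are reset to zero and remain available.
    weights = np.where(still_free, 0.0, weights)
    return weights, domain_mask, still_free

rng = np.random.default_rng(0)
weights = rng.normal(size=(6, 6))  # stand-in for one weight matrix

# Assume pruning kept the larger-magnitude half as the general sub-network.
general_mask = np.abs(weights) > np.median(np.abs(weights))
weights = np.where(general_mask, weights, 0.0)
general_weights = weights.copy()
free = ~general_mask

# Domain A, then domain B, each with placeholder gradients.
weights, mask_a, free = learn_domain_subnetwork(
    weights, free, rng.normal(size=weights.shape), keep_fraction=0.5)
weights_after_a = weights.copy()
weights, mask_b, free = learn_domain_subnetwork(
    weights, free, rng.normal(size=weights.shape), keep_fraction=0.5)

# Translating domain-A text would use general plus domain-A parameters only.
domain_a_model = np.where(general_mask | mask_a, weights, 0.0)
```

Under these assumptions the general model is recovered by applying `general_mask` alone, and each domain model adds only its own mask on top of it, which is why adapting to domain B leaves both the general and domain-A behavior untouched.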
Taxonomy
Topics: Natural Language Processing Techniques
