Code-Switching for Enhancing NMT with Pre-Specified Translation
Kai Song, Yue Zhang, Heng Yu, Weihua Luo, Kun Wang, Min Zhang

TL;DR
This paper introduces a data augmentation technique for neural machine translation that uses code-switched training data to improve translation of constrained words, outperforming existing methods without modifying the model or decoding process.
Contribution
The proposed method creates code-switched training data by replacing source phrases with target translations, enhancing NMT performance on constrained words without affecting overall translation quality.
Findings
Consistent improvements over existing constrained translation methods.
Enhanced translation of constrained words without degrading unconstrained word translation.
Method does not require changes to the NMT model or decoding algorithm.
Abstract
Leveraging user-provided translation to constrain NMT has practical significance. Existing methods can be classified into two main categories, namely the use of placeholder tags for lexicon words and the use of hard constraints during decoding. Both methods can hurt translation fidelity for various reasons. We investigate a data augmentation method, making code-switched training data by replacing source phrases with their target translations. Our method does not change the NMT model or decoding algorithm, allowing the model to learn lexicon translations by copying source-side target words. Extensive experiments show that our method achieves consistent improvements over existing approaches, improving translation of constrained words without hurting unconstrained words.
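The augmentation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `code_switch`, the dictionary-style `lexicon`, the `replace_prob` parameter, and the longest-match strategy are all assumptions made for illustration, and the summary does not say how the paper actually selects phrase pairs. The sketch only shows the core idea of replacing source phrases with their target translations while leaving the target sentence unchanged.

```python
import random
from typing import Dict, List, Optional, Tuple

# Hypothetical lexicon type: source phrase (token tuple) -> target tokens.
Lexicon = Dict[Tuple[str, ...], List[str]]


def code_switch(
    source_tokens: List[str],
    lexicon: Lexicon,
    replace_prob: float = 1.0,
    rng: Optional[random.Random] = None,
    max_phrase_len: int = 3,
) -> List[str]:
    """Return a code-switched copy of the source sentence.

    Source phrases found in `lexicon` are replaced by their target-side
    translation with probability `replace_prob`. Longest phrases are tried
    first (an assumption of this sketch, not a detail from the paper).
    """
    rng = rng or random.Random(0)
    out: List[str] = []
    i = 0
    while i < len(source_tokens):
        matched = False
        longest = min(max_phrase_len, len(source_tokens) - i)
        for n in range(longest, 0, -1):
            phrase = tuple(source_tokens[i:i + n])
            if phrase in lexicon and rng.random() < replace_prob:
                out.extend(lexicon[phrase])  # copy target words into source
                i += n
                matched = True
                break
        if not matched:
            out.append(source_tokens[i])
            i += 1
    return out


# Illustrative German-English pair; the target side stays as it is.
source = ["der", "Hund", "bellt"]
target = ["the", "dog", "barks"]
lexicon = {("Hund",): ["dog"]}
augmented_pair = (code_switch(source, lexicon), target)
```

Training on pairs like `augmented_pair` alongside the original data is what would let an unmodified model learn to copy source-side target words; at inference time, a user-specified translation would be placed in the source sentence in the same way.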
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
