Sub-Word Alignment Is Still Useful: A Vest-Pocket Method for Enhancing Low-Resource Machine Translation
Minhan Xu, Yu Hong

TL;DR
This paper introduces a simple yet effective sub-word alignment method that improves low-resource machine translation by duplicating embeddings between aligned sub-words, yielding substantial BLEU improvements and a 63.8% reduction in training time.
Contribution
It extends parent-child transfer learning with a novel sub-word alignment technique, achieving better performance and efficiency in low-resource translation tasks.
Findings
Achieved BLEU scores of 22.5, 28.0, and 18.1 on My-En, Id-En, and Tr-En, respectively.
Reduced training time by 63.8%, to 1.6 hours on a 16GB Tesla P100 GPU.
The method is computationally efficient; the authors state that all models and source code will be made publicly available.
Abstract
We leverage embedding duplication between aligned sub-words to extend the Parent-Child transfer learning method and thereby improve low-resource machine translation. We conduct experiments on benchmark datasets for the My-En, Id-En and Tr-En translation scenarios. The test results show that our method produces substantial improvements, achieving BLEU scores of 22.5, 28.0 and 18.1, respectively. In addition, the method is computationally efficient: it reduces training time by 63.8%, to 1.6 hours when training on a 16GB Tesla P100 GPU. All the models and source code from the experiments will be made publicly available to support reproducible research.
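To make the idea of embedding duplication concrete, here is a minimal sketch in Python, not the authors' implementation. It assumes the alignment is already available as a mapping from child sub-words to parent sub-words; the abstract does not say how that alignment is computed. The function name `init_child_embeddings`, the toy vocabularies, and the small random initialisation for unaligned sub-words are all hypothetical choices made for illustration.

```python
import numpy as np


def init_child_embeddings(parent_emb, parent_vocab, child_vocab, alignment, seed=0):
    """Build a child embedding matrix, copying parent rows for aligned sub-words.

    parent_emb   : (|V_parent|, dim) array of trained parent embeddings
    parent_vocab : dict mapping parent sub-word -> row index in parent_emb
    child_vocab  : dict mapping child sub-word -> row index in the new matrix
    alignment    : dict mapping child sub-word -> aligned parent sub-word
                   (assumed to be given; how it is obtained is not shown here)
    """
    rng = np.random.default_rng(seed)
    dim = parent_emb.shape[1]
    # Assumption: unaligned sub-words start from a small random initialisation.
    child_emb = rng.normal(0.0, 0.01, size=(len(child_vocab), dim))
    copied = []
    for sub_word, child_idx in child_vocab.items():
        parent_sub_word = alignment.get(sub_word)
        if parent_sub_word in parent_vocab:
            # Duplicate the trained parent embedding into the child matrix.
            child_emb[child_idx] = parent_emb[parent_vocab[parent_sub_word]]
            copied.append(sub_word)
    return child_emb, copied


# Toy data, hypothetical and for illustration only.
parent_vocab = {"the": 0, "ing": 1, "haus": 2}
parent_emb = np.arange(12, dtype=float).reshape(3, 4)
child_vocab = {"ev": 0, "ing": 1, "ve": 2}
alignment = {"ev": "haus", "ing": "ing"}

child_emb, copied = init_child_embeddings(parent_emb, parent_vocab, child_vocab, alignment)
print(sorted(copied))
```

In this sketch the child model would then be fine-tuned from `child_emb` rather than from a fully random embedding table, which is one plausible reading of how duplication could shorten training; the reported 63.8% reduction comes from the paper, not from this example.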
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
