Machine Translation between Vietnamese and English: an Empirical Study
Hong-Hai Phan-Vu, Viet-Trung Tran, Van-Nam Nguyen, Hoang-Vu Dang, Phan-Thuan Do

TL;DR
This paper presents efforts to improve English-Vietnamese machine translation by building a large corpus and experimenting with neural models to enhance translation quality in a low-resource setting.
Contribution
It introduces the largest open Vietnamese-English corpus and evaluates neural translation models to address low-resource challenges.
Findings
Largest Vietnamese-English corpus created to date
Neural models can be effectively employed for low-resource translation
Achieved high BLEU scores with extensive experiments
Abstract
Machine translation is shifting to an end-to-end approach based on deep neural networks. The state of the art achieves impressive results for popular language pairs such as English-French or English-Chinese. However, for English-Vietnamese, the shortage of parallel corpora and expensive hyper-parameter search present practical challenges to neural-based approaches. This paper highlights our efforts on improving English-Vietnamese translations in two directions: (1) building the largest open Vietnamese-English corpus to date, and (2) extensive experiments with the latest neural models to achieve the highest BLEU scores. Our experiments provide practical examples of effectively employing different neural machine translation models with low-resource language pairs.
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
