English-Japanese Neural Machine Translation with Encoder-Decoder-Reconstructor
Yukio Matsumura, Takayuki Sato, Mamoru Komachi

TL;DR
This paper enhances neural machine translation for English-Japanese by implementing an encoder-decoder-reconstructor framework, which reduces errors like word repetition and omission, and evaluates the impact of pre-training versus joint training.
Contribution
It demonstrates the effectiveness of the encoder-decoder-reconstructor approach in improving translation quality and compares pre-training with joint training for NMT.
Findings
Re-implementation confirms reduction of word errors in English-Japanese translation.
The approach significantly improves BLEU scores.
Pre-training shows advantages over joint training.
Abstract
Neural machine translation (NMT) has recently become popular in the field of machine translation. However, NMT suffers from the problem of repeating or missing words in the translation. To address this problem, Tu et al. (2017) proposed an encoder-decoder-reconstructor framework for NMT using back-translation. In this method, they selected the best forward translation model in the same manner as Bahdanau et al. (2015), and then trained a bi-directional translation model as fine-tuning. Their experiments show that it offers significant improvement in BLEU scores in Chinese-English translation task. We confirm that our re-implementation also shows the same tendency and alleviates the problem of repeating and missing words in the translation on a English-Japanese task too. In addition, we evaluate the effectiveness of pre-training by comparing it with a jointly-trained model of forward…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsNatural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
