End-to-End Training for Back-Translation with Categorical Reparameterization Trick
DongNyeong Heo, Heeyoul Choi

TL;DR
This paper introduces the categorical reparameterization trick (CRT) to enable end-to-end training of back-translation in neural machine translation, improving BLEU scores over previous methods.
Contribution
The paper proposes CRT, a novel differentiable sampling method for NMT, allowing end-to-end back-translation training with variational auto-encoder frameworks.
Findings
CRT outperforms Gumbel-softmax in experiments
End-to-end training improves BLEU scores
Effective across multiple WMT datasets
Abstract
Back-translation (BT) is an effective semi-supervised learning framework in neural machine translation (NMT). A pre-trained NMT model translates monolingual sentences to produce synthetic bilingual sentence pairs for training the other NMT model, and vice versa. Previous works treated the two NMT models as inference and generation models, respectively, and applied the training method of the variational auto-encoder (VAE), a mainstream framework for generative models. However, the discrete nature of translated sentences prevents gradient information from flowing between the two NMT models. In this paper, we propose the categorical reparameterization trick (CRT), which makes NMT models generate differentiable sentences so that the VAE's training framework can work in an end-to-end fashion. Our BT experiment conducted on a WMT benchmark dataset demonstrates the…
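To make the gradient-flow problem concrete, here is a minimal sketch of the Gumbel-softmax relaxation, the baseline the findings compare CRT against. It is not the paper's CRT, whose details are not given on this page. The function name `gumbel_softmax_sample`, the example logits, and the temperature value are illustrative assumptions. The sketch uses only the Python standard library and shows the forward pass: a discrete token index (non-differentiable) next to a relaxed probability vector that an autodiff framework could backpropagate through.

```python
import math
import random


def gumbel_softmax_sample(logits, tau, rng):
    """Draw one categorical sample two ways (illustrative helper, not from the paper).

    Returns (relaxed, hard_index):
      relaxed    - probability vector from a temperature softmax over noisy logits;
                   in an autodiff framework this is differentiable w.r.t. logits.
      hard_index - the discrete Gumbel-max token, which blocks gradients.
    """
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise: g = -log(-log(u)), with u ~ Uniform(0, 1).
        # Clamp u away from 0 and 1 to keep both logarithms finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append(logit - math.log(-math.log(u)))

    # Discrete choice: argmax over the noisy logits (Gumbel-max trick).
    hard_index = max(range(len(noisy)), key=noisy.__getitem__)

    # Continuous relaxation: numerically stable softmax with temperature tau.
    scaled = [x / tau for x in noisy]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    relaxed = [e / total for e in exps]
    return relaxed, hard_index


rng = random.Random(0)            # fixed seed, assumption for repeatability
logits = [2.0, 0.5, -1.0, 0.1]    # hypothetical scores over a 4-token vocabulary
soft, token = gumbel_softmax_sample(logits, tau=0.5, rng=rng)
```

Lower values of `tau` push `soft` closer to a one-hot vector at `token`, at the cost of higher-variance gradients in a real training setup. That trade-off is a general property of this relaxation, not a result reported on this page.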
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
