End-to-End Training for Back-Translation with Categorical Reparameterization Trick

DongNyeong Heo; Heeyoul Choi

arXiv:2202.08465·cs.CL·July 2, 2024

TL;DR

This paper introduces the categorical reparameterization trick (CRT) to enable end-to-end training of back-translation in neural machine translation, improving BLEU scores over previous methods.

Contribution

The paper proposes CRT, a novel differentiable sampling method for NMT, allowing end-to-end back-translation training with variational auto-encoder frameworks.

Findings

01 CRT outperforms Gumbel-softmax in experiments

02 End-to-end training improves BLEU scores

03 Effective across multiple WMT datasets

Abstract

Back-translation (BT) is an effective semi-supervised learning framework in neural machine translation (NMT). A pre-trained NMT model translates monolingual sentences to create synthetic bilingual sentence pairs for training the other NMT model, and vice versa. Previous works treated the two NMT models as inference and generation models, respectively, and applied the training method of the variational auto-encoder (VAE), a mainstream framework for generative models. However, the discrete nature of translated sentences prevents gradient information from flowing between the two NMT models. In this paper, we propose the categorical reparameterization trick (CRT), which makes NMT models generate differentiable sentences so that the VAE's training framework can work in an end-to-end fashion. Our BT experiment conducted on a WMT benchmark dataset demonstrates the…
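To make the gradient problem concrete, here is a minimal sketch of the Gumbel-softmax relaxation, the baseline that the findings above compare CRT against. It is not the paper's CRT, whose definition this page does not give. The function name `gumbel_softmax_sample`, the four-token vocabulary, and the logit values are all hypothetical, and the sketch uses only the Python standard library, so it shows the relaxed sample itself rather than an actual backward pass.

```python
import math
import random


def gumbel_softmax_sample(logits, tau, rng):
    """Return a relaxed (soft) one-hot sample over a vocabulary.

    Hypothetical helper for illustration only; not taken from the paper's code.
    """
    # Gumbel(0, 1) noise: g = -log(-log(u)) with u drawn from (0, 1).
    noise = [
        -math.log(-math.log(rng.uniform(1e-12, 1.0 - 1e-12)))
        for _ in logits
    ]
    # Perturb the logits and divide by the temperature tau.
    scores = [(logit + g) / tau for logit, g in zip(logits, noise)]
    # Numerically stable softmax over the perturbed scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Assumed decoder scores for a toy 4-token vocabulary.
logits = [2.0, 0.5, -1.0, 0.1]

# Same seed, so both calls draw identical Gumbel noise.
soft = gumbel_softmax_sample(logits, tau=1.0, rng=random.Random(0))
sharp = gumbel_softmax_sample(logits, tau=0.05, rng=random.Random(0))

print("tau=1.0 :", [round(p, 3) for p in soft])
print("tau=0.05:", [round(p, 3) for p in sharp])
```

Taking the argmax of the perturbed scores would give a discrete token, which has no useful gradient with respect to the logits. The softmax output is instead a smooth function of the logits, so a downstream model that consumes it can pass gradients back to the model that produced it. Lowering `tau` pushes the soft vector toward one-hot at the cost of sharper, higher-variance gradients.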

Code & Models

Repositories

nunpuking/end-to-end-backtranslation (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications