Auto-Encoding Variational Neural Machine Translation

Bryan Eikema; Wilker Aziz

arXiv:1807.10564·cs.CL·June 3, 2019

TL;DR

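This paper introduces a deep generative model of bilingual sentence pairs that generates source and target sentences jointly from a shared latent representation. Trained with amortised variational inference, it outperforms standard conditional neural machine translation with in-domain data, with mixed-domain data, and with a mix of gold-standard and synthetic data.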

Contribution

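The paper presents a joint model of source and target sentences for neural machine translation, trained efficiently with amortised variational inference and reparameterised gradients, together with an efficient approximation to maximum a posteriori decoding for fast test-time predictions.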

Findings

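01 The joint model outperforms conditional models (standard neural machine translation) in all three tested scenarios.

02 Amortised variational inference with reparameterised gradients makes training efficient.

03 An efficient approximation to maximum a posteriori decoding gives fast test-time predictions.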

Abstract

We present a deep generative model of bilingual sentence pairs for machine translation. The model generates source and target sentences jointly from a shared latent representation and is parameterised by neural networks. We perform efficient training using amortised variational inference and reparameterised gradients. Additionally, we discuss the statistical implications of joint modelling and propose an efficient approximation to maximum a posteriori decoding for fast test-time predictions. We demonstrate the effectiveness of our model in three machine translation scenarios: in-domain training, mixed-domain training, and learning from a mix of gold-standard and synthetic data. Our experiments show consistently that our joint formulation outperforms conditional modelling (i.e. standard neural machine translation) in all such scenarios.
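
The training idea named in the abstract (amortised variational inference with reparameterised gradients) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a diagonal Gaussian approximate posterior and a standard normal prior over the latent representation, and the names `reparameterised_sample`, `kl_to_standard_normal`, `elbo_estimate`, and `log_lik_fn` are hypothetical. Here `log_lik_fn` stands in for whatever neural networks would score the source and target sentences given a latent sample.

```python
import math
import random


def reparameterised_sample(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I).

    Writing the sample as a deterministic function of (mu, log_var) and
    independent noise is what lets gradients flow to the posterior
    parameters in an autodiff framework.
    """
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def elbo_estimate(log_lik_fn, mu, log_var, rng):
    """Single-sample estimate of the evidence lower bound.

    log_lik_fn(z) is a hypothetical stand-in for the joint log-likelihood
    of a sentence pair given z; in a real system it would be computed by
    neural decoders.
    """
    z = reparameterised_sample(mu, log_var, rng)
    return log_lik_fn(z) - kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy log-likelihood (an assumption for illustration only).
    toy_log_lik = lambda z: -0.5 * sum((zi - 1.0) ** 2 for zi in z)
    print(elbo_estimate(toy_log_lik, [0.5, 0.5], [-1.0, -1.0], rng))
```

In an amortised setup, `mu` and `log_var` would be predicted by an inference network from the observed sentence(s) rather than optimised separately for each training pair.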

Code & Models

Repositories

Roxot/AEVNMT (tf, Official)
