Conditional Variational Autoencoder for Neural Machine Translation

Artidoro Pagnoni; Kevin Liu; Shangyan Li

arXiv:1812.04405 · cs.CL · December 12, 2018 · 42 cites

TL;DR

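This paper introduces a conditional variational autoencoder for neural machine translation that uses a continuous latent variable to model features of the translation process. The model improves on both an attention-based baseline and a variational baseline, the authors test ways to mitigate posterior collapse, and they explore what the learned latent space captures.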

Contribution

The paper presents what it describes as the first conditional variational model for text that makes effective use of the latent variable without compromising translation performance.

Findings

01. Improved translation performance over both the discriminative attention-based baseline and the variational baseline of Zhang et al.

02. Mitigation strategies for posterior collapse in latent variable models for text.

03. A latent space that captures meaningful features of the translation process.

Abstract

We explore the performance of latent variable models for conditional text generation in the context of neural machine translation (NMT). Similar to Zhang et al., we augment the encoder-decoder NMT paradigm by introducing a continuous latent variable to model features of the translation process. We extend this model with a co-attention mechanism motivated by Parikh et al. in the inference network. Compared to the vision domain, latent variable models for text face additional challenges due to the discrete nature of language, namely posterior collapse. We experiment with different approaches to mitigate this issue. We show that our conditional variational model improves upon both discriminative attention-based translation and the variational baseline presented in Zhang et al. Finally, we present some exploration of the learned latent space to illustrate what the latent variable is capable…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis