Conditional Variational Autoencoder for Neural Machine Translation
Artidoro Pagnoni, Kevin Liu, Shangyan Li

TL;DR
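This paper introduces a conditional variational autoencoder for neural machine translation that adds a continuous latent variable to the encoder-decoder model, experiments with ways to mitigate posterior collapse, and reports improved translation quality over both an attention-based baseline and a variational baseline, along with an exploration of the learned latent space.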
Contribution
It presents the first conditional variational model for text that effectively utilizes latent variables without compromising translation performance.
Findings
Improved translation performance over baseline models.
Mitigation strategies for posterior collapse in text latent variable models.
Meaningful capture of translation features in the latent space.
Abstract
We explore the performance of latent variable models for conditional text generation in the context of neural machine translation (NMT). Similar to Zhang et al., we augment the encoder-decoder NMT paradigm by introducing a continuous latent variable to model features of the translation process. We extend this model with a co-attention mechanism motivated by Parikh et al. in the inference network. Compared to the vision domain, latent variable models for text face additional challenges due to the discrete nature of language, namely posterior collapse. We experiment with different approaches to mitigate this issue. We show that our conditional variational model improves upon both discriminative attention-based translation and the variational baseline presented in Zhang et al. Finally, we present some exploration of the learned latent space to illustrate what the latent variable is capable…
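As a rough illustration of the objective such a model optimizes, the sketch below computes the KL term and the negative evidence lower bound for a conditional VAE. It assumes diagonal Gaussian distributions for both the inference network q(z | x, y) and the prior p(z | x), which is a common choice but is not stated in the abstract. The function names and the `kl_weight` parameter are hypothetical. Scaling the KL term is one widely used heuristic against posterior collapse, and it is shown here only as an example, not as the authors' chosen method.

```python
import math


def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """KL(q || p) between two diagonal Gaussians, summed over latent dimensions.

    Each argument is a list of floats with one entry per latent dimension.
    Variances are passed as log-variances, as is usual in VAE code.
    """
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        var_q = math.exp(lq)
        var_p = math.exp(lp)
        total += 0.5 * (lp - lq + (var_q + (mq - mp) ** 2) / var_p - 1.0)
    return total


def negative_elbo(recon_log_likelihood, kl, kl_weight=1.0):
    """Loss to minimize: -log p(y | x, z) + kl_weight * KL(q(z | x, y) || p(z | x)).

    kl_weight is a hypothetical knob (assumption): ramping it up from 0 to 1
    during training is one common way to discourage posterior collapse.
    """
    return -recon_log_likelihood + kl_weight * kl


# Posterior collapse corresponds to the KL term being driven to zero:
# the posterior matches the prior, so z carries no information about y.
collapsed = gaussian_kl([0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
informative = gaussian_kl([1.0], [0.0], [0.0], [0.0])
loss = negative_elbo(recon_log_likelihood=-10.0, kl=informative)
```

In this toy setup a collapsed posterior gives a KL of zero, while shifting the posterior mean away from the prior gives a positive KL that is added to the reconstruction term.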
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
