Target Conditioning for One-to-Many Generation
Marie-Anne Lachaux, Armand Joulin, Guillaume Lample

TL;DR
This paper introduces a method to improve diversity in neural machine translation by conditioning the decoder on a learned latent domain variable, enabling scalable and diverse translation outputs.
Contribution
It proposes a target domain conditioning approach with a jointly trained target encoder, allowing scalable diversity without impacting performance or training time.
Findings
Enhanced translation diversity demonstrated across multiple datasets.
Scalable to any number of target domains without performance loss.
Outperforms previous methods in diversity metrics.
Abstract
Neural Machine Translation (NMT) models often lack diversity in their generated translations, even when paired with a search algorithm such as beam search. A challenge is that the diversity in translations is caused by the variability in the target language and cannot be inferred from the source sentence alone. In this paper, we propose to explicitly model this one-to-many mapping by conditioning the decoder of an NMT model on a latent variable that represents the domain of target sentences. The domain is a discrete variable generated by a target encoder that is jointly trained with the NMT model. The predicted domain of each target sentence is given as input to the decoder during training. At inference, we can generate diverse translations by decoding with different domains. Unlike our strongest baseline (Shen et al., 2019), our method can scale to any number of domains without affecting the…
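The data flow described in the abstract can be illustrated with a minimal sketch. It is a toy under stated assumptions, not the authors' implementation: `NUM_DOMAINS`, `predict_domain`, `make_decoder_input`, `training_example`, and `diverse_inputs` are hypothetical names, the target encoder is replaced by a trivial length-based rule, and prepending a domain tag is only one possible way to condition a decoder.

```python
NUM_DOMAINS = 3  # hypothetical number of discrete domains


def predict_domain(target_tokens, num_domains=NUM_DOMAINS):
    """Toy stand-in for the target encoder.

    The paper's target encoder is a trained network; here a sentence is
    mapped to a discrete domain id by its length, purely for illustration.
    """
    return len(target_tokens) % num_domains


def make_decoder_input(source_tokens, domain):
    """Condition on a domain by prepending a tag token (an assumption)."""
    return [f"<domain_{domain}>"] + list(source_tokens)


def training_example(source_tokens, target_tokens):
    """Training: the domain is predicted from the target sentence."""
    domain = predict_domain(target_tokens)
    return make_decoder_input(source_tokens, domain), list(target_tokens)


def diverse_inputs(source_tokens, num_domains=NUM_DOMAINS):
    """Inference: no target is available, so try every domain once."""
    return [make_decoder_input(source_tokens, d) for d in range(num_domains)]


if __name__ == "__main__":
    print(training_example(["le", "chat"], ["the", "cat"]))
    for conditioned in diverse_inputs(["le", "chat"]):
        print(conditioned)
```

In this sketch, each conditioned input would be passed to the same translation model, giving one candidate translation per domain.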
