Generating Diverse Translation from Model Distribution with Dropout

Xuanfu Wu; Yang Feng; Chenze Shao

arXiv:2010.08178 · cs.CL · October 19, 2020

TL;DR

This paper introduces a Bayesian dropout-based method for neural machine translation that generates diverse translations by sampling from a distribution of models, improving diversity without sacrificing accuracy.

Contribution

It proposes a novel approach using concrete dropout and variational inference to derive multiple models for diverse translation generation in NMT.

Findings

01 Enhanced diversity in translation outputs

02 Maintained or improved translation accuracy

03 Effective trade-off between diversity and quality

Abstract

Despite improvements in translation quality, neural machine translation (NMT) often suffers from a lack of diversity in its generation. In this paper, we propose to generate diverse translations by deriving a large number of possible models with Bayesian modeling and sampling models from them for inference. The possible models are obtained by applying concrete dropout to the NMT model, and each of them has a specific confidence for its prediction, which corresponds to a posterior model distribution under specific training data in the principle of Bayesian modeling. With variational inference, the posterior model distribution can be approximated with a variational distribution, from which the final models for inference are sampled. We conducted experiments on Chinese-English and English-German translation tasks, and the results show that our method makes a better trade-off between…
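The idea of sampling several models from one dropout-equipped network can be illustrated with a small sketch. This is not the authors' implementation: the toy linear scorer, the function names `concrete_dropout_mask` and `sample_model_outputs`, and all parameter values are assumptions made for illustration. It uses the standard concrete (relaxed Bernoulli) form of a dropout mask and draws one mask per sampled model, so each sampled model may pick a different output.

```python
import math
import random


def _sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def concrete_dropout_mask(n, drop_prob, temperature, rng):
    """Sample a relaxed keep-mask of length n with values in [0, 1].

    Concrete relaxation of a Bernoulli drop variable:
    z = sigmoid((logit(p) + logit(u)) / t), keep = 1 - z, u ~ Uniform(0, 1).
    """
    eps = 1e-7
    logit_p = math.log(drop_prob + eps) - math.log(1.0 - drop_prob + eps)
    mask = []
    for _ in range(n):
        u = rng.random()
        logit_u = math.log(u + eps) - math.log(1.0 - u + eps)
        z = _sigmoid((logit_p + logit_u) / temperature)
        mask.append(1.0 - z)
    return mask


def sample_model_outputs(weights, features, drop_prob, num_models,
                         seed=0, temperature=0.1):
    """Draw num_models dropout masks and return each model's argmax index.

    `weights` is a hypothetical toy scorer (one row per candidate token);
    a real NMT model would apply masks inside its layers instead.
    """
    rng = random.Random(seed)
    scale = 1.0 / (1.0 - drop_prob)  # usual dropout rescaling
    outputs = []
    for _ in range(num_models):
        mask = concrete_dropout_mask(len(features), drop_prob, temperature, rng)
        dropped = [f * m * scale for f, m in zip(features, mask)]
        scores = [sum(w * x for w, x in zip(row, dropped)) for row in weights]
        outputs.append(max(range(len(scores)), key=scores.__getitem__))
    return outputs


if __name__ == "__main__":
    toy_weights = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5]]
    print(sample_model_outputs(toy_weights, [1.0, 1.0, 1.0], 0.3, 5, seed=1))
```

In this sketch the drop probability is a fixed argument. In concrete dropout it is a learned parameter, which is what the continuous relaxation makes possible; that training step is omitted here.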

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications

Methods: Concrete Dropout · Dropout