Moment Matching Training for Neural Machine Translation: A Preliminary Study

Cong Duy Vu Hoang; Ioan Calapodescu; Marc Dymetman

arXiv:1812.09836·cs.CL·December 31, 2018·1 cites

TL;DR

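This paper introduces a moment matching training framework for neural machine translation. It combines standard local cross-entropy training with a training mode that encourages the expectations of predefined features under the model distribution to equal those under the empirical distribution, and it reports promising initial results.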

Contribution

It proposes moment matching as a principled way to incorporate prior knowledge into neural machine translation training, and compares it with a formally related approach: policy gradient training in reinforcement learning.

Findings

01

Derives unbiased estimates of the stochastic gradients central to moment matching training

02

Encourages the model's feature expectations to match those of the empirical distribution

03

Initial experiments show promising results

Abstract

In previous works, neural sequence models have been shown to improve significantly if external prior knowledge can be provided, for instance by allowing the model to access the embeddings of explicit features during both training and inference. In this work, we propose a different point of view on how to incorporate prior knowledge in a principled way, using a moment matching framework. In this approach, the standard local cross-entropy training of the sequential model is combined with a moment matching training mode that encourages the equality of the expectations of certain predefined features between the model distribution and the empirical distribution. In particular, we show how to derive unbiased estimates of some stochastic gradients that are central to the training, and compare our framework with a formally related one: policy gradient training in reinforcement learning,…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning