Moment Matching Training for Neural Machine Translation: A Preliminary Study
Cong Duy Vu Hoang, Ioan Calapodescu, Marc Dymetman

TL;DR
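This paper introduces a moment matching training framework for neural machine translation. It combines standard cross-entropy training with a training mode that encourages the expectations of predefined features under the model distribution to match those under the empirical distribution, and it reports promising initial results.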
Contribution
It proposes a novel moment matching approach for incorporating prior knowledge into neural machine translation training in a principled way, and compares it with the formally related policy gradient training used in reinforcement learning.
Findings
Unbiased estimates of the stochastic gradients central to training are derived
Framework effectively aligns feature expectations
Initial results show promising improvements
Abstract
In previous works, neural sequence models have been shown to improve significantly if external prior knowledge can be provided, for instance by allowing the model to access the embeddings of explicit features during both training and inference. In this work, we propose a different point of view on how to incorporate prior knowledge in a principled way, using a moment matching framework. In this approach, the standard local cross-entropy training of the sequential model is combined with a moment matching training mode that encourages the equality of the expectations of certain predefined features between the model distribution and the empirical distribution. In particular, we show how to derive unbiased estimates of some stochastic gradients that are central to the training, and compare our framework with a formally related one: policy gradient training in reinforcement learning,…
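The moment matching idea in the abstract can be illustrated on a toy problem. The sketch below is not the paper's implementation: it assumes a softmax distribution over four candidate outputs rather than a sequence model, a squared-distance objective between model and target feature means, and hypothetical names (`softmax`, `model_moments`, `loss`, `exact_grad`, `sampled_grad`, `FEATS`, `TARGET`, `LOGITS`). It shows one way to obtain an unbiased stochastic gradient: because the gradient is a product of two expectations, the moment mismatch and the score-function term are estimated from two independent batches of samples.

```python
import math
import random


def softmax(logits):
    top = max(logits)
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def model_moments(probs, feats):
    # E_p[phi_k(y)] for every feature dimension k.
    return [sum(p * f[k] for p, f in zip(probs, feats))
            for k in range(len(feats[0]))]


def loss(logits, feats, target):
    # Assumed objective: squared distance between model and target moments.
    moments = model_moments(softmax(logits), feats)
    return sum((m - t) ** 2 for m, t in zip(moments, target))


def exact_grad(logits, feats, target):
    probs = softmax(logits)
    moments = model_moments(probs, feats)
    diff = [m - t for m, t in zip(moments, target)]
    # d E[phi_k] / d logit_j = p_j * (phi_k(j) - E[phi_k]) for a softmax.
    return [sum(2.0 * d * probs[j] * (feats[j][k] - moments[k])
                for k, d in enumerate(diff))
            for j in range(len(logits))]


def sampled_grad(logits, feats, target, n, rng):
    probs = softmax(logits)
    outcomes = list(range(len(probs)))
    # Two independent batches keep the product of expectations unbiased.
    batch_a = rng.choices(outcomes, weights=probs, k=n)
    batch_b = rng.choices(outcomes, weights=probs, k=n)
    diff = [sum(feats[y][k] for y in batch_a) / n - t
            for k, t in enumerate(target)]
    grad = [0.0] * len(logits)
    for y in batch_b:
        weight = 2.0 * sum(d * feats[y][k] for k, d in enumerate(diff))
        for j in range(len(logits)):
            # Score function of a softmax: d log p(y) / d logit_j.
            score = (1.0 if j == y else 0.0) - probs[j]
            grad[j] += weight * score / n
    return grad


FEATS = [[1, 0], [0, 1], [1, 1], [0, 0]]   # toy binary features per output
TARGET = [0.3, 0.6]                        # toy empirical feature means
LOGITS = [0.2, -0.1, 0.5, 0.0]

if __name__ == "__main__":
    rng = random.Random(0)
    print(exact_grad(LOGITS, FEATS, TARGET))
    print(sampled_grad(LOGITS, FEATS, TARGET, 1000, rng))
```

Averaging `sampled_grad` over many independent calls should approach `exact_grad`. Reusing a single batch for both factors would generally introduce a bias, which is why the sketch draws two.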
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Domain Adaptation and Few-Shot Learning
