Regressing Word and Sentence Embeddings for Regularization of Neural Machine Translation
Inigo Jauregi Unanue, Ehsan Zare Borzeshi, Massimo Piccardi

TL;DR
This paper introduces a novel regularization method for neural machine translation that regresses word and sentence embeddings during training, significantly enhancing translation quality especially in low-resource scenarios.
Contribution
It proposes regressing word and sentence embeddings as a new regularization technique to improve NMT generalization, particularly for limited data settings.
Findings
ReWE and ReSE outperform state-of-the-art baselines in multiple language pairs.
The methods yield up to +5.15 BLEU points improvement in low-resource Basque-English translation.
Regularizers improve word clustering and translation accuracy.
Abstract
In recent years, neural machine translation (NMT) has become the dominant approach in automated translation. However, like many other deep learning approaches, NMT suffers from overfitting when the amount of training data is limited. This is a serious issue for low-resource language pairs and many specialized translation domains that are inherently limited in the amount of available supervised data. For this reason, in this paper we propose regressing word (ReWE) and sentence (ReSE) embeddings at training time as a way to regularize NMT models and improve their generalization. During training, our models are trained to jointly predict categorical (words in the vocabulary) and continuous (word and sentence embeddings) outputs. An extensive set of experiments over four language pairs of variable training set size has showed that ReWE and ReSE can outperform strong state-of-the-art…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsNatural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications
