Reciprocal Supervised Learning Improves Neural Machine Translation

Minkai Xu; Mingxuan Wang; Zhouhan Lin; Hao Zhou; Weinan Zhang; Lei Li

arXiv:2012.02975·cs.CL·December 8, 2020

TL;DR

Reciprocal Supervised Learning (RSL) improves neural machine translation by having several diverse models each generate pseudo-parallel data and then training every model on the combined synthetic corpus, so that each model benefits from the others' differing inductive biases.

Contribution

This paper introduces RSL, a cooperative training method that improves NMT by jointly exploiting the agreement among multiple diverse models, and reports that it surpasses previous knowledge distillation approaches.

Findings

01. RSL significantly improves translation accuracy on multiple benchmarks.
02. It outperforms traditional knowledge distillation and ensemble methods.
03. RSL is more computationally efficient than ensemble approaches.

Abstract

Despite its recent success in image classification, self-training has achieved only limited gains on structured prediction tasks such as neural machine translation (NMT). This is mainly due to the compositionality of the target space, where the far-away prediction hypotheses lead to the notorious reinforced mistake problem. In this paper, we revisit the utilization of multiple diverse models and present a simple yet effective approach named Reciprocal-Supervised Learning (RSL). RSL first exploits individual models to generate pseudo parallel data, and then cooperatively trains each model on the combined synthetic corpus. RSL leverages the fact that differently parameterized models have different inductive biases, and that better predictions can be made by jointly exploiting the agreement among them. Unlike the previous knowledge distillation methods built upon a much stronger teacher,…
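The two steps described in the abstract (individual models generate pseudo parallel data, then every model trains on the combined corpus) can be illustrated with a toy sketch. This is not the authors' implementation: `ToyTranslator`, `reciprocal_round`, and the word-lookup "models" are hypothetical stand-ins for real NMT systems, and both the use of monolingual source sentences as input and the inclusion of the real parallel data in the combined corpus are assumptions made for the example.

```python
from typing import Dict, Iterable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


class ToyTranslator:
    """Hypothetical stand-in for an NMT model: a word-for-word lookup table."""

    def __init__(self, name: str, table: Dict[str, str]) -> None:
        self.name = name
        self.table = dict(table)

    def translate(self, source: str) -> str:
        # Unknown words are copied through unchanged.
        return " ".join(self.table.get(word, word) for word in source.split())

    def train(self, corpus: Iterable[Pair]) -> None:
        # Learn new word mappings from length-matched sentence pairs.
        # Existing entries are kept, and copied-through words are ignored.
        for source, target in corpus:
            src_words, tgt_words = source.split(), target.split()
            if len(src_words) != len(tgt_words):
                continue
            for s, t in zip(src_words, tgt_words):
                if s != t:
                    self.table.setdefault(s, t)


def reciprocal_round(
    models: List[ToyTranslator],
    parallel: List[Pair],
    monolingual: List[str],
) -> List[Pair]:
    # Step 1: every model labels the source text on its own,
    # before any model is updated.
    pseudo = [(src, m.translate(src)) for m in models for src in monolingual]
    # Step 2: each model trains on the real data plus all pseudo data
    # (mixing in the real data is an assumption of this sketch).
    combined = list(parallel) + pseudo
    for m in models:
        m.train(combined)
    return combined


# Two models with different partial knowledge (different "biases").
model_a = ToyTranslator("a", {"hallo": "hello"})
model_b = ToyTranslator("b", {"welt": "world"})
corpus = reciprocal_round(
    [model_a, model_b],
    parallel=[("guten tag", "good day")],
    monolingual=["hallo welt"],
)
print(model_a.translate("hallo welt"))
print(model_b.translate("hallo welt"))
```

In this toy setting each model fills in what the other is missing, which is only a loose analogy for how differing inductive biases might complement each other. The sketch does not model agreement between hypotheses or any of the training details of real NMT systems.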

Code & Models

Repositories

MinkaiXu/RSL-NMT (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Multimodal Machine Learning Applications

Methods: Knowledge Distillation