Revisiting the Weaknesses of Reinforcement Learning for Neural Machine Translation

Samuel Kiegeland; Julia Kreutzer

arXiv:2106.08942 · cs.CL · June 17, 2021

TL;DR

This paper re-examines the weaknesses that Choshen et al. (2020) attribute to reinforcement learning (RL) for neural machine translation (NMT). Experiments on in-domain and cross-domain adaptation provide empirical counter-evidence to those claims and show that exploration and reward scaling are important.

Contribution

It revisits prior claims about RL weaknesses in NMT under a wider range of configurations, emphasizing the roles of exploration and reward scaling.

Findings

01. Exploration and reward scaling are important for RL success in NMT.

02. Empirical evidence counters the weaknesses previously attributed to RL for NMT.

03. The experiments cover both in-domain and cross-domain adaptation.

Abstract

Policy gradient algorithms have found wide adoption in NLP, but have recently become subject to criticism, doubting their suitability for NMT. Choshen et al. (2020) identify multiple weaknesses and suspect that their success is determined by the shape of output distributions rather than the reward. In this paper, we revisit these claims and study them under a wider range of configurations. Our experiments on in-domain and cross-domain adaptation reveal the importance of exploration and reward scaling, and provide empirical counter-evidence to these claims.
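To make the two factors named in the abstract concrete, here is a minimal sketch of a REINFORCE-style gradient estimate on a toy categorical policy. It is not taken from the paper or its repository: the function names, the use of sampling temperature as the exploration knob, and the min-max scaling with a mean baseline are all assumptions chosen for illustration.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; temperature > 1 flattens the distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def scale_rewards(rewards):
    """Min-max scale a batch of rewards to [0, 1], then subtract the batch mean.

    Assumed scaling scheme for illustration; other choices are possible.
    """
    lo, hi = min(rewards), max(rewards)
    if hi == lo:
        # All samples scored the same, so there is no learning signal.
        return [0.0 for _ in rewards]
    normed = [(r - lo) / (hi - lo) for r in rewards]
    mean = sum(normed) / len(normed)
    return [r - mean for r in normed]


def reinforce_gradient(logits, reward_fn, num_samples=8, temperature=1.0, rng=None):
    """Monte Carlo estimate of the gradient of the expected scaled reward
    with respect to the logits, under the temperature-smoothed policy."""
    rng = rng or random.Random(0)
    probs = softmax(logits, temperature)
    # Exploration: sample actions instead of taking the argmax.
    actions = rng.choices(range(len(logits)), weights=probs, k=num_samples)
    rewards = scale_rewards([reward_fn(a) for a in actions])
    grad = [0.0] * len(logits)
    for a, r in zip(actions, rewards):
        for i, p in enumerate(probs):
            # d log p(a) / d logit_i = (1[i == a] - p_i) / temperature
            indicator = 1.0 if i == a else 0.0
            grad[i] += r * (indicator - p) / temperature / num_samples
    return grad


if __name__ == "__main__":
    # Hypothetical reward: only action 2 is "correct".
    grad = reinforce_gradient([0.0, 0.0, 0.0], lambda a: float(a == 2), num_samples=64)
    print(grad)
```

In this toy setting a higher temperature spreads samples over more actions, and the scaling step keeps the update signed relative to the batch average, so actions that score below average are pushed down rather than merely pushed up less.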

Code & Models

Repositories

samuki/reinforce-joey (PyTorch, official)