The UMD Neural Machine Translation Systems at WMT17 Bandit Learning Task

Amr Sharaf; Shi Feng; Khanh Nguyen; Kianté Brantley; Hal Daumé III

arXiv:1708.01318·cs.CL·August 9, 2017

TL;DR

This paper presents the University of Maryland's neural machine translation systems designed for the WMT17 Bandit Learning Task, focusing on domain adaptation and learning from limited feedback.

Contribution

It extends a standard neural machine translation system in two ways: robust reinforcement learning techniques for learning from bandit feedback, and domain adaptation through data selection from a large parallel corpus.

Findings

01. Effective adaptation to new domains using bandit feedback

02. Improved translation quality through reinforcement learning

03. Successful application of data selection for domain adaptation

Abstract

We describe the University of Maryland machine translation systems submitted to the WMT17 German-English Bandit Learning Task. The task is to adapt a translation system to a new domain, using only bandit feedback: the system receives a German sentence to translate, produces an English sentence, and only gets a scalar score as feedback. Targeting these two challenges (adaptation and bandit learning), we built a standard neural machine translation system and extended it in two ways: (1) robust reinforcement learning techniques to learn effectively from the bandit feedback, and (2) domain adaptation using data selection from a large corpus of parallel data.
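
The bandit feedback protocol described in the abstract can be illustrated with a toy loop. The sketch below is not the submitted system: it assumes a small fixed set of candidate translations per source sentence, a softmax policy over those candidates, and a plain REINFORCE update with a running-average baseline. The names `bandit_train`, `toy_feedback`, and `CANDIDATES` are hypothetical, and the reward function only stands in for the scalar score the task would return.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bandit_train(candidates, feedback, rounds=2000, lr=0.5, seed=0):
    """Toy bandit loop: sample one output, observe one scalar, update.

    candidates: dict mapping a source sentence to a list of translations.
    feedback:   callable (source, translation) -> float score.
    """
    rng = random.Random(seed)
    logits = {src: [0.0] * len(outs) for src, outs in candidates.items()}
    sources = list(candidates)
    baseline = 0.0  # running mean of observed rewards (variance reduction)

    for t in range(1, rounds + 1):
        src = rng.choice(sources)                 # receive a source sentence
        probs = softmax(logits[src])
        k = rng.choices(range(len(probs)), weights=probs)[0]
        reward = feedback(src, candidates[src][k])  # only a scalar comes back
        advantage = reward - baseline

        # Gradient of log p(k) w.r.t. the logits is onehot(k) - probs.
        for j in range(len(probs)):
            grad = (1.0 if j == k else 0.0) - probs[j]
            logits[src][j] += lr * advantage * grad

        baseline += (reward - baseline) / t
    return logits


# Hypothetical data: one German sentence with three candidate outputs.
CANDIDATES = {"Guten Morgen": ["good morning", "morning good", "good evening"]}


def toy_feedback(source, translation):
    """Stand-in for the task's scalar score; the values are made up."""
    scores = {"good morning": 1.0, "morning good": 0.3, "good evening": 0.0}
    return scores[translation]


if __name__ == "__main__":
    learned = bandit_train(CANDIDATES, toy_feedback)
    for src, outs in CANDIDATES.items():
        for out, p in zip(outs, softmax(learned[src])):
            print(f"{src!r} -> {out!r}: {p:.3f}")
```

The learner never sees a reference translation or the scores of outputs it did not sample, which is what separates this setting from supervised fine-tuning. A real system would replace the candidate table with a neural decoder and apply the same kind of update to its parameters.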
