The UMD Neural Machine Translation Systems at WMT17 Bandit Learning Task
Amr Sharaf, Shi Feng, Khanh Nguyen, Kianté Brantley, Hal Daumé III

TL;DR
This paper presents the University of Maryland's neural machine translation systems for the WMT17 German-English Bandit Learning Task, where a system must adapt to a new domain while receiving only a scalar score for each translation it produces.
Contribution
The paper extends a standard neural machine translation system in two ways: robust reinforcement learning techniques for learning from bandit feedback, and data selection from a large parallel corpus for domain adaptation.
Findings
Effective adaptation to new domains using bandit feedback
Improved translation quality through reinforcement learning
Successful application of data selection for domain adaptation
Abstract
We describe the University of Maryland machine translation systems submitted to the WMT17 German-English Bandit Learning Task. The task is to adapt a translation system to a new domain, using only bandit feedback: the system receives a German sentence to translate, produces an English sentence, and only gets a scalar score as feedback. Targeting these two challenges (adaptation and bandit learning), we built a standard neural machine translation system and extended it in two ways: (1) robust reinforcement learning techniques to learn effectively from the bandit feedback, and (2) domain adaptation using data selection from a large corpus of parallel data.
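The bandit feedback protocol described in the abstract can be illustrated with a toy sketch. This is not the authors' system: it assumes a hypothetical setting with three fixed candidate outputs and made-up scores, and it uses a plain REINFORCE update with a running-average baseline as a stand-in for the unspecified "robust reinforcement learning techniques". The function name `run_bandit_loop` and all parameter values are illustrative assumptions.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def run_bandit_loop(rewards, steps=2000, lr=0.5, seed=0):
    """Toy bandit loop: sample an output, observe one scalar score, update.

    `rewards[i]` is a hypothetical score for candidate output i. The learner
    never sees the whole list; it only observes the score of the candidate
    it actually sampled, which is what makes the feedback "bandit".
    """
    rng = random.Random(seed)
    logits = [0.0] * len(rewards)  # policy parameters, one per candidate
    baseline = 0.0                 # running mean of observed scores
    for t in range(1, steps + 1):
        probs = softmax(logits)
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        reward = rewards[action]   # the only feedback the learner receives
        advantage = reward - baseline
        # REINFORCE: d/d logit_i of log pi(action) = 1[i == action] - probs[i]
        for i in range(len(logits)):
            grad = (1.0 if i == action else 0.0) - probs[i]
            logits[i] += lr * advantage * grad
        baseline += (reward - baseline) / t  # incremental mean update
    return softmax(logits)


# Hypothetical scores for three candidate translations of one source sentence.
final_probs = run_bandit_loop([0.2, 0.9, 0.5])
best = max(range(len(final_probs)), key=final_probs.__getitem__)
```

Subtracting the baseline does not change the expected gradient but reduces its variance, which matters when each update rests on a single scalar score. A real translation system would replace the three-way softmax with a sequence model over English sentences.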
