Reward Augmented Maximum Likelihood for Neural Structured Prediction
Mohammad Norouzi, Samy Bengio, Zhifeng Chen, Navdeep Jaitly, Mike Schuster, Yonghui Wu, Dale Schuurmans

TL;DR
This paper introduces Reward Augmented Maximum Likelihood (RAML), a method that directly incorporates task-specific reward functions into neural structured prediction models, improving performance over traditional maximum likelihood approaches.
Contribution
The paper proposes a novel framework linking log-likelihood with expected reward, enabling direct reward optimization in neural structured prediction models.
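As a rough sketch of that link, the two objects involved can be written out as follows. The notation is assumed here for illustration and is not taken from this page: r(y, y*) is the task reward of output y against the reference y*, τ > 0 is a temperature that scales the reward, q is the distribution proportional to exponentiated scaled rewards, and p_θ is the model.

```latex
% Target distribution: proportional to exponentiated scaled rewards
q(y \mid y^{*}; \tau) = \frac{\exp\{ r(y, y^{*}) / \tau \}}{\sum_{y'} \exp\{ r(y', y^{*}) / \tau \}}

% Reward-augmented likelihood: cross-entropy between q and the model
\mathcal{L}(\theta; \tau) = - \sum_{(x, y^{*})} \sum_{y} q(y \mid y^{*}; \tau) \, \log p_{\theta}(y \mid x)
```

Under these definitions, minimizing the second expression differs from minimizing KL(q || p_θ) only by the entropy of q, which does not depend on θ. Separately, maximizing expected reward plus τ times the entropy of p_θ has its optimum at p_θ = q, which is the sense in which the likelihood and expected reward objectives share a solution.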
Findings
RAML improves speech recognition accuracy.
RAML enhances machine translation quality.
RAML training outperforms the maximum likelihood (MLE) baseline on both tasks.
Abstract
A key problem in structured output prediction is direct optimization of the task reward function that matters for test evaluation. This paper presents a simple and computationally efficient approach to incorporate task reward into a maximum likelihood framework. By establishing a link between the log-likelihood and expected reward objectives, we show that an optimal regularized expected reward is achieved when the conditional distribution of the outputs given the inputs is proportional to their exponentiated scaled rewards. Accordingly, we present a framework to smooth the predictive probability of the outputs using their corresponding rewards. We optimize the conditional log-probability of augmented outputs that are sampled proportionally to their exponentiated scaled rewards. Experiments on neural sequence to sequence models for speech recognition and machine translation show notable…
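The sampling scheme described in the abstract can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: it assumes a small, fully enumerable set of candidate outputs, and the names `exponentiated_payoff`, `raml_loss`, `sampled_raml_loss`, and the temperature `tau` are hypothetical choices made for this example. Real sequence models cannot enumerate outputs, so they would rely on the sampled variant only.

```python
import math
import random


def exponentiated_payoff(rewards, tau):
    """Return q(y) proportional to exp(reward(y) / tau) over a finite candidate set."""
    scaled = [r / tau for r in rewards]
    peak = max(scaled)  # subtract the max before exponentiating, for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def raml_loss(log_probs, rewards, tau):
    """Exact reward-weighted negative log-likelihood: -sum_y q(y) * log p(y | x)."""
    q = exponentiated_payoff(rewards, tau)
    return -sum(q_y * lp for q_y, lp in zip(q, log_probs))


def sampled_raml_loss(log_probs, rewards, tau, n_samples, rng):
    """Monte Carlo estimate: draw outputs from q, then average their negative log-probability."""
    q = exponentiated_payoff(rewards, tau)
    drawn = rng.choices(range(len(q)), weights=q, k=n_samples)
    return -sum(log_probs[i] for i in drawn) / n_samples


# Toy data (assumed): three candidate outputs; index 0 is the reference.
# Rewards are illustrative, e.g. a negative edit distance to the reference.
rewards = [0.0, -1.0, -2.0]
model_probs = [0.7, 0.2, 0.1]  # hypothetical model distribution p(y | x)
log_probs = [math.log(p) for p in model_probs]

q = exponentiated_payoff(rewards, tau=1.0)
exact = raml_loss(log_probs, rewards, tau=1.0)
estimate = sampled_raml_loss(log_probs, rewards, 1.0, 20000, random.Random(0))
```

In this sketch, a very small `tau` concentrates `q` on the highest-reward output, so the loss reduces to the ordinary negative log-likelihood of the reference. A larger `tau` spreads probability mass onto lower-reward outputs, which is the smoothing effect the abstract refers to.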
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
