Transfer Reward Learning for Policy Gradient-Based Text Generation

James O' Neill; Danushka Bollegala

arXiv:1909.03622 · cs.LG · September 10, 2019 · 1 citation

TL;DR

This paper introduces a transfer learning approach for reward models in policy gradient-based text generation, improving semantic evaluation metrics in image captioning tasks.

Contribution

The paper proposes a Transferable Reward Learner: policy gradient models are trained with model-based rewards transferred from sentence similarity tasks, which the authors report outperform rewards based on n-gram overlap measures.
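To make the limitation of n-gram overlap rewards concrete, here is a minimal sketch of a clipped n-gram precision score, the core ingredient of overlap-based measures. It is not the paper's code: the function names and the example sentences are illustrative assumptions. A paraphrase with the same meaning as the reference shares no bigrams with it, so an overlap reward gives it no credit.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of contiguous n-grams in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision of a candidate against one reference.

    Illustrative helper (not from the paper): whitespace tokenization,
    lowercasing, and a single reference sentence are simplifying assumptions.
    """
    cand = Counter(ngrams(candidate.lower().split(), n))
    ref = Counter(ngrams(reference.lower().split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram is credited at most as often as it occurs in the reference.
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


# Hypothetical caption pair, chosen only for illustration.
reference = "a man is riding a horse"
exact_copy = "a man is riding a horse"
paraphrase = "a person rides a pony"

copy_score = ngram_precision(exact_copy, reference, 2)
paraphrase_score = ngram_precision(paraphrase, reference, 2)
print(copy_score, paraphrase_score)
```

The exact copy gets full bigram credit while the paraphrase gets none, even though both describe the same scene. A learned sentence similarity model is meant to close that gap.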

Findings

01. Improved semantic similarity scores on MSCOCO dataset.

02. Enhanced performance on Flickr-30k dataset.

03. Demonstrated general applicability of transfer learning in reward models.

Abstract

Task-specific scores are often used to optimize for and evaluate the performance of conditional text generation systems. However, such scores are non-differentiable and cannot be used in the standard supervised learning paradigm. Hence, policy gradient methods are used since the gradient can be computed without requiring a differentiable objective. However, we argue that current n-gram overlap based measures that are used as rewards can be improved by using model-based rewards transferred from tasks that directly compare the similarity of sentence pairs. These reward models either output a score of sentence-level syntactic and semantic similarity between entire predicted and target sentences as the expected return, or for intermediate phrases as segmented accumulative rewards. We demonstrate that using a Transferable Reward Learner leads to improved results on semantical…
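The point that policy gradients need no differentiable objective can be illustrated with a small REINFORCE-style sketch. This is a toy under stated assumptions, not the authors' implementation: the policy is a softmax over a three-word vocabulary, `black_box_reward` is a hypothetical stand-in for any non-differentiable score (an overlap measure or a learned similarity model), and the batch-mean baseline is a common variance-reduction heuristic chosen here for simplicity.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def black_box_reward(token, target):
    """Hypothetical non-differentiable score: 1 if the token matches, else 0."""
    return 1.0 if token == target else 0.0


def reinforce_step(logits, vocab, target, rng, n_samples=64, lr=0.5):
    """One gradient-ascent step on expected reward using sampled actions.

    Uses d/d(logit_j) log pi(a) = 1[j == a] - pi(j); the reward is only
    evaluated on samples and is never differentiated.
    """
    probs = softmax(logits)
    samples = rng.choices(range(len(vocab)), weights=probs, k=n_samples)
    rewards = [black_box_reward(vocab[i], target) for i in samples]
    baseline = sum(rewards) / n_samples  # batch-mean baseline (an assumption)
    grad = [0.0] * len(logits)
    for i, r in zip(samples, rewards):
        advantage = r - baseline
        for j in range(len(logits)):
            indicator = 1.0 if j == i else 0.0
            grad[j] += advantage * (indicator - probs[j]) / n_samples
    return [l + lr * g for l, g in zip(logits, grad)]


# Toy setup: all names and values below are illustrative.
vocab = ["cat", "dog", "horse"]
rng = random.Random(0)
logits = [0.0, 0.0, 0.0]
for _ in range(200):
    logits = reinforce_step(logits, vocab, "horse", rng)
final_probs = softmax(logits)
print(final_probs)
```

Starting from a uniform policy, the probability assigned to the rewarded token should rise well above 1/3. Swapping `black_box_reward` for a different scorer changes nothing else in the update, which is what lets a transferred reward model replace an n-gram overlap measure.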

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Natural Language Processing Techniques