Evaluating Rewards for Question Generation Models

Tom Hosking; Sebastian Riedel

arXiv:1902.11049 · cs.CL · June 4, 2019 · 5 cites

TL;DR

This paper investigates reward-based training for question generation models. It finds that the metrics used as rewards reflect human judgment poorly, and that models tend to exploit weaknesses in the reward rather than produce better questions.

Contribution

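The paper optimises question generation models directly for quality metrics using policy gradient reinforcement learning, including a novel reward from a discriminator learned directly from the training data. It finds that these metrics do not align well with human evaluations.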

Findings

01. Reward optimization improves metric scores but not human-perceived quality.

02. Metrics are poorly aligned with human judgment of question quality.

03. Models exploit weaknesses in reward functions rather than genuinely improving question quality.

Abstract

Recent approaches to question generation have used modifications to a Seq2Seq architecture inspired by advances in machine translation. Models are trained using teacher forcing to optimise only the one-step-ahead prediction. However, at test time, the model is asked to generate a whole sequence, causing errors to propagate through the generation process (exposure bias). A number of authors have proposed countering this bias by optimising for a reward that is less tightly coupled to the training data, using reinforcement learning. We optimise directly for quality metrics, including a novel approach using a discriminator learned directly from the training data. We confirm that policy gradient methods can be used to decouple training from the ground truth, leading to increases in the metrics used as rewards. We perform a human evaluation, and show that although these metrics have…
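The contrast the abstract draws between teacher forcing and reward optimisation can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the fixed per-step logits (a real Seq2Seq decoder would condition each step on the tokens generated so far), and the token-overlap reward are all assumptions made for illustration. The overlap reward merely stands in for whichever quality metric or discriminator score is used as the reward.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def teacher_forcing_loss(step_logits, target_ids):
    """Mean negative log-likelihood of the ground-truth token at each step."""
    nll = 0.0
    for logits, tgt in zip(step_logits, target_ids):
        nll -= math.log(softmax(logits)[tgt])
    return nll / len(target_ids)


def overlap_reward(sampled, reference=(2, 0, 1)):
    """Hypothetical reward: fraction of positions matching a reference."""
    return sum(s == r for s, r in zip(sampled, reference)) / len(reference)


def reinforce_loss(step_logits, reward_fn, baseline=0.0, rng=None):
    """Sample a whole sequence and build the REINFORCE surrogate loss.

    The loss is -(R - b) * sum_t log p(y_t), so minimising it raises the
    probability of sampled sequences whose reward exceeds the baseline.
    """
    rng = rng or random.Random(0)
    sampled, log_prob = [], 0.0
    for logits in step_logits:
        probs = softmax(logits)
        tok = rng.choices(range(len(probs)), weights=probs)[0]
        sampled.append(tok)
        log_prob += math.log(probs[tok])
    reward = reward_fn(sampled)
    return -(reward - baseline) * log_prob, sampled, reward


# Toy usage: three decoding steps over a three-token vocabulary.
step_logits = [[0.0, 0.0, 0.0]] * 3
tf_loss = teacher_forcing_loss(step_logits, [2, 0, 1])
pg_loss, sampled, reward = reinforce_loss(step_logits, overlap_reward)
```

The teacher-forcing loss scores only the one-step-ahead prediction against the ground truth, while the surrogate loss scores a whole sampled sequence by its reward. Because the second objective depends only on the reward function, any flaw in that function becomes something the model can exploit.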

Code & Models

Repositories

bloomsburyai/question-generation (TensorFlow)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence