Generating Self-Contained and Summary-Centric Question Answer Pairs via Differentiable Reward Imitation Learning

Li Zhou; Kevin Small; Yong Zhang; Sandeep Atluri

arXiv:2109.04689 · cs.CL · September 13, 2021

Open Access · 1 Repo · 1 Model · 4 Datasets

TL;DR

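This paper introduces a model that generates self-contained, summary-centric question-answer pairs from news articles. The model is trained on a new dataset of question-titled articles paired with summaries, and generation is reinforced with a differentiable reward function to mitigate exposure bias.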

Contribution

The work contributes a new dataset of news articles whose titles are questions, paired with summaries of varying length, and a QA pair generation model, reinforced with a differentiable reward function, that produces concise answers capturing each article's gist.

Findings

01. Generated QA pairs capture the central gist of each article
02. Reinforcing generation with a differentiable reward function mitigates exposure bias
03. Automatic metrics and human evaluation both show high answer accuracy

Abstract

Motivated by suggested question generation in conversational news recommendation systems, we propose a model for generating question-answer pairs (QA pairs) with self-contained, summary-centric questions and length-constrained, article-summarizing answers. We begin by collecting a new dataset of news articles with questions as titles and pairing them with summaries of varying length. This dataset is used to learn a QA pair generation model producing summaries as answers that balance brevity with sufficiency jointly with their corresponding questions. We then reinforce the QA pair generation process with a differentiable reward function to mitigate exposure bias, a common problem in natural language generation. Both automatic metrics and human evaluation demonstrate these QA pairs successfully capture the central gists of the articles and achieve high answer accuracy.
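
The idea of a differentiable reward can be illustrated with a toy sketch. This is not the authors' implementation, and the abstract does not specify how their reward is computed. The sketch assumes one common relaxation: instead of scoring a sampled token (which blocks gradients), the reward is the expected score under the model's full output distribution at each decoding step, so it can be optimized directly on the model's own predictions. The names `soft_reward`, `soft_reward_grad`, `ascend`, and the per-token `token_scores` table are hypothetical stand-ins for a learned reward model; plain Python is used to keep the example dependency-free.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one vector of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_reward(step_logits, token_scores):
    """Mean expected token score over all decoding steps.

    The full distribution at each step is scored rather than a sampled
    token, so the reward is a smooth function of the logits.
    token_scores is a hypothetical per-token reward table (an assumption).
    """
    total = 0.0
    for logits in step_logits:
        probs = softmax(logits)
        total += sum(p * s for p, s in zip(probs, token_scores))
    return total / len(step_logits)


def soft_reward_grad(step_logits, token_scores):
    """Analytic gradient of soft_reward with respect to every logit.

    For one step, d/dz_i sum_j p_j * s_j = p_i * (s_i - sum_j p_j * s_j).
    """
    steps = len(step_logits)
    grads = []
    for logits in step_logits:
        probs = softmax(logits)
        expected = sum(p * s for p, s in zip(probs, token_scores))
        grads.append(
            [p * (s - expected) / steps for p, s in zip(probs, token_scores)]
        )
    return grads


def ascend(step_logits, token_scores, lr=1.0, iters=50):
    """Plain gradient ascent on the logits, standing in for a generator update."""
    current = [row[:] for row in step_logits]
    for _ in range(iters):
        grads = soft_reward_grad(current, token_scores)
        current = [
            [z + lr * g for z, g in zip(row, grad_row)]
            for row, grad_row in zip(current, grads)
        ]
    return current


if __name__ == "__main__":
    scores = [0.1, 0.9, 0.4]              # hypothetical reward per vocabulary token
    start = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # two decoding steps, uniform logits
    tuned = ascend(start, scores)
    print("reward before:", soft_reward(start, scores))
    print("reward after: ", soft_reward(tuned, scores))
```

Because the reward is computed on the model's own output distribution rather than on ground-truth prefixes, an objective of this shape exposes the generator to its own predictions during training, which is the general motivation behind reward-based remedies for exposure bias.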

Code & Models

Repositories

amazon-research/sc2qa-dril
PyTorch · Official

Models

sc2qa/msmarco_qa_classifier
model · 8 dl

Datasets

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications