A Deep Reinforced Model for Abstractive Summarization

Romain Paulus; Caiming Xiong; Richard Socher

arXiv:1705.04304 · cs.CL · November 15, 2017 · 1.3k cites

TL;DR

This paper introduces a neural network model with intra-attention and a combined training approach using supervised learning and reinforcement learning, significantly improving abstractive summarization quality for longer documents.

Contribution

The paper presents a novel intra-attention mechanism and a hybrid training method that enhances the coherence and readability of summaries in abstractive summarization models.

Findings

01 Achieves a 41.16 ROUGE-1 score on the CNN/Daily Mail dataset.

02 Improves on previous state-of-the-art models on automatic metrics.

03 Human evaluation shows that the model produces higher-quality summaries.
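
For readers unfamiliar with the metric in the first finding, ROUGE-1 measures unigram overlap between a generated summary and a reference summary. The following is a minimal sketch, assuming lowercase whitespace tokenization and no stemming; the function name `rouge_1` is hypothetical, and official ROUGE tooling applies additional preprocessing, so its scores will differ. Scores are commonly reported multiplied by 100. This page does not state which variant (recall or F1) the 41.16 figure refers to.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> tuple[float, float, float]:
    """Return (precision, recall, f1) of unigram overlap.

    Simplified illustration: whitespace tokens, lowercased, no stemming.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each unigram counts at most as often as it
    # appears in the other text (Counter intersection takes the minimum).
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    p, r, f = rouge_1("the cat sat on the mat", "the cat lay on the mat")
    print(f"precision={p:.4f} recall={r:.4f} f1={f:.4f}")
```

Clipping the counts keeps a summary that repeats one matching word many times from inflating its score, which matters for a model family that the abstract describes as prone to repetitive phrases.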

Abstract

Attentional, RNN-based encoder-decoder models for abstractive summarization have achieved good performance on short input and output sequences. For longer documents and summaries however these models often include repetitive and incoherent phrases. We introduce a neural network model with a novel intra-attention that attends over the input and continuously generated output separately, and a new training method that combines standard supervised word prediction and reinforcement learning (RL). Models trained only with supervised learning often exhibit "exposure bias" - they assume ground truth is provided at each step during training. However, when standard word prediction is combined with the global sequence prediction training of RL the resulting summaries become more readable. We evaluate this model on the CNN/Daily Mail and New York Times datasets. Our model obtains a 41.16 ROUGE-1…
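
The abstract describes combining supervised word prediction with RL but does not give the objective itself. The following is a minimal sketch of one way such a combination can be written, assuming a weighted sum of the two losses and a policy-gradient RL term that compares the reward of a sampled summary against a baseline reward. The function names, the weight `gamma`, and the choice of baseline are all illustrative assumptions and are not taken from this page.

```python
import math


def ml_loss(reference_log_probs: list[float]) -> float:
    """Supervised loss: negative log-likelihood of the reference tokens,
    computed with the ground-truth prefix supplied at every step."""
    return -sum(reference_log_probs)


def rl_loss(sampled_log_probs: list[float],
            sampled_reward: float,
            baseline_reward: float) -> float:
    """Policy-gradient surrogate loss (assumed form).

    When the sampled summary scores above the baseline, the coefficient
    is negative, so minimizing the loss raises the log-probability of
    the sampled tokens; when it scores below, it lowers them.
    """
    return (baseline_reward - sampled_reward) * sum(sampled_log_probs)


def mixed_loss(ml: float, rl: float, gamma: float) -> float:
    """Weighted sum of the two objectives; gamma in [0, 1] is a
    hypothetical mixing weight (gamma=0 is purely supervised)."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must be in [0, 1]")
    return gamma * rl + (1.0 - gamma) * ml


if __name__ == "__main__":
    # Toy per-token log-probabilities; the rewards stand in for a
    # sequence-level score such as ROUGE.
    ref_lp = [math.log(0.5), math.log(0.25)]
    samp_lp = [math.log(0.5), math.log(0.25)]
    ml = ml_loss(ref_lp)
    rl = rl_loss(samp_lp, sampled_reward=0.4, baseline_reward=0.3)
    print(mixed_loss(ml, rl, gamma=0.5))
```

The supervised term only ever sees ground-truth prefixes, which is the source of the exposure bias the abstract mentions. The RL term scores whole sequences that the model generated itself, so mixing the two keeps token-level fluency while also optimizing a sequence-level measure.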

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques