On Extractive and Abstractive Neural Document Summarization with Transformer Language Models

Sandeep Subramanian; Raymond Li; Jonathan Pilault; Christopher Pal

arXiv:1909.03186 · cs.CL · April 29, 2020 · 25 cites

TL;DR

This paper introduces a neural summarization method combining extractive and abstractive techniques using transformer models, significantly improving summary quality for long documents.

Contribution

The paper proposes a simple extractive step that runs before transformer-based abstractive generation. The resulting summaries are more abstractive than those of prior work that employs a copy mechanism, while still achieving higher ROUGE scores.

Findings

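01 An extractive step before generation significantly improves summarization results.

02 The method produces more abstractive summaries than prior work that employs a copy mechanism.

03 It still achieves higher ROUGE scores than that prior work.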
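
For readers unfamiliar with the metric named in the findings, here is a minimal sketch of a unigram-overlap score in the spirit of ROUGE-1. It assumes plain lowercase whitespace tokenization and omits the stemming and other preprocessing that standard ROUGE toolkits apply, so it will not reproduce the numbers reported in the paper. The function name `rouge_1` is hypothetical.

```python
from collections import Counter


def rouge_1(candidate, reference):
    """Simplified ROUGE-1: clipped unigram overlap between two strings.

    Assumption: tokens are lowercase whitespace-separated words; real ROUGE
    implementations use more careful tokenization and optional stemming.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count per token (clipping).
    overlap = sum((cand_counts & ref_counts).values())

    precision = overlap / max(sum(cand_counts.values()), 1)
    recall = overlap / max(sum(ref_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    p, r, f = rouge_1("the cat sat", "the cat sat on the mat")
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

A short candidate that copies only reference words gets perfect precision but low recall, which is why the F1 form is usually reported alongside the two components.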

Abstract

We present a method to produce abstractive summaries of long documents that exceed several thousand words via neural abstractive summarization. We perform a simple extractive step before generating a summary, which is then used to condition the transformer language model on relevant information before being tasked with generating a summary. We show that this extractive step significantly improves summarization results. We also show that this approach produces more abstractive summaries compared to prior work that employs a copy mechanism while still achieving higher rouge scores. Note: The abstract above was not written by the authors, it was generated by one of the models presented in this paper.
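
The extract-then-generate flow described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the extractor here is a simple word-frequency heuristic standing in for whatever extractive model the paper uses, and the function names and the `SUMMARY:` separator are hypothetical. The sketch stops at building the conditioning text, which a transformer language model would then continue.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_sentences(document, k=2):
    """Stand-in extractive step (an assumption, not the paper's extractor).

    Scores each sentence by the average document-level frequency of its
    words and returns the k best sentences in their original order.
    """
    sentences = split_sentences(document)
    freq = Counter(re.findall(r"[a-z]+", document.lower()))

    def score(sentence):
        words = re.findall(r"[a-z]+", sentence.lower())
        return sum(freq[w] for w in words) / max(len(words), 1)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:k])]


def build_conditioning_text(document, k=2, separator="\nSUMMARY:\n"):
    """Join the extracted sentences and append a hypothetical separator.

    A language model would be asked to continue from the end of this text.
    """
    return " ".join(extract_sentences(document, k)) + separator


if __name__ == "__main__":
    doc = ("The model reads the paper. Cats sleep. "
           "The model writes the summary of the paper.")
    print(build_conditioning_text(doc, k=2))
```

The point of the extractive step is to shrink a document of several thousand words down to text that fits the language model's context while keeping the most relevant material in front of it.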

Code & Models

Repositories

Bread-and-Code/Text-Summarization
tf

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Softmax