A Neural Attention Model for Abstractive Sentence Summarization

Alexander M. Rush; Sumit Chopra; Jason Weston

arXiv:1509.00685·cs.CL·September 4, 2015

TL;DR

This paper introduces a neural attention-based model for abstractive sentence summarization that generates each summary word conditioned on the input sentence, and reports significant gains over several strong baselines on the DUC-2004 shared task.

Contribution

It presents a structurally simple, fully data-driven local attention-based model for abstractive sentence summarization that can be trained end-to-end and scales to a large amount of training data.

Findings

01 Significant performance gains on the DUC-2004 shared task compared with several strong baselines

02 Structurally simple model that scales to a large amount of training data

03 Can be trained end-to-end

Abstract

Summarization based on text extraction is inherently limited, but generation-style abstractive methods have proven challenging to build. In this work, we propose a fully data-driven approach to abstractive sentence summarization. Our method utilizes a local attention-based model that generates each word of the summary conditioned on the input sentence. While the model is structurally simple, it can easily be trained end-to-end and scales to a large amount of training data. The model shows significant performance gains on the DUC-2004 shared task compared with several strong baselines.
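
The idea of generating each summary word conditioned on the input sentence through attention can be illustrated with a toy sketch. This is not the authors' implementation: the random embeddings, the dot-product scoring, the mean-of-recent-words query, and the names `attend` and `generate` are all assumptions made for illustration, and nothing here is trained.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attend(input_vecs, query):
    """Return attention weights over input positions and the weighted-average context."""
    weights = softmax([dot(x, query) for x in input_vecs])
    dim = len(input_vecs[0])
    context = [sum(w * x[d] for w, x in zip(weights, input_vecs)) for d in range(dim)]
    return weights, context


def generate(input_words, vocab, embed, max_len=5, window=2):
    """Greedy decoding: each word depends on the input and the last `window` outputs."""
    input_vecs = [embed[w] for w in input_words]
    dim = len(input_vecs[0])
    summary = ["<s>"] * window  # padding so the first step has a full window
    for _ in range(max_len):
        # Assumed query: mean embedding of the most recent output words.
        recent = [embed[w] for w in summary[-window:]]
        query = [sum(v[d] for v in recent) / window for d in range(dim)]
        _, context = attend(input_vecs, query)
        # Assumed scoring: similarity of each candidate word to the attended context.
        probs = softmax([dot(embed[w], context) for w in vocab])
        best = vocab[max(range(len(vocab)), key=probs.__getitem__)]
        if best == "</s>":
            break
        summary.append(best)
    return summary[window:]


random.seed(0)
vocab = ["the", "storm", "closed", "roads", "today", "</s>"]
# Untrained random embeddings, so the generated words are not meaningful.
embed = {w: [random.uniform(-1, 1) for _ in range(4)] for w in vocab + ["<s>"]}
source = ["the", "storm", "closed", "roads", "today"]

weights, context = attend([embed[w] for w in source], embed["<s>"])
summary = generate(source, vocab, embed)
print(weights)
print(summary)
```

The point of the sketch is the data flow: at every step the decoder recomputes attention weights over the input positions from its recent output context, so different summary words can draw on different parts of the input sentence. In a real system the embeddings and scoring parameters would be learned end-to-end from sentence-summary pairs.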
