Generating Wikipedia by Summarizing Long Sequences

Peter J. Liu; Mohammad Saleh; Etienne Pot; Ben Goodrich; Ryan Sepassi; Lukasz Kaiser; Noam Shazeer

arXiv:1801.10198·cs.CL·February 1, 2018·75 cites

TL;DR

This paper presents a scalable decoder-only neural model for generating Wikipedia articles by summarizing long sequences of source documents, combining extractive and abstractive methods to produce coherent, factual content.

Contribution

It introduces a novel decoder-only architecture capable of attending to very long sequences for multi-document summarization and Wikipedia article generation.

Findings

01. The model generates fluent, coherent multi-sentence paragraphs.

02. It effectively extracts relevant factual information from source documents.

03. Performance is validated through perplexity, ROUGE scores, and human evaluations.

Abstract

We show that generating English Wikipedia articles can be approached as a multi-document summarization of source documents. We use extractive summarization to coarsely identify salient information and a neural abstractive model to generate the article. For the abstractive model, we introduce a decoder-only architecture that can scalably attend to very long sequences, much longer than typical encoder-decoder architectures used in sequence transduction. We show that this model can generate fluent, coherent multi-sentence paragraphs and even whole Wikipedia articles. When given reference documents, we show it can extract relevant factual information as reflected in perplexity, ROUGE scores and human evaluations.
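
The two-stage idea in the abstract (a coarse extractive pass that selects salient text, followed by an abstractive model) can be illustrated with a small sketch of the first stage only. The abstract does not say which extractive method is used, so the TF-IDF ranking below is an assumption chosen for illustration, as are the function names `rank_paragraphs` and `build_extract`, the use of the article title as the query, and the `max_tokens` budget. The abstractive decoder-only model is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_paragraphs(query, paragraphs):
    """Return paragraphs sorted by descending TF-IDF score against the query.

    Assumption: each paragraph counts as one document when computing IDF.
    """
    tokenized = [tokenize(p) for p in paragraphs]
    n_docs = len(tokenized)
    # Document frequency: number of paragraphs that contain each word.
    doc_freq = Counter(word for tokens in tokenized for word in set(tokens))
    query_words = set(tokenize(query))

    def score(tokens):
        counts = Counter(tokens)
        return sum(
            counts[w] * math.log(n_docs / doc_freq[w])
            for w in query_words
            if w in counts
        )

    order = sorted(range(n_docs), key=lambda i: score(tokenized[i]), reverse=True)
    return [paragraphs[i] for i in order]


def build_extract(query, paragraphs, max_tokens):
    """Keep top-ranked paragraphs until the (hypothetical) token budget is hit."""
    extract, used = [], 0
    for p in rank_paragraphs(query, paragraphs):
        n = len(tokenize(p))
        if used + n > max_tokens:
            break
        extract.append(p)
        used += n
    return extract


if __name__ == "__main__":
    sources = [
        "The Transformer is a neural network architecture.",
        "Bananas are yellow fruit.",
        "Transformer models use attention; the transformer decoder generates text.",
    ]
    for paragraph in build_extract("Transformer", sources, max_tokens=10):
        print(paragraph)
```

In a full pipeline along the lines the abstract describes, the selected paragraphs would be passed to the abstractive model as its input; the budget exists because even a model that attends to very long sequences has a finite input length.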

Code & Models

Datasets

GEM/wiki_cat_sum (dataset, 659 downloads)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Wikis in Education and Collaboration

Methods: Transformer Decoder