Efficient Adaptation of Pretrained Transformers for Abstractive Summarization

Andrew Hoang; Antoine Bosselut; Asli Celikyilmaz; Yejin Choi

arXiv:1906.00138 · cs.CL · June 4, 2019 · 39 citations

TL;DR

This paper introduces two methods for efficiently adapting pretrained transformer language models to abstractive summarization: source embeddings and domain-adaptive training. The adapted models set a new state of the art on two of the three datasets tested and produce more focused summaries, with larger gains on more abstractive datasets.

Contribution

The paper proposes source embeddings and domain-adaptive training techniques for transformer adaptation in summarization, demonstrating their effectiveness across multiple datasets.

Findings

01 Achieved new state-of-the-art results on two of the three datasets tested.

02 Produced more focused summaries with less superfluous content.

03 Showed larger improvements on more abstractive datasets.

Abstract

Large-scale learning of transformer language models has yielded improvements on a variety of natural language understanding tasks. Whether they can be effectively adapted for summarization, however, has been less explored, as the learned representations are less seamlessly integrated into existing neural text production architectures. In this work, we propose two solutions for efficiently adapting pretrained transformer language models as text summarizers: source embeddings and domain-adaptive training. We test these solutions on three abstractive summarization datasets, achieving new state-of-the-art performance on two of them. Finally, we show that these improvements are achieved by producing more focused summaries with less superfluous content, and that performance improvements are more pronounced on more abstractive datasets.
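The abstract names source embeddings without describing them. The following is a minimal sketch under a stated assumption: that a source embedding behaves like a segment embedding, i.e. one learned vector per input segment (article or summary) that is added to the token and position embeddings. The table sizes, the random initialization, and the names `make_table` and `embed` are hypothetical and chosen only for illustration; they are not taken from the paper's code.

```python
import random

def make_table(rows, dim, rng):
    """Build a small random embedding table (a stand-in for learned weights)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(rows)]

def embed(token_ids, segment_ids, tok_table, pos_table, src_table):
    """Sum token, position, and source (segment) embeddings at each position."""
    assert len(token_ids) == len(segment_ids)
    out = []
    for pos, (tok, seg) in enumerate(zip(token_ids, segment_ids)):
        vec = [t + p + s
               for t, p, s in zip(tok_table[tok], pos_table[pos], src_table[seg])]
        out.append(vec)
    return out

rng = random.Random(0)
DIM = 4
tok_table = make_table(10, DIM, rng)  # hypothetical vocabulary of 10 tokens
pos_table = make_table(8, DIM, rng)   # hypothetical maximum length of 8
src_table = make_table(2, DIM, rng)   # assumption: 0 = article, 1 = summary

article = [3, 5, 7]
summary = [5, 2]
token_ids = article + summary
segment_ids = [0] * len(article) + [1] * len(summary)

vectors = embed(token_ids, segment_ids, tok_table, pos_table, src_table)
```

In this sketch, token 5 appears once in the article and once in the summary. Its two input vectors differ by the position and source terms only, which is what would let a single transformer tell source text from generated text when both share one input sequence.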

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques

Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Softmax