Learning to Summarize Long Texts with Memory Compression and Transfer

Jaehong Park; Jonathan Pilault; Christopher Pal

arXiv:2010.11322·cs.CL·October 23, 2020

TL;DR

This paper presents Mem2Mem, a memory compression and transfer mechanism for hierarchical neural networks that improves abstractive document summarization by implicitly extracting salient information without labeled data, achieving competitive results with fewer parameters.

Contribution

Introduces Mem2Mem, a novel memory-to-memory mechanism that enhances hierarchical neural summarization models through implicit extraction and efficient memory compression.

Findings

01 Achieves competitive summarization performance with fewer parameters.

02 Enables implicit extraction without reliance on labeled data.

03 Improves focus on salient information during decoding.

Abstract

We introduce Mem2Mem, a memory-to-memory mechanism for hierarchical recurrent neural network-based encoder-decoder architectures, and we explore its use for abstractive document summarization. Mem2Mem transfers "memories" via readable/writable external memory modules that augment both the encoder and the decoder. Our memory regularization compresses an encoded input article into a more compact set of sentence representations. Most importantly, the memory compression step performs implicit extraction without labels, sidestepping issues with suboptimal ground-truth data and the exposure bias of hybrid extractive-abstractive summarization techniques. By allowing the decoder to read and write over the encoded input memory, the model learns to read salient information about the input article while keeping track of what has been generated. Our Mem2Mem approach yields results that are competitive with…
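The abstract gives no equations, so the following is only an illustrative sketch of the general idea it describes: a set of sentence representations is compressed into fewer memory slots, and a decoder state then reads from that compact memory with soft attention. Everything here is an assumption made for illustration, including the names `attention_read`, `compress_memory`, and `slot_queries`, the use of scaled dot-product attention, and the random vectors standing in for learned parameters. It is not the authors' implementation, and it does not model the paper's memory regularization or the decoder's write operation.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_read(query, memory):
    """Soft read: return (weights, weighted sum of memory rows)."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, row) / scale for row in memory])
    dim = len(memory[0])
    read = [sum(w * row[d] for w, row in zip(weights, memory)) for d in range(dim)]
    return weights, read


def compress_memory(sentence_reprs, slot_queries):
    """Compress N sentence vectors into K slots, one per slot query.

    Assumption: each slot is an attention-weighted mix of the sentences.
    """
    return [attention_read(q, sentence_reprs)[1] for q in slot_queries]


random.seed(0)
dim, n_sentences, n_slots = 4, 6, 2

# Stand-ins for encoder outputs (one vector per input sentence).
sentences = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_sentences)]
# Hypothetical slot queries; in a trained model these would be learned.
slot_queries = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n_slots)]

compressed = compress_memory(sentences, slot_queries)

# A decoder state reads from the compact memory instead of all sentences.
decoder_state = [random.gauss(0, 1) for _ in range(dim)]
weights, context = attention_read(decoder_state, compressed)
```

In this sketch the decoder attends over 2 slots rather than 6 sentence vectors, which is the sense in which the memory is "more compact"; how the actual model selects or regularizes those slots is left to the paper.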

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications