Learning to Summarize Long Texts with Memory Compression and Transfer
Jaehong Park, Jonathan Pilault, Christopher Pal

TL;DR
This paper presents Mem2Mem, a memory compression and transfer mechanism for hierarchical neural networks that improves abstractive document summarization by implicitly extracting salient information without labeled data, achieving competitive results with fewer parameters.
Contribution
Introduces Mem2Mem, a novel memory-to-memory mechanism that enhances hierarchical neural summarization models through implicit extraction and efficient memory compression.
Findings
Achieves competitive summarization performance with fewer parameters.
Enables implicit extraction without reliance on labeled data.
Improves focus on salient information during decoding.
Abstract
We introduce Mem2Mem, a memory-to-memory mechanism for hierarchical recurrent neural network-based encoder-decoder architectures, and we explore its use for abstractive document summarization. Mem2Mem transfers "memories" via readable/writable external memory modules that augment both the encoder and decoder. Our memory regularization compresses an encoded input article into a more compact set of sentence representations. Most importantly, the memory compression step performs implicit extraction without labels, sidestepping issues with suboptimal ground-truth data and exposure bias of hybrid extractive-abstractive summarization techniques. By allowing the decoder to read/write over the encoded input memory, the model learns to read salient information about the input article while keeping track of what has been generated. Our Mem2Mem approach yields results that are competitive with…
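The abstract gives no equations, so the following is only a rough illustration of what "compressing an encoded article into a more compact set of sentence representations" could look like. It is not the authors' mechanism. It assumes a simple soft-attention scheme in which K hypothetical slot queries each form a weighted average over N sentence vectors (K < N). The names `compress_memory` and `slot_queries`, and the random vectors standing in for encoder outputs, are all assumptions made for this sketch.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def compress_memory(sentence_vecs, slot_queries):
    """Compress N sentence vectors into K slots via soft attention.

    Assumption for illustration only: each slot query scores every
    sentence, and the slot is the attention-weighted average of the
    sentence vectors. The paper's actual compression may differ.
    """
    dim = len(sentence_vecs[0])
    compressed, weights = [], []
    for query in slot_queries:
        attn = softmax([dot(query, s) for s in sentence_vecs])
        slot = [
            sum(w * s[d] for w, s in zip(attn, sentence_vecs))
            for d in range(dim)
        ]
        compressed.append(slot)
        weights.append(attn)
    return compressed, weights


# Toy stand-ins for encoder outputs: 6 sentences, 4-dim vectors, 2 slots.
random.seed(0)
dim, n_sentences, n_slots = 4, 6, 2
sentence_vecs = [[random.gauss(0, 1) for _ in range(dim)]
                 for _ in range(n_sentences)]
slot_queries = [[random.gauss(0, 1) for _ in range(dim)]
                for _ in range(n_slots)]

compressed, weights = compress_memory(sentence_vecs, slot_queries)
print(len(sentence_vecs), "sentences ->", len(compressed), "memory slots")
```

In this toy setup the attention weights play the role of an unsupervised, soft "extraction": sentences with larger weights contribute more to a slot, and no sentence-level labels are involved. In a trained model the queries would be learned rather than sampled at random.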
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
