Long Document Summarization with Top-down and Bottom-up Inference
Bo Pang, Erik Nijkamp, Wojciech Kryściński, Silvio Savarese, Yingbo Zhou, Caiming Xiong

TL;DR
This paper introduces a hierarchical inference framework for long document summarization that combines top-down and bottom-up processes, improving efficiency and performance across diverse datasets.
Contribution
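The paper proposes a hierarchical inference framework in which token representations are updated both bottom-up and top-down. It targets two limits of standard transformer encoders for long document summarization: purely bottom-up inference and the quadratic cost of self-attention in sequence length.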
Findings
Achieves state-of-the-art results on long document benchmarks.
Demonstrates high efficiency with significantly fewer parameters.
Effectively summarizes entire books with minimal resources.
Abstract
Text summarization aims to condense long documents and retain key information. Critical to the success of a summarization model is the faithful inference of latent representations of words or tokens in the source documents. Most recent models infer the latent representations with a transformer encoder, which is purely bottom-up. Also, self-attention-based inference models face the challenge of quadratic complexity with respect to sequence length. We propose a principled inference framework to improve summarization models on these two aspects. Our framework assumes a hierarchical latent structure of a document where the top-level captures the long range dependency at a coarser time scale and the bottom token level preserves the details. Critically, this hierarchical structure enables token representations to be updated in both a bottom-up and top-down manner. In the bottom-up pass, token…
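The two-level idea in the abstract can be illustrated with a small sketch. Everything specific here is an assumption made for illustration and is not taken from the text above: the bottom-up pass uses windowed self-attention, the top level is built by mean pooling fixed-length segments, attention has no learned projections, and the names `local_self_attention`, `mean_pool`, `hierarchical_inference`, and `attention_pairs` are hypothetical.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys_values):
    """Scaled dot-product attention without learned projections (keys == values)."""
    dim = len(keys_values[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in keys_values]
        weights = softmax(scores)
        out.append([sum(w * kv[d] for w, kv in zip(weights, keys_values))
                    for d in range(dim)])
    return out

def local_self_attention(tokens, window):
    """Assumed bottom-up pass: each token attends to neighbours within `window`."""
    out = []
    for i in range(len(tokens)):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        out.extend(attend([tokens[i]], tokens[lo:hi]))
    return out

def mean_pool(tokens, segment_len):
    """Assumed top-level construction: average each fixed-length segment."""
    segments = []
    for start in range(0, len(tokens), segment_len):
        chunk = tokens[start:start + segment_len]
        segments.append([sum(col) / len(chunk) for col in zip(*chunk)])
    return segments

def hierarchical_inference(tokens, window=2, segment_len=4):
    bottom = local_self_attention(tokens, window)   # bottom-up, local detail
    top = mean_pool(bottom, segment_len)            # coarser time scale
    top = attend(top, top)                          # full attention over few segments
    correction = attend(bottom, top)                # top-down: tokens read the top level
    updated = [[b + c for b, c in zip(bt, ct)]
               for bt, ct in zip(bottom, correction)]
    return updated, top

def attention_pairs(n, window, segment_len):
    """Query-key pairs scored by full attention vs. this hierarchical sketch."""
    m = math.ceil(n / segment_len)
    local = sum(min(n, i + window + 1) - max(0, i - window) for i in range(n))
    return n * n, local + m * m + n * m

# Toy input: 16 tokens with 4-dimensional representations.
tokens = [[math.sin(i + d) for d in range(4)] for i in range(16)]
updated, top = hierarchical_inference(tokens)
print(len(updated), len(top))
print(attention_pairs(16, window=2, segment_len=4))
```

On this 16-token toy input the sketch scores 154 query-key pairs against 256 for full self-attention. The saving comes from keeping token-level attention local and running full attention only over the much shorter top-level sequence.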
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
