Document Summarization with Text Segmentation
Lesly Miculicich, Benjamin Han

TL;DR
This paper shows that feeding the predictions of text segmentation models into an extractive summarizer improves summaries of scientific articles. Most of the gain comes from documents whose key information is not at the start, which indicates that segmentation reduces lead bias.
Contribution
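The paper builds two text segmentation models, looks for the best strategy for introducing their predictions into an extractive summarization model, and shows on a corpus of scientific articles that summarization benefits when the segmentation method is highly accurate.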
Findings
A highly accurate segmentation method improves extractive summarization quality.
Most gains occur in documents where the most relevant information is not at the beginning.
Segmentation reduces lead bias in summarization.
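The lead-bias finding can be illustrated with a toy sketch. It is not the paper's segmentation or summarization model: it assumes a position-only scorer that favors early sentences, and all function names (`position_scores`, `document_level_scores`, `segment_level_scores`, `top_k`) and the example sentences are hypothetical. The point is only that resetting position at each segment boundary spreads the selection across the document instead of concentrating it at the start.

```python
from typing import List


def position_scores(n: int) -> List[float]:
    """Toy lead-biased score: position i gets 1 / (1 + i), so earlier is higher."""
    return [1.0 / (1 + i) for i in range(n)]


def document_level_scores(segments: List[List[str]]) -> List[float]:
    """Score by position in the whole document (ignores segment boundaries)."""
    total = sum(len(seg) for seg in segments)
    return position_scores(total)


def segment_level_scores(segments: List[List[str]]) -> List[float]:
    """Score by position inside each segment (position resets at each boundary)."""
    scores: List[float] = []
    for seg in segments:
        scores.extend(position_scores(len(seg)))
    return scores


def top_k(sentences: List[str], scores: List[float], k: int) -> List[str]:
    """Pick the k highest-scoring sentences, ties broken by earlier position,
    and return them in document order."""
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    return [sentences[i] for i in sorted(ranked[:k])]


# Hypothetical document already split into segments (assumed, not from the paper).
segments = [
    ["Intro a", "Intro b", "Intro c"],
    ["Method a", "Method b"],
    ["Result a", "Result b"],
]
sentences = [s for seg in segments for s in seg]

doc_summary = top_k(sentences, document_level_scores(segments), k=3)
seg_summary = top_k(sentences, segment_level_scores(segments), k=3)

print(doc_summary)  # ['Intro a', 'Intro b', 'Intro c']
print(seg_summary)  # ['Intro a', 'Method a', 'Result a']
```

With document-level positions, the three selected sentences all come from the first segment. With segment-level positions, one sentence is taken from each segment, which is the kind of shift away from the document lead that the findings describe.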
Abstract
In this paper, we exploit the innate document segment structure to improve the extractive summarization task. We build two text segmentation models and find the best strategy for introducing their output predictions into an extractive summarization model. Experimental results on a corpus of scientific articles show that extractive summarization benefits from using a highly accurate segmentation method. In particular, most of the improvement is in documents where the most relevant information is not at the beginning; thus, we conclude that segmentation helps reduce the lead bias problem.
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
