Document Summarization with Text Segmentation

Lesly Miculicich; Benjamin Han

arXiv:2301.08817 · cs.CL · January 24, 2023 · 1 citation

TL;DR

This paper demonstrates that leveraging text segmentation models enhances extractive summarization, especially for documents where key information is not at the start, by reducing lead bias.

Contribution

It builds two text segmentation models, evaluates strategies for feeding their predictions into an extractive summarization model, and shows benefits on a corpus of scientific articles.

Findings

01. Highly accurate segmentation improves extractive summarization quality.

02. Most gains occur in documents where the most relevant information is not at the beginning.

03. Segmentation helps reduce lead bias in summarization.

Abstract

In this paper, we exploit the innate document segment structure to improve the extractive summarization task. We build two text segmentation models and find the best strategy for introducing their output predictions into an extractive summarization model. Experimental results on a corpus of scientific articles show that extractive summarization benefits from using a highly accurate segmentation method. In particular, most of the improvement is in documents where the most relevant information is not at the beginning; thus, we conclude that segmentation helps in reducing the lead bias problem.
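
To make the lead bias idea concrete, here is a toy sketch, not the authors' model. It contrasts a lead baseline (take the first k sentences) with a selection step that uses segment boundaries to spread picks across the document. The function names, the per-sentence `scores`, and the `boundaries` flags are all hypothetical inputs assumed for illustration; the abstract does not specify how segmentation predictions are consumed.

```python
from typing import List


def lead_k(sentences: List[str], k: int) -> List[str]:
    """Lead baseline: return the first k sentences of the document."""
    return sentences[:k]


def segment_aware_select(
    sentences: List[str],
    scores: List[float],
    boundaries: List[int],
    k: int,
) -> List[str]:
    """Toy selection: best sentence per segment, then the k best overall.

    boundaries[i] == 1 means sentence i starts a new segment (assumed
    format; a real segmenter's output may look different).
    """
    # Assign a segment id to every sentence.
    segment_ids = []
    current = -1
    for i, flag in enumerate(boundaries):
        if i == 0 or flag == 1:
            current += 1
        segment_ids.append(current)

    # Keep the index of the highest-scoring sentence in each segment.
    best = {}
    for i, seg in enumerate(segment_ids):
        if seg not in best or scores[i] > scores[best[seg]]:
            best[seg] = i

    # Take the k best segment representatives, restored to document order.
    chosen = sorted(best.values(), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    sents = ["s0", "s1", "s2", "s3", "s4", "s5"]
    scores = [0.2, 0.1, 0.3, 0.9, 0.4, 0.8]  # hypothetical relevance scores
    bounds = [1, 0, 1, 0, 1, 0]              # three segments of two sentences
    print(lead_k(sents, 2))
    print(segment_aware_select(sents, scores, bounds, 2))
```

In this made-up input the most relevant sentences sit in the second and third segments, so the lead baseline misses them while the segment-aware step does not. That is the kind of document where the abstract reports most of the improvement.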

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques