Advances in domain independent linear text segmentation

Freddy Y. Y. Choi (University of Manchester)

arXiv:cs/0003083 · cs.CL · May 23, 2007 · 576 cites

TL;DR

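This paper introduces a linear text segmentation method that is twice as accurate and over seven times as fast as the previous state of the art (Reynar, 1998). It replaces raw inter-sentence similarity with its rank in the local context and finds segment boundaries by divisive clustering.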

Contribution

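The paper presents a domain-independent approach to linear text segmentation that ranks inter-sentence similarity within its local context and locates boundaries by divisive clustering, outperforming the previous state of the art (Reynar, 1998) in both accuracy and speed.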

Findings

01 Twice as accurate as the previous state of the art (Reynar, 1998)

02 Over seven times as fast

03 Boundaries located by divisive clustering

Abstract

This paper describes a method for linear text segmentation which is twice as accurate and over seven times as fast as the state-of-the-art (Reynar, 1998). Inter-sentence similarity is replaced by rank in the local context. Boundary locations are discovered by divisive clustering.
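
The two steps named in the abstract, ranking similarity in the local context and divisive clustering, can be illustrated with a minimal sketch. Everything beyond those two ideas is an assumption made for illustration and is not stated in the text above: bag-of-words cosine similarity, a square ranking window of hypothetical `radius`, a split criterion that maximises the mean rank inside diagonal blocks, and a segment count supplied by the caller. All function names are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters (assumed measure)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_matrix(sentences):
    """Pairwise sentence similarity using whitespace tokens."""
    bags = [Counter(s.lower().split()) for s in sentences]
    return [[cosine(p, q) for q in bags] for p in bags]


def rank_matrix(sim, radius=1):
    """Replace each value by the fraction of its window neighbours that are lower."""
    n = len(sim)
    ranks = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            lower = total = 0
            for x in range(max(0, i - radius), min(n, i + radius + 1)):
                for y in range(max(0, j - radius), min(n, j + radius + 1)):
                    if (x, y) != (i, j):
                        total += 1
                        lower += sim[x][y] < sim[i][j]
            ranks[i][j] = lower / total if total else 0.0
    return ranks


def density(m, bounds):
    """Mean value inside the square diagonal blocks delimited by bounds."""
    blocks = list(zip(bounds, bounds[1:]))
    inside = sum(m[i][j] for s, e in blocks
                 for i in range(s, e) for j in range(s, e))
    area = sum((e - s) ** 2 for s, e in blocks)
    return inside / area


def segment(m, num_segments):
    """Divisive clustering: start with one segment, greedily add the best split."""
    n = len(m)
    bounds = [0, n]
    while len(bounds) - 1 < num_segments:
        candidates = [b for b in range(1, n) if b not in bounds]
        if not candidates:
            break
        best = max(candidates, key=lambda b: density(m, sorted(bounds + [b])))
        bounds = sorted(bounds + [best])
    return bounds[1:-1]  # interior boundaries only
```

A caller would chain the steps as `segment(rank_matrix(similarity_matrix(sentences)), k)`, where `k` is the desired number of segments. Each returned index `b` marks a boundary before sentence `b`. Ranking makes the split criterion depend on relative rather than absolute similarity, which is the motivation the abstract gives for replacing the raw values.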

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems