Topic Segmentation Using Generative Language Models
Pierre Mackenzie, Maya Shah, Patrick Frenett

TL;DR
This paper explores topic segmentation with generative large language models. It proposes an overlapping, recursive prompting strategy and supports the boundary similarity metric for evaluation. LLMs can outperform existing segmentation methods, but open issues remain before they can be relied upon.
Contribution
It introduces a novel overlapping and recursive prompting approach, based on sentence enumeration, for topic segmentation with LLMs, and evaluates LLM segmenters against existing methods using the boundary similarity metric.
Findings
LLMs can outperform traditional segmentation methods
The proposed prompting strategy improves segmentation quality
Challenges remain in reliability and consistency of LLM-based segmentation
Abstract
Topic segmentation using generative Large Language Models (LLMs) remains relatively unexplored. Previous methods use semantic similarity between sentences, but such models lack the long-range dependencies and vast knowledge found in LLMs. In this work, we propose an overlapping and recursive prompting strategy using sentence enumeration. We also support the adoption of the boundary similarity evaluation metric. Results show that LLMs can be more effective segmenters than existing methods, but issues remain to be solved before they can be relied upon for topic segmentation.
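To make "overlapping prompting with sentence enumeration" concrete, here is a minimal sketch of one way the input side could be prepared. It is an illustration under assumptions, not the authors' implementation: the function names, the window size, the overlap, and the instruction wording are all hypothetical, and the recursive step and the LLM call itself are left out because the abstract does not specify them.

```python
from typing import List, Tuple


def enumerate_sentences(sentences: List[str], start: int = 0) -> str:
    """Prefix each sentence with its global index, one sentence per line."""
    return "\n".join(f"[{start + i}] {s}" for i, s in enumerate(sentences))


def overlapping_windows(n: int, size: int, overlap: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) ranges that cover range(n).

    Consecutive windows share `overlap` sentences, so a topic boundary
    near a window edge is visible with context in the next window.
    """
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    if n == 0:
        return []
    windows = []
    start = 0
    while True:
        end = min(start + size, n)
        windows.append((start, end))
        if end == n:
            break
        start = end - overlap
    return windows


def build_prompts(sentences: List[str], size: int = 4, overlap: int = 1) -> List[str]:
    """Build one prompt per window; the instruction text is a placeholder."""
    instruction = "List the numbers of the sentences that begin a new topic."
    prompts = []
    for start, end in overlapping_windows(len(sentences), size, overlap):
        block = enumerate_sentences(sentences[start:end], start)
        prompts.append(instruction + "\n" + block)
    return prompts
```

Because the sentences carry global indices, the model can answer with numbers rather than quoted text, and answers from overlapping windows can be merged by index. How the merged boundaries are then refined recursively is a design choice this sketch does not cover.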
Taxonomy
Topics: Topic Modeling · Computational and Text Analysis Methods · Sentiment Analysis and Opinion Mining
