Topic Segmentation Using Generative Language Models

Pierre Mackenzie; Maya Shah; Patrick Frenett

arXiv:2601.03276·cs.CL·January 8, 2026

TL;DR

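This paper explores generative large language models (LLMs) for topic segmentation. It proposes an overlapping, recursive prompting strategy based on sentence enumeration and supports adopting boundary similarity as the evaluation metric. Results show that LLMs can outperform existing methods, though issues remain before they can be relied upon.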

Contribution

The paper introduces an overlapping and recursive prompting approach for topic segmentation with LLMs and evaluates its effectiveness using the boundary similarity metric.

Findings

01 LLMs can outperform traditional segmentation methods

02 The proposed prompting strategy improves segmentation quality

03 Challenges remain in reliability and consistency of LLM-based segmentation

Abstract

Topic segmentation using generative Large Language Models (LLMs) remains relatively unexplored. Previous methods use semantic similarity between sentences, but such models lack the long-range dependencies and vast knowledge found in LLMs. In this work, we propose an overlapping and recursive prompting strategy using sentence enumeration. We also support the adoption of the boundary similarity evaluation metric. Results show that LLMs can be more effective segmenters than existing methods, but issues remain to be solved before they can be relied upon for topic segmentation.
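
The abstract names sentence enumeration and overlapping prompts but does not spell out the procedure, so the following is only a minimal sketch of what that input preparation might look like. The function names, the `[i]` numbering format, and the `size` and `overlap` parameters are all assumptions made for illustration, not details taken from the paper. The recursive step and the LLM call itself are deliberately left out.

```python
from typing import List, Tuple


def enumerate_sentences(sentences: List[str], start: int = 0) -> str:
    """Prefix each sentence with its global index so that a model
    could refer to a boundary by sentence number (assumed format)."""
    return "\n".join(f"[{start + i}] {s}" for i, s in enumerate(sentences))


def overlapping_windows(n: int, size: int, overlap: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) ranges that cover indices 0..n-1,
    with consecutive ranges sharing `overlap` sentences."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    if n <= 0:
        return []
    step = size - overlap
    windows = []
    start = 0
    while True:
        end = min(start + size, n)
        windows.append((start, end))
        if end >= n:
            break
        start += step
    return windows


def build_prompts(sentences: List[str], size: int = 4, overlap: int = 1) -> List[str]:
    """Build one enumerated text chunk per window (hypothetical helper).
    Global indices are kept so boundaries from different chunks line up."""
    return [
        enumerate_sentences(sentences[a:b], start=a)
        for a, b in overlapping_windows(len(sentences), size, overlap)
    ]


if __name__ == "__main__":
    doc = ["s0", "s1", "s2", "s3", "s4"]
    for chunk in build_prompts(doc, size=3, overlap=1):
        print(chunk)
        print("---")
```

Keeping global sentence indices across chunks is one plausible design choice: boundaries predicted in overlapping regions can then be compared or merged directly by index.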

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods · Sentiment Analysis and Opinion Mining