Text Clustering as Classification with LLMs

Chen Huang; Guoxiu He

arXiv:2410.00927 · cs.CL · October 8, 2025

TL;DR

This paper introduces an LLM-based framework that reformulates text clustering as a classification task. By dropping fine-tuned embedding models and intricate clustering algorithms, it reduces computational cost while maintaining or improving clustering performance across various datasets.

Contribution

The paper presents an approach that uses the in-context learning ability of LLMs to cluster text without fine-tuning embedding models or relying on complex similarity metrics.

Findings

01. Achieves comparable or better performance than state-of-the-art methods.

02. Reduces computational complexity and resource requirements.

03. Demonstrates effectiveness across diverse datasets.

Abstract

Text clustering serves as a fundamental technique for organizing and interpreting unstructured textual data, particularly in contexts where manual annotation is prohibitively costly. With the rapid advancement of Large Language Models (LLMs) and their demonstrated effectiveness across a broad spectrum of NLP tasks, an emerging body of research has begun to explore their potential in the domain of text clustering. However, existing LLM-based approaches still rely on fine-tuned embedding models and sophisticated similarity metrics, rendering them computationally intensive and necessitating domain-specific adaptation. To address these limitations, we propose a novel framework that reframes text clustering as a classification task by harnessing the in-context learning capabilities of LLMs. Our framework eliminates the need for fine-tuning embedding models or intricate clustering algorithms.…
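
To make the reframing concrete, the following is a minimal Python sketch of clustering by classification, not the authors' implementation. It assumes a set of candidate labels is already available (this summary does not describe how the paper obtains them) and that `llm` is any callable mapping a prompt string to a reply string. The names `build_prompt`, `cluster_by_classification`, and `toy_llm` are hypothetical, and `toy_llm` is a keyword rule standing in for a real model so the example runs offline.

```python
from typing import Callable, Dict, List


def build_prompt(text: str, labels: List[str]) -> str:
    """Format an in-context classification prompt listing the candidate labels."""
    options = "\n".join(f"- {label}" for label in labels)
    return (
        "Assign the text to exactly one of the labels below.\n"
        f"Labels:\n{options}\n"
        f"Text: {text}\n"
        "Answer with the label only."
    )


def cluster_by_classification(
    texts: List[str],
    labels: List[str],
    llm: Callable[[str], str],
) -> Dict[str, List[str]]:
    """Group texts by the label the LLM picks; no embeddings or distances involved."""
    clusters: Dict[str, List[str]] = {label: [] for label in labels}
    for text in texts:
        answer = llm(build_prompt(text, labels)).strip()
        # Replies that are not one of the known labels go to a catch-all bucket.
        key = answer if answer in clusters else "unassigned"
        clusters.setdefault(key, []).append(text)
    return clusters


def toy_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call (assumption: keyword rule only)."""
    text = prompt.split("Text: ", 1)[1].split("\n", 1)[0].lower()
    return "banking" if "card" in text or "account" in text else "travel"


if __name__ == "__main__":
    docs = ["My card was declined", "Book a flight to Paris", "Close my account"]
    print(cluster_by_classification(docs, ["banking", "travel"], toy_llm))
```

Each text costs one model call and needs no pairwise similarity computation, which is the kind of saving the abstract points to. In practice `toy_llm` would be replaced by a call to whichever LLM is in use.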

Code & Models

Repositories

ecnu-text-computing/text-clustering-via-llm (official repository)

Taxonomy

Topics: Natural Language Processing Techniques