NILC: Discovering New Intents with LLM-assisted Clustering
Hongtao Wang, Renchi Yang, Wenqing Lin

TL;DR
NILC introduces an iterative LLM-assisted clustering framework for new intent discovery, enhancing semantic understanding and clustering accuracy in dialogue systems through semantic centroid creation and utterance rewriting.
Contribution
The paper presents NILC, a novel LLM-assisted clustering method that iteratively refines intent clusters, addressing limitations of previous cascaded approaches to new intent discovery (NID).
Findings
Achieves significant performance improvements over baselines.
Effective in both unsupervised and semi-supervised settings.
Consistently outperforms baselines on six benchmark datasets.
Abstract
New intent discovery (NID) seeks to recognize both new and known intents from unlabeled user utterances, which finds prevalent use in practical dialogue systems. Existing works towards NID mainly adopt a cascaded architecture, wherein the first stage focuses on encoding the utterances into informative text embeddings beforehand, while the latter is to group similar embeddings into clusters (i.e., intents), typically by K-Means. However, such a cascaded pipeline fails to leverage the feedback from both steps for mutual refinement, and, meanwhile, the embedding-only clustering overlooks nuanced textual semantics, leading to suboptimal performance. To bridge this gap, this paper proposes NILC, a novel clustering framework specially catered for effective NID. Particularly, NILC follows an iterative workflow, in which clustering assignments are judiciously updated by carefully refining…
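
To make the cascaded baseline described in the abstract concrete, here is a minimal sketch of the two-stage pipeline: encode utterances first, then cluster the embeddings with K-Means. This illustrates the prior approach the paper criticizes, not NILC itself. The bag-of-words encoder is an assumption standing in for a learned text encoder, and the function names (`embed`, `kmeans`), the farthest-point initialization, and the sample utterances are all hypothetical.

```python
import math
from collections import Counter


def embed(utterances):
    """Stage 1 (stand-in): L2-normalised bag-of-words vectors.

    Assumption: a real NID system would use a learned text encoder here.
    """
    vocab = sorted({w for u in utterances for w in u.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for u in utterances:
        vec = [0.0] * len(vocab)
        for word, count in Counter(u.lower().split()).items():
            vec[index[word]] = float(count)
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(vectors, k, iters=20):
    """Stage 2: plain K-Means over the fixed embeddings.

    Uses a deterministic farthest-point initialisation (an illustrative
    choice) so the sketch gives the same result on every run.
    """
    centroids = [vectors[0]]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(sq_dist(v, c) for c in centroids))
        centroids.append(farthest)

    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assignment step: each utterance goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: sq_dist(v, centroids[j])) for v in vectors]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


utterances = [
    "check my account balance",
    "show my account balance",
    "book a flight to paris",
    "book a flight to rome",
]
# The embeddings are computed once and never revisited: clustering has no
# way to feed information back into the encoding stage.
labels = kmeans(embed(utterances), k=2)
print(labels)
```

In this sketch the two banking utterances end up in one cluster and the two flight utterances in the other. Because `embed` runs once before `kmeans` and only the vectors reach the clustering step, the utterance text itself plays no role in the assignments, which is the lack of mutual refinement the abstract points to.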
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Sentiment Analysis and Opinion Mining
