RoNID: New Intent Discovery with Generated-Reliable Labels and Cluster-friendly Representations
Shun Zhang, Chaoran Yan, Jian Yang, Changyu Ren, Jiaqi Bai, Tongliang Li, Zhoujun Li

TL;DR
RoNID introduces an EM-style framework for new intent discovery that enhances pseudo-label reliability and representation quality, significantly improving performance on multiple benchmarks.
Contribution
The paper presents a novel EM-style approach combining reliable pseudo-label generation and cluster-friendly representation learning for better intent discovery.
Findings
Outperforms previous state-of-the-art methods by 1 to 4 points on multiple benchmarks.
Effectively constructs high-quality pseudo-labels via optimal transport.
Enhances discriminative features with intra- and inter-cluster contrastive learning.
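To make the intra- and inter-cluster idea concrete, here is a minimal sketch in plain Python. It is not the paper's loss: the summary does not give the formulas, so the sketch assumes a supervised-contrastive-style term over pseudo-labels for the intra-cluster part and a prototype-separation term for the inter-cluster part. The function names, the `temperature` parameter, and the use of mean embeddings as prototypes are all hypothetical.

```python
import math


def _normalize(v):
    # Scale a vector to unit length (guarding against the zero vector).
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def intra_cluster_loss(embeddings, labels, temperature=0.1):
    """Assumed intra-cluster term: pull together samples sharing a pseudo-label."""
    z = [_normalize(e) for e in embeddings]
    total, count = 0.0, 0
    for i in range(len(z)):
        others = [j for j in range(len(z)) if j != i]
        positives = [j for j in others if labels[j] == labels[i]]
        if not positives:
            continue  # a sample alone in its cluster contributes nothing
        logits = {j: _dot(z[i], z[j]) / temperature for j in others}
        top = max(logits.values())
        # Numerically stable log-sum-exp over all other samples.
        log_denom = top + math.log(sum(math.exp(v - top) for v in logits.values()))
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        count += 1
    return total / max(count, 1)


def inter_cluster_loss(embeddings, labels, temperature=0.1):
    """Assumed inter-cluster term: push cluster prototypes (mean embeddings) apart."""
    clusters = {}
    for e, y in zip(embeddings, labels):
        clusters.setdefault(y, []).append(_normalize(e))
    protos = [
        _normalize([sum(col) / len(rows) for col in zip(*rows)])
        for rows in clusters.values()
    ]
    if len(protos) < 2:
        return 0.0
    total = 0.0
    for a in range(len(protos)):
        sims = [
            math.exp(_dot(protos[a], protos[b]) / temperature)
            for b in range(len(protos))
            if b != a
        ]
        total += math.log(sum(sims) / len(sims))
    return total / len(protos)


# Toy 2-D embeddings: two samples near the x-axis, two near the y-axis.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
good_labels = [0, 0, 1, 1]  # pseudo-labels that match the geometry
bad_labels = [0, 1, 0, 1]   # pseudo-labels that cut across it
```

On this toy input both terms are lower for `good_labels` than for `bad_labels`, which is the behaviour a cluster-friendly objective is meant to reward: compact clusters and well-separated prototypes.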
Abstract
New Intent Discovery (NID) aims to identify known intents and reasonably deduce novel intent groups in an open-world scenario. However, current methods suffer from inaccurate pseudo-labels and poor representation learning, creating a negative feedback loop that degrades overall model performance, as measured by accuracy and the Adjusted Rand Index. To address these challenges, we propose a Robust New Intent Discovery (RoNID) framework optimized by an EM-style method, which focuses on constructing reliable pseudo-labels and obtaining cluster-friendly discriminative representations. RoNID comprises two main modules: a reliable pseudo-label generation module and a cluster-friendly representation learning module. Specifically, the pseudo-label generation module assigns reliable synthetic labels by solving an optimal transport problem in the E-step, which effectively provides high-quality…
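The E-step described above can be illustrated with a small sketch of pseudo-labeling as optimal transport. The abstract does not state the exact formulation, so this sketch assumes the common entropy-regularized variant solved with Sinkhorn-Knopp iterations under a uniform cluster-size constraint. The function name `sinkhorn_pseudo_labels` and the parameters `epsilon` and `n_iters` are hypothetical.

```python
import math


def sinkhorn_pseudo_labels(probs, epsilon=0.05, n_iters=50):
    """Balance model predictions across clusters with Sinkhorn-Knopp scaling.

    probs: N rows of K predicted cluster probabilities.
    Assumption (not from the source): every cluster should receive an
    equal share of the samples.
    """
    n, k = len(probs), len(probs[0])
    # Kernel proportional to probs ** (1 / epsilon), built in log space;
    # each row is shifted by its maximum for numerical stability.
    q = []
    for row in probs:
        logs = [math.log(max(p, 1e-12)) for p in row]
        top = max(logs)
        q.append([math.exp((v - top) / epsilon) for v in logs])
    for _ in range(n_iters):
        # Scale columns so each cluster holds total mass 1 / K.
        col_sums = [sum(q[i][j] for i in range(n)) for j in range(k)]
        q = [
            [q[i][j] / (k * col_sums[j]) if col_sums[j] > 0 else 0.0 for j in range(k)]
            for i in range(n)
        ]
        # Scale rows so each sample holds total mass 1 / N.
        row_sums = [sum(r) for r in q]
        q = [[v / (n * s) for v in r] for r, s in zip(q, row_sums)]
    soft = [[v * n for v in r] for r in q]  # each row now sums to 1
    hard = [max(range(k), key=lambda j: r[j]) for r in soft]
    return soft, hard


# Every sample leans toward cluster 0, so a plain argmax would collapse
# all four samples into a single cluster.
probs = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4], [0.55, 0.45]]
soft, hard = sinkhorn_pseudo_labels(probs)
```

In this toy case the balancing constraint moves the two least confident samples to the second cluster, whereas taking the argmax of `probs` directly would assign all four to the first. The soft assignments are only approximately balanced after a fixed number of iterations.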
Taxonomy
Topics: Text and Document Classification Technologies · Machine Learning and Data Classification · Web Data Mining and Analysis
Methods: Contrastive Learning
