IOCC: Aligning Semantic and Cluster Centers for Few-shot Short Text Clustering
Jixuan Yin, Zhihao Yao, Wenshuai Huo, Xinmiao Yu, Xiaocheng Feng, Bo Li

TL;DR
This paper introduces IOCC, a novel few-shot contrastive learning approach that aligns cluster centers with semantic centers in short text clustering, significantly improving performance on benchmark datasets.
Contribution
The paper proposes IOCC, combining interaction-enhanced optimal transport and center-aware contrastive learning to better capture semantic structures in short text clustering.
Findings
Outperforms previous methods on eight benchmark datasets.
Achieves up to 7.34% improvement on the Biomedical dataset.
Enhances clustering stability and efficiency.
Abstract
In clustering tasks, it is essential to structure the feature space into clear, well-separated distributions. However, because short text representations have limited expressiveness, conventional methods struggle to identify cluster centers that truly capture each category's underlying semantics, causing the representations to be optimized in suboptimal directions. To address this issue, we propose IOCC, a novel few-shot contrastive learning method that achieves alignment between the cluster centers and the semantic centers. IOCC consists of two key modules: Interaction-enhanced Optimal Transport (IEOT) and Center-aware Contrastive Learning (CACL). Specifically, IEOT incorporates semantic interactions between individual samples into the conventional optimal transport problem and generates pseudo-labels. Based on these pseudo-labels, we aggregate high-confidence samples to construct…
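The abstract describes IEOT as an extension of the conventional optimal transport problem for pseudo-labeling. As a point of reference, the sketch below shows only that conventional baseline: entropic optimal transport solved with Sinkhorn-Knopp iterations, which turns sample-to-cluster scores into balanced pseudo-labels. It is not the paper's IEOT. The sample-interaction term is not specified in this summary and is omitted. The function name `sinkhorn_pseudo_labels`, the parameters `epsilon` and `n_iters`, and the uniform cluster-size prior are all assumptions made for illustration.

```python
import numpy as np


def sinkhorn_pseudo_labels(logits, epsilon=0.1, n_iters=50):
    """Balanced pseudo-labels from conventional entropic optimal transport.

    logits: array of shape (n_samples, n_clusters), e.g. similarities
            between text embeddings and cluster centers (hypothetical input).
    Returns (labels, probs), where probs has rows summing to 1.
    """
    n, k = logits.shape
    # Gibbs kernel; subtracting the max keeps the exponentials bounded.
    kernel = np.exp((logits - logits.max()) / epsilon)
    row_mass = np.full(n, 1.0 / n)  # every sample carries equal mass
    col_mass = np.full(k, 1.0 / k)  # ASSUMPTION: clusters are equally sized
    u = np.ones(n)
    v = np.ones(k)
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the two marginals.
        u = row_mass / (kernel @ v)
        v = col_mass / (kernel.T @ u)
    plan = u[:, None] * kernel * v[None, :]
    # Normalize each row so it reads as a per-sample label distribution.
    probs = plan / plan.sum(axis=1, keepdims=True)
    return probs.argmax(axis=1), probs


if __name__ == "__main__":
    scores = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])
    labels, probs = sinkhorn_pseudo_labels(scores)
    print(labels)
```

The uniform column marginal is what prevents all samples from collapsing into one cluster. A method that expects imbalanced clusters would need a different prior in place of `col_mass`.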
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
