Cluster-Aware Neural Collapse Prompt Tuning for Long-Tailed Generalization of Vision-Language Models
Boyang Guo, Liang Li, Lin Peng, Yuhan Gao, Xichun Sheng, Chenggang Yan

TL;DR
This paper introduces cluster-aware neural collapse prompt tuning (CPT), a method that improves tail-class discriminability in vision-language models for long-tailed datasets without losing overall generalization.
Contribution
The paper proposes a cluster-aware prompt tuning method that enhances tail-class discriminability through two components: a cluster-invariant space and neural-collapse-driven discriminability optimization.
Findings
CPT outperforms state-of-the-art methods on 11 datasets.
CPT shows stronger performance on the tail classes of long-tailed datasets.
CPT generalizes well to unseen classes.
Abstract
Prompt learning has emerged as an efficient alternative to fine-tuning pre-trained vision-language models (VLMs). Despite its promise, current methods still struggle to maintain tail-class discriminability when adapting to class-imbalanced datasets. In this work, we propose cluster-aware neural collapse prompt tuning (CPT), which enhances the discriminability of tail classes in prompt-tuned VLMs without sacrificing their overall generalization. First, we design a cluster-invariant space by mining semantic assignments from the pre-trained VLM and mapping them to prompt-tuned features. This computes cluster-level boundaries and restricts the constraints to local neighborhoods, which reduces interference with the global semantic structure of the pre-trained VLM. Second, we introduce neural-collapse-driven discriminability optimization with three losses: textual Equiangular Tight Frame…
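The abstract is cut off before it defines the textual Equiangular Tight Frame (ETF) loss, so the exact formulation is not available here. As background only, the following is a minimal sketch of the standard simplex ETF geometry that neural-collapse methods typically target: K unit vectors whose pairwise cosine similarity is -1/(K-1). The function names `simplex_etf` and `etf_deviation` are hypothetical, and the deviation score is an illustrative measure, not the authors' loss.

```python
import numpy as np


def simplex_etf(num_classes: int, dim: int, seed: int = 0) -> np.ndarray:
    """Return a (num_classes, dim) matrix whose rows form a simplex ETF.

    Assumption: dim >= num_classes >= 2, which this construction requires.
    """
    if num_classes < 2 or dim < num_classes:
        raise ValueError("requires dim >= num_classes >= 2")
    rng = np.random.default_rng(seed)
    # Reduced QR gives a (dim, num_classes) matrix with orthonormal columns.
    u, _ = np.linalg.qr(rng.standard_normal((dim, num_classes)))
    k = num_classes
    # Centering removes the mean direction; the scale restores unit norm.
    centering = np.eye(k) - np.ones((k, k)) / k
    m = np.sqrt(k / (k - 1)) * (u @ centering)
    return m.T


def etf_deviation(features: np.ndarray) -> float:
    """Mean squared gap between pairwise cosines and the ETF target -1/(K-1).

    Illustrative score only: 0 means the class features already sit on a
    simplex ETF; larger values mean the class directions are less evenly spread.
    """
    k = features.shape[0]
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    cos = normed @ normed.T
    off_diag = cos[~np.eye(k, dtype=bool)]
    return float(np.mean((off_diag + 1.0 / (k - 1)) ** 2))


if __name__ == "__main__":
    etf = simplex_etf(num_classes=5, dim=16)
    print("ETF deviation:", etf_deviation(etf))
    # Stand-in for class text embeddings; real ones would come from the VLM.
    random_text_features = np.random.default_rng(1).standard_normal((5, 16))
    print("Random-feature deviation:", etf_deviation(random_text_features))
```

In this sketch, a score like `etf_deviation` applied to class text embeddings would be driven toward zero during prompt tuning, so that head and tail classes end up with equally separated directions. How CPT restricts such a constraint to clusters is not recoverable from the truncated abstract.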
