Cluster-Aware Neural Collapse Prompt Tuning for Long-Tailed Generalization of Vision-Language Models

Boyang Guo; Liang Li; Lin Peng; Yuhan Gao; Xichun Sheng; Chenggang Yan

arXiv:2605.11939 · cs.CV · May 13, 2026

TL;DR

This paper introduces cluster-aware neural collapse prompt tuning (CPT), a method that improves tail-class discriminability in vision-language models for long-tailed datasets without losing overall generalization.

Contribution

The paper proposes a cluster-aware prompt tuning method that enhances tail-class discriminability by combining a cluster-invariant space with neural-collapse-driven discriminability optimization.

Findings

01. CPT outperforms SOTA methods on 11 datasets.

02. CPT shows stronger performance on long-tail classes.

03. CPT generalizes well to unseen classes.

Abstract

Prompt learning has emerged as an efficient alternative to fine-tuning pre-trained vision-language models (VLMs). Despite its promise, current methods still struggle to maintain tail-class discriminability when adapting to class-imbalanced datasets. In this work, we propose cluster-aware neural collapse prompt tuning (CPT), which enhances the discriminability of tail classes in prompt-tuned VLMs without sacrificing their overall generalization. First, we design a cluster-invariant space by mining semantic assignments from the pre-trained VLM and mapping them to prompt-tuned features. This computes cluster-level boundaries and restricts the constraints to local neighborhoods, which reduces interference with the global semantic structure of the pre-trained VLM. Second, we introduce neural-collapse-driven discriminability optimization with three losses: textual Equiangular Tight Frame…
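
For readers unfamiliar with the Equiangular Tight Frame mentioned in the abstract, the following is a minimal sketch of the standard simplex ETF from the neural collapse literature: K unit-norm vectors whose pairwise cosine similarity is the same for every pair, namely -1/(K-1). It illustrates the geometry only. It is not the paper's loss, which the truncated abstract does not specify, and the function names `simplex_etf` and `pairwise_cosines` are hypothetical, chosen for this example.

```python
import math


def simplex_etf(num_classes):
    """Return num_classes unit vectors (as lists) forming a simplex ETF.

    Vector k is sqrt(K / (K - 1)) * (e_k - 1/K): the centered, rescaled
    one-hot vector. This is the textbook construction and is an assumption
    here, not necessarily the parameterization used in the paper.
    """
    if num_classes < 2:
        raise ValueError("a simplex ETF needs at least two classes")
    k = num_classes
    scale = math.sqrt(k / (k - 1))
    return [
        [scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
        for i in range(k)
    ]


def dot(u, v):
    """Plain inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def pairwise_cosines(vectors):
    """Cosine similarity for every unordered pair of distinct vectors."""
    cosines = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            u, v = vectors[i], vectors[j]
            cosines.append(dot(u, v) / math.sqrt(dot(u, u) * dot(v, v)))
    return cosines


# Example with K = 5 class prototypes.
prototypes = simplex_etf(5)
norms = [math.sqrt(dot(p, p)) for p in prototypes]
cosines = pairwise_cosines(prototypes)
```

With K = 5, every prototype has unit norm and every pair has cosine similarity -1/4 up to floating-point error. This is the largest equal separation K vectors can achieve, which is why an ETF is a natural target when head and tail classes should be equally well separated.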
