Intrinsic Gradient Suppression for Label-Noise Prompt Tuning in Vision-Language Models
Jiayu Li, Jiaxin Qi, Sheng Zhou, Jiaqiang Huang, Xiansheng Hua

TL;DR
This paper introduces DSPT, a hyperparameter-free method that adaptively suppresses gradients from noisy samples in prompt tuning for vision-language models, enhancing robustness against label noise.
Contribution
Proposes Double-Softmax Prompt Tuning (DSPT), a novel intrinsic gradient suppression technique that improves noise robustness without additional hyperparameters.
Findings
DSPT outperforms existing methods on noisy benchmarks.
It effectively suppresses gradients from mislabeled samples.
Theoretical analysis confirms adaptive gradient suppression.
Abstract
Contrastive vision-language models like CLIP exhibit remarkable zero-shot generalization. However, prompt tuning remains highly sensitive to label noise, as mislabeled samples generate disproportionately large gradients that can overwhelm pre-trained priors. We argue that because CLIP already provides a near-optimal initialization, adaptation should be inherently conservative, particularly against the extreme gradient updates common in noisy settings. To this end, we propose Double-Softmax Prompt Tuning (DSPT), a hyperparameter-free method for intrinsic gradient suppression. By applying a sequential probabilistic normalization, DSPT induces a self-adaptive saturation zone that suppresses gradients from high-error noisy samples while maintaining informative updates. We also provide both theoretical analysis and empirical evidence about how this mechanism achieves adaptive suppression.…
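The saturation effect described in the abstract can be illustrated with a small numerical sketch. It assumes that "double softmax" means applying softmax twice to the class logits before cross-entropy; the abstract does not give the exact formulation (temperature, logit scaling, or where the second normalization sits), so the function names and the loss form below are illustrative assumptions rather than the authors' implementation.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def ce_grad(z, y):
    """Gradient of -log softmax(z)[y] w.r.t. z, i.e. p - onehot(y)."""
    p = softmax(z)
    return [p_i - (1.0 if i == y else 0.0) for i, p_i in enumerate(p)]

def double_softmax_grad(z, y):
    """Gradient of -log softmax(softmax(z))[y] w.r.t. z.

    Assumed loss form (not confirmed by the abstract): p = softmax(z),
    q = softmax(p), L = -log q[y].
    """
    p = softmax(z)
    q = softmax(p)
    # dL/dp for cross-entropy on the outer softmax: q - onehot(y)
    d = [q_i - (1.0 if i == y else 0.0) for i, q_i in enumerate(q)]
    # Chain through the inner softmax Jacobian: (diag(p) - p p^T) d
    dot = sum(p_i * d_i for p_i, d_i in zip(p, d))
    return [p_i * (d_i - dot) for p_i, d_i in zip(p, d)]

def l2_norm(v):
    return math.sqrt(sum(x * x for x in v))

# A confidently "wrong" sample: the model strongly prefers class 0,
# but the (possibly noisy) label says class 1.
noisy_logits, noisy_label = [10.0, 0.0, 0.0], 1
# An uncertain sample with the same label.
uncertain_logits = [1.0, 0.0, 0.0]

print("standard CE, confident-wrong:", l2_norm(ce_grad(noisy_logits, noisy_label)))
print("double softmax, confident-wrong:", l2_norm(double_softmax_grad(noisy_logits, noisy_label)))
print("double softmax, uncertain:", l2_norm(double_softmax_grad(uncertain_logits, noisy_label)))
```

Under this assumed form, the inner softmax Jacobian scales each gradient component by the inner probability, so a sample whose prediction is saturated on a class that disagrees with its label contributes almost no gradient, while an uncertain sample still produces a meaningful update. Standard cross-entropy, by contrast, yields its largest gradient on exactly the confident-wrong case.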
