Robust Prompt Tuning for Vision-Language Models with Mild Semantic Noise

Yansheng Gao; Yufei Zheng; Shengsheng Wang

arXiv:2508.04677 · cs.CV · October 3, 2025

TL;DR

This paper introduces ANPrompt, a novel prompt tuning framework for vision-language models that actively incorporates weak semantic noise and stabilizes visual semantics, significantly improving robustness and generalization to unseen categories.

Contribution

The paper proposes ANPrompt, which integrates weak semantic noise and a noise-resistant visual prompt, along with a new loss function, to enhance robustness and generalization in prompt tuning.

Findings

01. Outperforms existing methods on 11 benchmarks.

02. Enhances robustness to semantic noise.

03. Improves generalization to unseen categories.

Abstract

Prompt tuning has shown promising results, but its robustness and generalization to unseen categories remain limited. Through our experiments, we demonstrate that the complete removal of semantic noise is a key factor restricting robustness. Existing methods typically suppress or filter out semantic noise in the prompt space, inadvertently hindering the model's robustness and its ability to generalize to unseen categories. To address this, we propose ANPrompt, a robust prompt tuning framework that actively incorporates weak semantic noise. By clustering weakly perturbed features into noise prompts and integrating them with learnable tokens in both the text and vision encoders, ANPrompt ensures controlled exposure to semantic variations. To enhance the visual pathway, we introduce the Noise-Resistant Visual Prompt Prototype (NRVPP), which stabilizes visual semantics under weak…
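
The noise-prompt step described in the abstract (cluster weakly perturbed features, then combine the resulting noise prompts with learnable tokens) can be illustrated with a minimal sketch. This is not the authors' implementation: the Gaussian perturbation, the plain k-means clustering, the concatenation used for integration, and every name and value below (`weak_perturb`, `kmeans`, `build_prompt`, `sigma`, `k`, the feature dimension) are assumptions chosen for illustration, since the abstract does not specify them.

```python
import random

def weak_perturb(features, sigma, rng):
    # Assumption: "weak" perturbation modeled as small Gaussian noise.
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in features]

def sq_dist(a, b):
    # Squared Euclidean distance between two vectors.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters, rng):
    # Plain k-means; the centroids stand in for "noise prompts" (assumption).
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            groups[nearest].append(p)
        for j, members in enumerate(groups):
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def build_prompt(learnable_tokens, noise_prompts):
    # Assumption: integration is a simple concatenation along the token axis.
    return learnable_tokens + noise_prompts

rng = random.Random(0)
dim = 8  # hypothetical embedding dimension
features = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(32)]
perturbed = weak_perturb(features, sigma=0.05, rng=rng)
noise_prompts = kmeans(perturbed, k=4, iters=10, rng=rng)
learnable_tokens = [[0.0] * dim for _ in range(2)]  # placeholder for trained tokens
prompt = build_prompt(learnable_tokens, noise_prompts)
print(len(prompt), len(prompt[0]))  # 6 8
```

In an actual prompt tuning setup the learnable tokens would be trainable parameters and the combined sequence would be fed to the text and vision encoders; the sketch only shows how the shapes fit together under the stated assumptions.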

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Natural Language Processing Techniques