NAP-Tuning: Neural Augmented Prompt Tuning for Adversarially Robust Vision-Language Models
Jiaming Zhang, Xin Wang, Xingjun Ma, Lingyu Qiu, Yu-Gang Jiang, and Jitao Sang

TL;DR
NAP-Tuning enhances the robustness of vision-language models against adversarial attacks by extending prompt tuning to multi-modal, multi-layer architectures with feature purification, significantly improving adversarial defense performance.
Contribution
Introduces NAP-Tuning, a multi-modal, multi-layer prompt tuning framework with feature purification for adversarial robustness in VLMs, extending prior AdvPT work.
Findings
Outperforms existing methods on various datasets and attacks.
Achieves 33.5% and 33.0% improvements on the AutoAttack benchmark.
Maintains competitive accuracy on clean data.
Abstract
Vision-Language Models (VLMs) such as CLIP have demonstrated remarkable capabilities in understanding relationships between visual and textual data through joint embedding spaces. Despite their effectiveness, these models remain vulnerable to adversarial attacks, particularly in the image modality, posing significant security concerns. Building upon our previous work on Adversarial Prompt Tuning (AdvPT), which introduced learnable text prompts to enhance adversarial robustness in VLMs without extensive parameter training, we present a significant extension by introducing the Neural Augmentor framework for Multi-modal Adversarial Prompt Tuning (NAP-Tuning). Our key innovations include: (1) extending AdvPT from text-only to multi-modal prompting across both text and visual modalities, (2) expanding from single-layer to multi-layer prompt architectures, and (3) proposing a novel…
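The step from single-layer to multi-layer prompting mentioned in the abstract can be illustrated with a toy sketch. This is a generic "deep prompting" pattern under stated assumptions, not the authors' implementation: `toy_layer` is a hypothetical stand-in for a frozen transformer layer, `forward_with_deep_prompts` is an illustrative name, the prompt values would normally be learned by gradient descent, and the feature purification component is not modeled at all.

```python
from typing import List

Vector = List[float]
Tokens = List[Vector]


def toy_layer(tokens: Tokens) -> Tokens:
    """Hypothetical frozen layer: mix every token with the sequence mean.

    This is only a crude stand-in for attention, so that prompt tokens
    can influence the other tokens in the sequence.
    """
    dim = len(tokens[0])
    mean = [sum(t[d] for t in tokens) / len(tokens) for d in range(dim)]
    return [[0.5 * t[d] + 0.5 * mean[d] for d in range(dim)] for t in tokens]


def forward_with_deep_prompts(tokens: Tokens, layer_prompts: List[Tokens]) -> Tokens:
    """Run one toy layer per prompt set, inserting fresh prompts at each layer."""
    hidden = tokens
    for prompts in layer_prompts:
        # Prepend this layer's prompt tokens to the current hidden tokens.
        hidden = toy_layer(prompts + hidden)
        # Drop the prompt outputs so the next layer receives its own prompts.
        hidden = hidden[len(prompts):]
    return hidden


# Two input tokens, two layers, one (assumed, hand-picked) prompt token per layer.
inputs = [[1.0, 0.0], [0.0, 1.0]]
single_layer = forward_with_deep_prompts(inputs, [[[0.0, 0.0]]])
multi_layer = forward_with_deep_prompts(inputs, [[[0.0, 0.0]], [[3.0, 3.0]]])
```

In this sketch only the entries of `layer_prompts` would be trainable while the layer itself stays fixed, which mirrors the general appeal of prompt tuning: the backbone is left untouched and the number of tuned parameters stays small.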
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications
Methods: Contrastive Language-Image Pre-training
