FPT-Noise: Dynamic Scene-Aware Counterattack for Test-Time Adversarial Defense in Vision-Language Models
Jia Deng, Jin Li, Zhenhua Zhao, Shaowei Wang

TL;DR
This paper introduces FPT-Noise, a novel test-time defense method for vision-language models like CLIP, which dynamically generates attack-adaptive noise and uses scene-aware regulation to significantly improve adversarial robustness without retraining.
Contribution
It proposes a dynamic feature modulator, a feature perception threshold, and scene-aware regulation with test-time ensembling, advancing adversarial defense for VLMs without costly retraining.
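As a rough illustration of the test-time ensembling idea only, the sketch below averages class probabilities over several noise-perturbed copies of an input. It is not the authors' method: the Gaussian noise, the fixed `sigma`, the toy linear classifier `toy_logits`, and the function name `noisy_ensemble_predict` are all assumptions made for this example. FPT-Noise, by contrast, derives an image-specific noise intensity and applies a feature perception threshold with scene-aware regulation.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def noisy_ensemble_predict(image, logits_fn, sigma, n_samples=8, seed=0):
    """Average softmax outputs over n_samples noisy copies of `image`.

    `sigma` is a fixed noise intensity here (an assumption); the paper
    describes generating this quantity dynamically per image.
    """
    rng = np.random.default_rng(seed)
    probs = np.zeros_like(softmax(logits_fn(image)))
    for _ in range(n_samples):
        noise = rng.normal(0.0, sigma, size=image.shape)
        noisy = np.clip(image + noise, 0.0, 1.0)  # keep pixels in [0, 1]
        probs += softmax(logits_fn(noisy))
    return probs / n_samples

# Hypothetical stand-in for a vision model: a random linear classifier
# over a 4x4 "image" with 3 classes.
_rng = np.random.default_rng(42)
W = _rng.normal(size=(3, 16))

def toy_logits(img):
    return W @ img.ravel()

image = _rng.uniform(size=(4, 4))
probs = noisy_ensemble_predict(image, toy_logits, sigma=0.05)
```

With `sigma=0.0` the ensemble reduces to a single forward pass, which is a convenient sanity check before swapping in a real model.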
Findings
Boosts robust accuracy from 0.07% to 56.86% under AutoAttack.
Largely preserves accuracy on clean images (change of -1.1%).
Outperforms existing test-time defense methods.
Abstract
Vision-Language Models (VLMs), such as CLIP, have demonstrated remarkable zero-shot generalizability across diverse downstream tasks. However, recent studies have revealed that VLMs, including CLIP, are highly vulnerable to adversarial attacks, particularly on their visual modality. Traditional methods for improving adversarial robustness, such as adversarial training, involve extensive retraining and can be computationally expensive. In this paper, we propose a new test-time defense, Feature Perception Threshold Counterattack Noise (FPT-Noise), which enhances the adversarial robustness of CLIP without costly fine-tuning. Our core contributions are threefold. First, we introduce a Dynamic Feature Modulator that dynamically generates an image-specific and attack-adaptive noise intensity parameter. Second, we reanalyze the image features of CLIP. When images are exposed to different…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Generative Adversarial Networks and Image Synthesis · Ethics and Social Impacts of AI
