Navigating the Trade-off: A Synthesis of Defensive Strategies for Zero-Shot Adversarial Robustness in Vision-Language Models

Zane Xu; Jason Sun

arXiv:2508.05237·cs.CV·August 8, 2025

TL;DR

This paper reviews defense strategies for improving the zero-shot adversarial robustness of vision-language models (VLMs), covering the trade-off between robustness and zero-shot generalization, the evolution of methods, and future research directions.

Contribution

The paper synthesizes the key defense paradigms and methods for VLMs, giving an overview of how the field has evolved and which challenges remain.

Findings

01 Analysis of adversarial fine-tuning and test-time defenses

02 Comparison of alignment-preserving and embedding re-engineering methods

03 Identification of future directions like hybrid strategies and adversarial pre-training

Abstract

This report synthesizes eight seminal papers on the zero-shot adversarial robustness of vision-language models (VLMs) such as CLIP. A central challenge in this domain is the inherent trade-off between enhancing adversarial robustness and preserving the model's zero-shot generalization capabilities. We analyze two primary defense paradigms: Adversarial Fine-Tuning (AFT), which modifies model parameters, and Training-Free/Test-Time Defenses, which preserve them. We trace the evolution from alignment-preserving methods (TeCoA) to embedding space re-engineering (LAAT, TIMA), and from input heuristics (AOM, TTC) to latent-space purification (CLIPure). Finally, we identify key challenges and future directions, including hybrid defense strategies and adversarial pre-training.
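To make the threat model behind these defenses concrete, the following is a minimal sketch of a zero-shot classifier being attacked with a small, bounded input perturbation. Everything in it is an illustrative assumption rather than anything taken from the paper: the image encoder is a toy linear map `W` standing in for CLIP's encoder, the class "text embeddings" `T` are random vectors, and the attack is a single-step sign-gradient (FGSM-style) perturbation chosen only because it is short to write down. In this picture, an AFT method would change `W`, while a test-time defense would leave `W` fixed and operate on the input or its embedding.

```python
import math
import random

def matvec(m, v):
    """Multiply matrix m (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def softmax(z):
    top = max(z)  # subtract the max for numerical stability
    e = [math.exp(v - top) for v in z]
    s = sum(e)
    return [v / s for v in e]

def zero_shot_logits(W, T, x):
    """Score each class by the dot product of its text embedding with the image embedding W @ x."""
    return matvec(T, matvec(W, x))

def predict(W, T, x):
    logits = zero_shot_logits(W, T, x)
    return max(range(len(logits)), key=logits.__getitem__)

def cross_entropy(W, T, x, label):
    return -math.log(softmax(zero_shot_logits(W, T, x))[label])

def input_gradient(W, T, x, label):
    """Gradient of the cross-entropy w.r.t. x for this linear toy model: W^T T^T (p - onehot)."""
    p = softmax(zero_shot_logits(W, T, x))
    d_logits = [pi - (1.0 if i == label else 0.0) for i, pi in enumerate(p)]
    d_emb = [sum(T[i][j] * d_logits[i] for i in range(len(T))) for j in range(len(T[0]))]
    return [sum(W[j][k] * d_emb[j] for j in range(len(W))) for k in range(len(W[0]))]

def fgsm(W, T, x, label, eps):
    """One sign-gradient step; every coordinate moves by at most eps (an L-infinity bound)."""
    g = input_gradient(W, T, x, label)
    return [xi + eps * ((gi > 0) - (gi < 0)) for xi, gi in zip(x, g)]

# Hypothetical toy setup: 8-dim "image", 4-dim embedding space, 3 classes.
random.seed(0)
W = [[random.gauss(0, 1) for _ in range(8)] for _ in range(4)]
T = [[random.gauss(0, 1) for _ in range(4)] for _ in range(3)]
x = [random.random() for _ in range(8)]
label = predict(W, T, x)          # treat the clean prediction as the true label
x_adv = fgsm(W, T, x, label, eps=0.1)
```

Because the toy model is linear, the cross-entropy is convex in the input, so the sign-gradient step cannot lower the loss on the chosen label. Real VLM encoders are nonlinear, and attacks used in the literature are typically iterative, so this is only a schematic of the setting the surveyed defenses address.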
