Finetune Like You Pretrain: Boosting Zero-shot Adversarial Robustness in Vision-language Models

Songlong Xing; Weijie Wang; Zhengyu Zhao; Jindong Gu; Philip Torr; Nicu Sebe

arXiv:2604.11576 · cs.CV · April 14, 2026

TL;DR

This paper introduces AdvFLYP, a simple adversarial finetuning paradigm for vision-language models such as CLIP. It improves zero-shot adversarial robustness by aligning adversarial images with text and regularizing features, and it outperforms mainstream adversarial finetuning practices on 14 downstream datasets.

Contribution

AdvFLYP reuses CLIP's pretraining recipe for adversarial finetuning on web-collected image-text pairs, which improves robustness and how well that robustness transfers across diverse datasets.

Findings

01. AdvFLYP outperforms mainstream practices on 14 downstream datasets.

02. Logit- and feature-level regularizations improve robustness and clean accuracy.

03. Regularization stabilizes adversarial image embeddings of noisy web images.
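The summary does not say what form the logit- and feature-level regularizers take. The sketch below is one plausible reading, not the paper's definition: it assumes the logit-level term is a KL divergence between the predictions of a frozen reference model on the clean image and the finetuned model's predictions on the adversarial image, and that the feature-level term is a squared L2 distance between the corresponding image embeddings. All function names and the weights `lam_logit` and `lam_feat` are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def logit_regularizer(adv_logits, ref_logits):
    # ASSUMPTION: pull the adversarial prediction toward the frozen
    # reference model's prediction on the clean image.
    return kl_divergence(softmax(ref_logits), softmax(adv_logits))

def feature_regularizer(adv_feat, ref_feat):
    # ASSUMPTION: squared L2 distance between the adversarial embedding
    # and the frozen reference embedding of the clean image.
    return sum((a - r) ** 2 for a, r in zip(adv_feat, ref_feat))

def regularized_loss(main_loss, adv_logits, ref_logits, adv_feat, ref_feat,
                     lam_logit=1.0, lam_feat=1.0):
    """Main objective plus the two (hypothetical) weighted regularizers."""
    return (main_loss
            + lam_logit * logit_regularizer(adv_logits, ref_logits)
            + lam_feat * feature_regularizer(adv_feat, ref_feat))

# Toy values: the adversarial logits and features drift from the reference.
total = regularized_loss(
    main_loss=1.0,
    adv_logits=[0.5, 2.0, 1.0], ref_logits=[3.0, 1.0, 0.5],
    adv_feat=[0.2, 0.8], ref_feat=[0.9, 0.1],
)
print(total)
```

Both terms vanish when the adversarial outputs match the reference outputs, so under these assumptions the regularizers only penalize drift caused by the perturbation.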

Abstract

Despite their impressive zero-shot abilities, vision-language models such as CLIP have been shown to be susceptible to adversarial attacks. To enhance CLIP's adversarial robustness, recent studies finetune its pretrained vision encoder with adversarial examples on a proxy dataset such as ImageNet, aligning adversarial images with the correct class labels. However, these methods overlook the important roles of the training data distribution and the learning objective, resulting in reduced zero-shot capabilities and limited transferability of robustness across domains and datasets. In this work, we propose a simple yet effective paradigm, AdvFLYP, which follows the training recipe of CLIP's pretraining process when adversarially finetuning the model. Specifically, AdvFLYP finetunes CLIP with adversarial images created based on image-text pairs collected from the web, and match…
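To make the "training recipe of CLIP's pretraining process" concrete, here is a minimal standard-library sketch of the symmetric image-text contrastive loss commonly used for CLIP-style pretraining, evaluated once on clean and once on perturbed image embeddings. It is an illustration under assumptions rather than the authors' code: the function names, the temperature of 0.07, and the toy embeddings are all hypothetical, and the "adversarial" embeddings are hand-picked here instead of being produced by an actual attack on images.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross_entropy(logits, target):
    """Cross-entropy of one logit row against an integer target index."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_z - logits[target]

def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss; pair k matches image k with text k."""
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # Cosine similarities scaled by the temperature (assumed value).
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
              for i in imgs]
    # Image-to-text: each row should pick its own text.
    i2t = sum(cross_entropy(logits[k], k) for k in range(n)) / n
    # Text-to-image: each column should pick its own image.
    t2i = sum(cross_entropy([logits[r][k] for r in range(n)], k)
              for k in range(n)) / n
    return 0.5 * (i2t + t2i)

# Toy batch of two image-text pairs (hypothetical embeddings).
text_embs = [[1.0, 0.0], [0.0, 1.0]]
clean_img_embs = [[0.9, 0.1], [0.1, 0.9]]
# Hand-picked stand-ins for embeddings of adversarially perturbed images:
# each one has drifted toward the wrong caption.
adv_img_embs = [[0.4, 0.6], [0.6, 0.4]]

clean_loss = contrastive_loss(clean_img_embs, text_embs)
adv_loss = contrastive_loss(adv_img_embs, text_embs)
print(clean_loss, adv_loss)
```

In an adversarial finetuning loop, the perturbed embeddings would come from images attacked to increase this loss, and the model would then be updated to decrease it on those same images. Unlike classification-style finetuning on a proxy dataset, the targets here are the paired texts in the batch rather than class labels.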

Code & Models

Repositories

Sxing2/AdvFLYP (GitHub)