Pre-trained Model Guided Fine-Tuning for Zero-Shot Adversarial Robustness

Sibo Wang; Jie Zhang; Zheng Yuan; Shiguang Shan

arXiv:2401.04350 · cs.CV · April 11, 2024 · 1 citation

TL;DR

This paper introduces PMG-AFT, an adversarial fine-tuning method that uses the original pre-trained model as guidance to improve the zero-shot adversarial robustness of vision-language models such as CLIP without sacrificing generalization.

Contribution

The paper proposes a pre-trained model guided adversarial fine-tuning approach that preserves generalization features while improving robustness against adversarial attacks.

Findings

1. Significantly outperforms state-of-the-art methods in zero-shot robustness.
2. Improves clean accuracy alongside adversarial robustness.
3. Demonstrates effectiveness across 15 zero-shot datasets.

Abstract

Large-scale pre-trained vision-language models like CLIP have demonstrated impressive performance across various tasks and exhibit remarkable zero-shot generalization capability, yet they are also vulnerable to imperceptible adversarial examples. Existing works typically employ adversarial training (fine-tuning) as a defense against adversarial examples. However, applying it directly to the CLIP model may result in overfitting, compromising the model's capacity for generalization. In this paper, we propose the Pre-trained Model Guided Adversarial Fine-Tuning (PMG-AFT) method, which leverages supervision from the original pre-trained model through a carefully designed auxiliary branch to enhance the model's zero-shot adversarial robustness. Specifically, PMG-AFT minimizes the distance between the features of adversarial examples in the target model and those in the pre-trained model,…
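
The guidance idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract is truncated, so the choice of distance (KL divergence between softmax outputs), the KL direction, the use of class-score vectors in place of raw features, the weight `alpha`, and all function names below are assumptions made for illustration only. The sketch combines a standard adversarial training term with a term that pulls the fine-tuned (target) model's output on an adversarial example toward the frozen pre-trained model's output on the same example.

```python
import math


def softmax(logits):
    """Convert a list of scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def cross_entropy(probs, label, eps=1e-12):
    """Negative log-likelihood of the true class."""
    return -math.log(probs[label] + eps)


def guided_loss(target_adv_scores, pretrained_adv_scores, label, alpha=1.0):
    """Hypothetical guided objective for one adversarial example.

    target_adv_scores:     scores from the model being fine-tuned
    pretrained_adv_scores: scores from the frozen pre-trained model
    alpha:                 assumed weight on the guidance term
    """
    p_target = softmax(target_adv_scores)
    p_pretrained = softmax(pretrained_adv_scores)
    robust_term = cross_entropy(p_target, label)           # adversarial training term
    guidance_term = kl_divergence(p_target, p_pretrained)  # assumed distance measure
    return robust_term + alpha * guidance_term


if __name__ == "__main__":
    # Made-up scores for a 3-class problem, purely for illustration.
    print(guided_loss([2.0, 0.5, -1.0], [1.5, 0.8, -0.5], label=0))
```

When the two models agree on an adversarial example, the guidance term vanishes and the objective reduces to plain adversarial training; the further the fine-tuned model drifts from the pre-trained one, the larger the penalty, which is the intuition behind using the pre-trained model to limit overfitting.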

Code & Models

Repositories

serendipity1122/pre-trained-model-guided-fine-tuning-for-zero-shot-adversarial-robustness (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning

Methods: Contrastive Language-Image Pre-training