Language-Driven Anchors for Zero-Shot Adversarial Robustness

Xiao Li; Wei Zhang; Yining Liu; Zhanhao Hu; Bo Zhang; Xiaolin Hu

arXiv:2301.13096·cs.CV·March 12, 2024

TL;DR

This paper introduces LAAT, a novel zero-shot adversarial training method leveraging large vision-language models like CLIP, which improves robustness of DNNs to adversarial attacks on unseen categories by using semantic text anchors.

Contribution

LAAT is the first approach to incorporate language-driven anchors for zero-shot adversarial robustness. It addresses the high cosine similarity among text-encoder features with new expansion and alignment techniques.

Findings

01 LAAT significantly outperforms existing methods in zero-shot adversarial robustness.

02 The proposed expansion and alignment techniques effectively mitigate high cosine similarity issues.

03 Large-scale multimodal models can enhance adversarial robustness without labeled data during training.

Abstract

Deep Neural Networks (DNNs) are known to be susceptible to adversarial attacks. Previous research has mainly focused on improving adversarial robustness in the fully supervised setting, leaving the challenging domain of zero-shot adversarial robustness an open question. In this work, we investigate this domain by leveraging recent advances in large vision-language models, such as CLIP, to introduce zero-shot adversarial robustness to DNNs. We propose LAAT, a Language-driven, Anchor-based Adversarial Training strategy. LAAT uses the text-encoder features of each category as fixed anchors (normalized feature embeddings), which are then employed for adversarial training. By leveraging the semantic consistency of the text encoders, LAAT aims to enhance the adversarial robustness of the image model on novel categories. However, naively using text encoders leads to…
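To make the anchor idea in the abstract concrete, here is a minimal sketch of classifying an image feature by cosine similarity against fixed, normalized per-category anchors. It is an illustration under stated assumptions, not the authors' implementation: the category names, the three-dimensional toy vectors, and the helper names (`normalize`, `cosine`, `predict`, `mean_pairwise_cosine`) are all hypothetical, and no text encoder, image model, or adversarial training step is involved.

```python
import math


def normalize(v):
    """Scale a non-zero vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return sum(a * b for a, b in zip(normalize(u), normalize(v)))


def predict(feature, anchors):
    """Return the category whose anchor is most similar to the feature."""
    return max(anchors, key=lambda name: cosine(feature, anchors[name]))


def mean_pairwise_cosine(anchors):
    """Average cosine similarity over all distinct pairs of anchors."""
    vecs = list(anchors.values())
    pairs = [(i, j) for i in range(len(vecs)) for j in range(i + 1, len(vecs))]
    return sum(cosine(vecs[i], vecs[j]) for i, j in pairs) / len(pairs)


# Hypothetical toy anchors standing in for text-encoder outputs.
anchors = {
    "cat": [1.0, 0.2, 0.0],
    "dog": [0.9, 0.4, 0.0],
    "truck": [0.0, 0.1, 1.0],
}

# Hypothetical image feature; in practice this would come from the image model.
feature = [0.1, 0.0, 0.9]
label = predict(feature, anchors)
similarity = mean_pairwise_cosine(anchors)
```

Because the anchors are fixed, adding a category at test time only requires adding its anchor to the dictionary. `mean_pairwise_cosine` gives a rough way to see how close anchors sit to one another, which is the kind of high-similarity issue the listing's Findings mention.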

Code & Models

Repositories

lixiaothu/laat (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Integrated Circuits and Semiconductor Failure Analysis · High-Velocity Impact and Material Behavior