Language-Driven Anchors for Zero-Shot Adversarial Robustness
Xiao Li, Wei Zhang, Yining Liu, Zhanhao Hu, Bo Zhang, and Xiaolin Hu

TL;DR
This paper introduces LAAT, a zero-shot adversarial training method that uses text-encoder features from large vision-language models such as CLIP as semantic anchors, improving the robustness of DNNs to adversarial attacks on unseen categories.
Contribution
LAAT is the first approach to incorporate language-driven anchors for zero-shot adversarial robustness. It addresses the high cosine similarity among text-encoder embeddings with a new expansion technique and alignment loss.
Findings
LAAT significantly outperforms existing methods in zero-shot adversarial robustness.
The proposed expansion and alignment techniques effectively mitigate high cosine similarity issues.
Large-scale multimodal models can enhance adversarial robustness without labeled data during training.
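To make the "high cosine similarity" finding concrete, here is a minimal sketch of how one might measure how crowded a set of anchors is. The function names and the toy vectors are hypothetical illustrations, not the paper's code or real text-encoder outputs. The intuition, offered as an assumption, is that when anchors sit close together, a small perturbation of an image feature can more easily move it toward the wrong anchor.

```python
import math
from itertools import combinations


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(u, v):
    """Cosine similarity between two vectors."""
    u, v = normalize(u), normalize(v)
    return sum(a * b for a, b in zip(u, v))


def mean_pairwise_cosine(anchors):
    """Average cosine similarity over all distinct anchor pairs."""
    pairs = list(combinations(anchors, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Hypothetical toy anchors (made-up numbers, not real text-encoder outputs).
crowded = [[1.0, 0.1, 0.0], [1.0, 0.0, 0.1], [1.0, 0.1, 0.1]]
spread = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

crowded_score = mean_pairwise_cosine(crowded)  # close to 1: anchors nearly overlap
spread_score = mean_pairwise_cosine(spread)    # 0: anchors are mutually orthogonal
```

A diagnostic like this only quantifies the problem; it does not reproduce the expansion or alignment techniques themselves, which this summary does not describe in detail.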
Abstract
Deep Neural Networks (DNNs) are known to be susceptible to adversarial attacks. Previous research has mainly focused on improving adversarial robustness in the fully supervised setting, leaving the challenging domain of zero-shot adversarial robustness an open question. In this work, we investigate this domain by leveraging the recent advances in large vision-language models, such as CLIP, to introduce zero-shot adversarial robustness to DNNs. We propose LAAT, a Language-driven, Anchor-based Adversarial Training strategy. LAAT utilizes the text-encoder features of each category as fixed anchors (normalized feature embeddings), which are then employed for adversarial training. By leveraging the semantic consistency of the text encoders, LAAT aims to enhance the adversarial robustness of the image model on novel categories. However, naively using text encoders leads to…
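The anchor idea in the abstract can be illustrated with a small sketch: each category has a fixed, normalized anchor vector, and an image feature is assigned to the category whose anchor it is most similar to. Everything below is an assumption made for illustration: the `predict` helper, the category names, and the three-dimensional toy vectors stand in for real text-encoder and image-model outputs, and the sketch covers only anchor matching at inference time, not the adversarial training procedure.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def predict(image_feature, anchors):
    """Return the best-matching category and all cosine scores.

    `anchors` maps a category name to its (hypothetical) text-anchor vector.
    """
    f = normalize(image_feature)
    scores = {
        name: sum(a * b for a, b in zip(f, normalize(anchor)))
        for name, anchor in anchors.items()
    }
    return max(scores, key=scores.get), scores


# Toy anchors; in practice these would come from a text encoder.
# "zebra" plays the role of a category not seen during training.
anchors = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.1, 0.9, 0.0],
    "zebra": [0.0, 0.1, 0.9],
}

# Toy image feature; in practice this would come from the image model.
feature = [0.1, 0.2, 0.8]
label, scores = predict(feature, anchors)
```

Under this reading, handling a novel category would only require adding its anchor to the dictionary, which is one plausible way to interpret the zero-shot setting described above.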
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Integrated Circuits and Semiconductor Failure Analysis · High-Velocity Impact and Material Behavior
