Adversarial Robustification via Text-to-Image Diffusion Models

Daewon Choi; Jongheon Jeong; Huiwon Jang; Jinwoo Shin

arXiv:2407.18658 · cs.CV · July 29, 2024

TL;DR

This paper introduces a scalable, data-free method that uses text-to-image diffusion models to improve the adversarial robustness of off-the-shelf neural classifiers, including CLIP.

Contribution

It proposes a model-agnostic approach that uses text-to-image diffusion models as adaptable denoisers to improve adversarial robustness without any data, outperforming prior data-dependent methods.

Findings

01 Improved provable adversarial robustness of CLIP classifiers.

02 Achieved robustness gains while maintaining classification accuracy.

03 Applicable to various visual classifiers beyond CLIP.

Abstract

Adversarial robustness has conventionally been regarded as a challenging property to encode in neural networks, one that requires plenty of training data. In the recent paradigm of adopting off-the-shelf models, however, access to their training data is often infeasible or impractical, and most such models were not originally trained with adversarial robustness in mind. In this paper, we develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data. Our intuition is to view recent text-to-image diffusion models as "adaptable" denoisers that can be optimized to specify target tasks. Based on this, we propose: (a) to initiate a denoise-and-classify pipeline that offers provable guarantees against adversarial attacks, and (b) to leverage a few synthetic reference images generated from the text-to-image model that enable novel adaptation schemes. Our…
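The denoise-and-classify idea in (a) can be illustrated with a small, self-contained sketch. It assumes the provable guarantee is of the randomized-smoothing kind (Gaussian noise, majority vote, certified L2 radius), which the abstract does not spell out. `toy_denoiser` and `toy_classifier` are hypothetical stand-ins for the diffusion denoiser and the off-the-shelf classifier, and every function name here is illustrative rather than taken from the paper or its repository.

```python
import math
import random
from collections import Counter
from statistics import NormalDist


def toy_denoiser(noisy):
    """Hypothetical stand-in for the diffusion denoiser: clip pixels to [0, 1]."""
    return [min(1.0, max(0.0, v)) for v in noisy]


def toy_classifier(x):
    """Hypothetical stand-in classifier: class 1 if mean intensity exceeds 0.5."""
    return int(sum(x) / len(x) > 0.5)


def smoothed_vote(x, sigma, n, rng):
    """Add Gaussian noise n times, denoise, classify; return (top class, vote share)."""
    votes = Counter()
    for _ in range(n):
        noisy = [v + rng.gauss(0.0, sigma) for v in x]
        votes[toy_classifier(toy_denoiser(noisy))] += 1
    top_class, top_count = votes.most_common(1)[0]
    return top_class, top_count / n


def hoeffding_lower_bound(p_hat, n, alpha):
    """One-sided lower confidence bound on the true vote probability (Hoeffding)."""
    return p_hat - math.sqrt(math.log(1.0 / alpha) / (2.0 * n))


def certified_radius(p_lower, sigma):
    """L2 radius sigma * Phi^{-1}(p_lower); 0.0 means abstain (no certificate)."""
    if p_lower <= 0.5:
        return 0.0
    return sigma * NormalDist().inv_cdf(p_lower)


if __name__ == "__main__":
    rng = random.Random(0)
    image = [0.9] * 16  # toy "image" as a flat list of pixel intensities
    sigma, n, alpha = 0.25, 200, 0.001
    label, p_hat = smoothed_vote(image, sigma, n, rng)
    radius = certified_radius(hoeffding_lower_bound(p_hat, n, alpha), sigma)
    print(label, round(radius, 3))
```

The sketch reuses one batch of noisy samples both to pick the top class and to bound its probability, which is a simplification; a careful certification procedure would typically use separate samples and a tighter bound than Hoeffding. A certificate obtained this way holds only with probability at least 1 - alpha over the sampling.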

Code & Models

Repositories

choidae1/robustify-t2i · JAX · Official

Taxonomy

Topics: Digital Media Forensic Detection · Adversarial Robustness in Machine Learning · Image Processing Techniques and Applications

Methods: Diffusion · Contrastive Language-Image Pre-training