Language Guided Adversarial Purification

Himanshu Singh; A V Subramanyam

arXiv:2309.10348·cs.LG·April 10, 2026

TL;DR

The paper introduces Language Guided Adversarial Purification (LGAP), a defense against adversarial attacks that uses pre-trained diffusion models and caption generators to improve robustness without extensive attack-specific training.

Contribution

LGAP leverages language guidance with pre-trained diffusion models for adversarial purification, offering a versatile and effective defense without specialized network training.

Findings

1. LGAP outperforms many existing adversarial defenses.
2. The method enhances robustness against strong adversarial attacks.
3. It does not require attack-specific training.

Abstract

Adversarial purification using generative models demonstrates strong adversarial defense performance. These methods are classifier- and attack-agnostic, making them versatile but often computationally intensive. Recent strides in diffusion and score networks have improved image generation and, by extension, adversarial purification. Adversarial training, another highly efficient class of adversarial defense methods, requires specific knowledge of attack vectors and therefore extensive training on adversarial examples. To overcome these limitations, we introduce a new framework, namely Language Guided Adversarial Purification (LGAP), utilizing pre-trained diffusion models and caption generators to defend against adversarial attacks. Given an input image, our method first generates a caption, which is then used to guide the adversarial purification process through a diffusion…
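The data flow described in the abstract (caption the input, then purify it under caption guidance) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `lgap_style_defense`, the final classification step, and all three `toy_*` stand-ins are assumptions introduced here. In particular, the toy purifier is a simple pixel blend rather than a diffusion model, and the toy captioner is built to survive the example perturbation, which is a property of this sketch and not a result from the paper.

```python
from typing import Callable, List

# Toy stand-in for an image: a flat list of pixel intensities in [0, 1].
Image = List[float]


def lgap_style_defense(
    image: Image,
    captioner: Callable[[Image], str],
    purifier: Callable[[Image, str], Image],
    classifier: Callable[[Image], int],
) -> int:
    """Caption the input, purify it under caption guidance, then classify."""
    caption = captioner(image)           # step 1: describe the (possibly attacked) input
    purified = purifier(image, caption)  # step 2: caption-conditioned purification
    return classifier(purified)          # step 3 (assumed): classify the purified image


# --- Hypothetical toy components, NOT real models ---

def toy_captioner(image: Image) -> str:
    # Captions from the brightest pixel, a different cue than the classifier uses.
    return "bright image" if max(image) > 0.5 else "dark image"


def toy_purifier(image: Image, caption: str) -> Image:
    # Pulls each pixel halfway toward a caption-dependent prototype value.
    # A real purifier would be a caption-conditioned diffusion model.
    target = 0.9 if caption.startswith("bright") else 0.1
    return [0.5 * pixel + 0.5 * target for pixel in image]


def toy_classifier(image: Image) -> int:
    # Returns 1 ("bright") when the mean intensity exceeds 0.5, else 0 ("dark").
    return int(sum(image) / len(image) > 0.5)


clean = [0.9, 0.8, 0.9, 0.8]
perturbed = [0.9, 0.8, 0.2, 0.05]  # hypothetical perturbation that flips the toy classifier

undefended = toy_classifier(perturbed)
defended = lgap_style_defense(perturbed, toy_captioner, toy_purifier, toy_classifier)
print(f"undefended={undefended}, defended={defended}")
```

Passing the three components as callables keeps the sketch independent of any particular captioning model, diffusion model, or classifier, which mirrors the abstract's point that the defense relies on pre-trained components rather than attack-specific training.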
