PDA: Text-Augmented Defense Framework for Robust Vision-Language Models against Adversarial Image Attacks
Jingning Xu, Haochen Luo, Chen Liu

TL;DR
PDA is a training-free, text-augmented defense framework that enhances the robustness of vision-language models against adversarial image attacks during inference.
Contribution
Introducing PDA, a novel inference-time defense method using text augmentation to improve VLM robustness without model modification.
Findings
PDA improves robustness against various adversarial attacks across multiple VLM architectures.
PDA maintains high accuracy on clean images while defending against adversarial perturbations.
PDA is computationally efficient and applicable during inference without retraining.
Abstract
Vision-language models (VLMs) are vulnerable to adversarial image perturbations. Existing defenses based on adversarial training against task-specific adversarial examples are computationally expensive and often fail to generalize to unseen attack types. To address these limitations, we introduce Paraphrase-Decomposition-Aggregation (PDA), a training-free defense framework that leverages text augmentation to enhance VLM robustness under diverse adversarial image attacks. PDA performs prompt paraphrasing, question decomposition, and consistency aggregation entirely at test time, thus requiring no modification to the underlying models. To balance robustness and efficiency, we instantiate PDA as variants that reduce the inference cost while retaining most of its robustness gains. Experiments on multiple VLM architectures and benchmarks for visual question answering, classification, and…
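The three test-time stages named in the abstract (paraphrasing, decomposition, aggregation) can be illustrated with a minimal sketch. This is not the authors' implementation: the callables `vlm`, `paraphrase`, and `decompose` are hypothetical placeholders, feeding sub-question answers back as context is an assumption, and plain majority voting stands in for whatever consistency aggregation the paper actually uses.

```python
from collections import Counter
from typing import Any, Callable, List


def normalize(answer: str) -> str:
    """Lowercase, trim, and drop a trailing period so equivalent answers match."""
    return " ".join(answer.lower().strip().rstrip(".").split())


def aggregate_by_majority(answers: List[str]) -> str:
    """Return the most frequent normalized answer (assumed aggregation rule)."""
    counts = Counter(normalize(a) for a in answers)
    return counts.most_common(1)[0][0]


def pda_answer(
    image: Any,
    question: str,
    vlm: Callable[[Any, str], str],          # hypothetical: (image, prompt) -> answer
    paraphrase: Callable[[str], List[str]],  # hypothetical: question -> rewordings
    decompose: Callable[[str], List[str]],   # hypothetical: question -> sub-questions
) -> str:
    # Stage 1 (paraphrase): query the model with the original and reworded prompts.
    prompts = [question] + list(paraphrase(question))
    candidates = [vlm(image, p) for p in prompts]

    # Stage 2 (decomposition): answer simpler sub-questions, then ask the
    # original question again with those answers as context (an assumption).
    sub_questions = list(decompose(question))
    if sub_questions:
        facts = [f"{sq} {vlm(image, sq)}" for sq in sub_questions]
        context_prompt = "Context: " + " ".join(facts) + "\nQuestion: " + question
        candidates.append(vlm(image, context_prompt))

    # Stage 3 (aggregation): keep the answer the candidates agree on most.
    return aggregate_by_majority(candidates)
```

Because every stage only changes the text prompt or post-processes answers, the model weights stay untouched, which matches the training-free framing. The cost is one model call per prompt, so a cheaper configuration would simply use fewer paraphrases or sub-questions.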
