Towards Adversarially Robust Vision-Language Models: Insights from Design Choices and Prompt Formatting Techniques

Rishika Bhagwatkar; Shravan Nayak; Reza Bayat; Alexis Roger; Daniel Z. Kaplan; Pouya Bashivan; Irina Rish

arXiv:2407.11121 · cs.CV · July 17, 2024

TL;DR

This paper investigates how design choices and prompt formatting can improve the adversarial robustness of vision-language models against image-based attacks, providing practical guidelines for safer deployment.

Contribution

The paper systematically analyzes how model design choices affect the adversarial robustness of VLMs against image-based attacks, and introduces low-cost prompt formatting techniques that improve robustness.

Findings

01

Model design choices affect the robustness of VLMs to image-based attacks.

02

Prompt rephrasing improves attack resistance.

03

Prompt formatting substantially improves robustness against strong image-based attacks such as Auto-PGD.
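To make the threat model concrete, the following is a minimal sketch of an untargeted, L-infinity-bounded image attack. It is plain projected gradient descent (PGD), not Auto-PGD, which additionally adapts the step size and uses momentum. The toy linear softmax classifier, the function names, and the `eps`/`step`/`iters` values are assumptions chosen for illustration; they are not taken from the paper.

```python
import numpy as np

def softmax(z):
    z = z - z.max()          # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(x, y, W, b):
    # Loss of a toy linear classifier with logits = W @ x + b (illustrative only).
    return -np.log(softmax(W @ x + b)[y])

def pgd_linf(x, y, W, b, eps=8 / 255, step=2 / 255, iters=10):
    """Plain PGD (not Auto-PGD): maximize the loss inside an L-inf ball of radius eps."""
    x_adv = x.copy()
    for _ in range(iters):
        p = softmax(W @ x_adv + b)
        p[y] -= 1.0                               # d(loss)/d(logits) = p - onehot(y)
        grad = W.T @ p                            # chain rule back to the input pixels
        x_adv = x_adv + step * np.sign(grad)      # signed gradient ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project back into the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep a valid pixel range
    return x_adv
```

For a real VLM the gradient would come from automatic differentiation through the vision encoder rather than from a closed-form expression, but the loop of ascent step followed by projection is the same.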

Abstract

Vision-Language Models (VLMs) have witnessed a surge in both research and real-world applications. However, as they are becoming increasingly prevalent, ensuring their robustness against adversarial attacks is paramount. This work systematically investigates the impact of model design choices on the adversarial robustness of VLMs against image-based attacks. Additionally, we introduce novel, cost-effective approaches to enhance robustness through prompt formatting. By rephrasing questions and suggesting potential adversarial perturbations, we demonstrate substantial improvements in model robustness against strong image-based attacks such as Auto-PGD. Our findings provide important guidelines for developing more robust VLMs, particularly for deployment in safety-critical environments.
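The two prompt formatting ideas named in the abstract, rephrasing the question and suggesting that the image may be adversarially perturbed, can be sketched as a small prompt builder. The template wording, the `PERTURBATION_HINT` string, and the `format_prompt` helper are hypothetical and serve only as an illustration; the paper's actual prompts may differ.

```python
from typing import Callable, Optional

# Hypothetical hint text; the exact wording used in the paper is not reproduced here.
PERTURBATION_HINT = (
    "Note: the image may contain small adversarial perturbations. "
    "Answer based on the actual visual content."
)

def format_prompt(
    question: str,
    hint_perturbation: bool = True,
    rephrase: Optional[Callable[[str], str]] = None,
) -> str:
    """Build the text prompt sent to a VLM alongside the image.

    rephrase: optional function that rewrites the question (for example,
    a call to a language model); assumed here, not specified by the source.
    """
    q = rephrase(question) if rephrase is not None else question
    parts = []
    if hint_perturbation:
        parts.append(PERTURBATION_HINT)   # warn the model about possible perturbations
    parts.append(f"Question: {q}")
    parts.append("Answer:")
    return "\n".join(parts)
```

Because the change is confined to the text prompt, this kind of defense needs no retraining and adds only a few tokens per query, which is what makes it cost-effective compared with adversarial training.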

Taxonomy

Topics: Adversarial Robustness in Machine Learning