Adversarial attacks against Modern Vision-Language Models
Alejandro Paredes La Torre

TL;DR
This paper evaluates the adversarial robustness of open-source vision-language models in a simulated e-commerce environment, revealing significant vulnerabilities in some models and highlighting the importance of security assessments before deployment.
Contribution
It provides the first comprehensive analysis of gradient-based adversarial attacks on open-source VLM agents in a realistic setting, and compares the robustness of LLaVA-v1.5-7B and Qwen2.5-VL-7B under the same attacks.
Findings
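LLaVA-v1.5-7B is vulnerable to all three gradient-based attacks, with attack success rates of 52.6% (BIM), 53.8% (PGD), and 66.9% (CLIP-based spectral attack).
Qwen2.5-VL-7B is considerably more robust, with success rates of 6.5%, 7.7%, and 15.5% under the same three attacks.
The gap suggests that architectural differences between open-source VLM families affect adversarial resilience.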
Abstract
We study adversarial robustness of open-source vision-language model (VLM) agents deployed in a self-contained e-commerce environment built to simulate realistic pre-deployment conditions. We evaluate two agents, LLaVA-v1.5-7B and Qwen2.5-VL-7B, under three gradient-based attacks: the Basic Iterative Method (BIM), Projected Gradient Descent (PGD), and a CLIP-based spectral attack. Against LLaVA, all three attacks achieve substantial attack success rates (52.6%, 53.8%, and 66.9% respectively), demonstrating that simple gradient-based methods pose a practical threat to open-source VLM agents. Qwen2.5-VL proves significantly more robust across all attacks (6.5%, 7.7%, and 15.5%), suggesting meaningful architectural differences in adversarial resilience between open-source VLM families. These findings have direct implications for the security evaluation of VLM agents prior to commercial…
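To make the two simplest attacks named in the abstract concrete, here is a minimal sketch of an L-infinity iterative attack on a toy logistic model. It is not the paper's implementation: the model, the helper names (`loss_and_grad`, `linf_attack`), and the values of `eps`, `alpha`, and `steps` are all assumptions chosen for illustration. With `random_start=False` the loop corresponds to the usual description of BIM (repeated signed-gradient steps with clipping); with `random_start=True` it corresponds to the common PGD variant that starts from a random point inside the perturbation ball.

```python
import numpy as np

def loss_and_grad(x, w, b, y):
    """Cross-entropy loss of a toy logistic model and its gradient w.r.t. the input x."""
    z = float(w @ x + b)
    p = 1.0 / (1.0 + np.exp(-z))
    loss = -(y * np.log(p + 1e-12) + (1 - y) * np.log(1 - p + 1e-12))
    grad_x = (p - y) * w  # d(loss)/dx for a logistic model
    return loss, grad_x

def linf_attack(x0, w, b, y, eps=0.1, alpha=0.02, steps=20,
                random_start=False, seed=0):
    """Iterative signed-gradient attack constrained to an L-infinity ball of radius eps.

    random_start=False: BIM-style. random_start=True: PGD-style.
    All hyperparameters are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)
    x = x0.copy()
    if random_start:
        x = np.clip(x0 + rng.uniform(-eps, eps, size=x0.shape), 0.0, 1.0)
    for _ in range(steps):
        _, g = loss_and_grad(x, w, b, y)
        x = x + alpha * np.sign(g)          # step in the direction that increases the loss
        x = np.clip(x, x0 - eps, x0 + eps)  # project back into the eps-ball around x0
        x = np.clip(x, 0.0, 1.0)            # keep values in a valid pixel range
    return x

# Toy usage: a 16-"pixel" input with true label y = 1.
rng = np.random.default_rng(42)
w = rng.normal(size=16)
b = 0.0
x0 = rng.uniform(0.2, 0.8, size=16)
y = 1

clean_loss, _ = loss_and_grad(x0, w, b, y)
x_bim = linf_attack(x0, w, b, y, random_start=False)
x_pgd = linf_attack(x0, w, b, y, random_start=True)
bim_loss, _ = loss_and_grad(x_bim, w, b, y)
pgd_loss, _ = loss_and_grad(x_pgd, w, b, y)
print(f"clean loss {clean_loss:.3f}, BIM loss {bim_loss:.3f}, PGD loss {pgd_loss:.3f}")
```

In a VLM setting the gradient would come from backpropagating a task loss through the vision encoder and language model rather than from a closed-form expression, but the step, project, and clip structure of the loop is the same.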
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI · Multimodal Machine Learning Applications
