Crafting Adversarial Inputs for Large Vision-Language Models Using Black-Box Optimization

Jiwei Guan; Haibo Jin; Haohan Wang

arXiv:2601.01747·cs.CR·January 23, 2026

TL;DR

This paper introduces a black-box attack that uses zeroth-order optimization to craft adversarial inputs for large vision-language models, exposing vulnerabilities using only input-output queries, with no access to model internals.

Contribution

The paper proposes ZO-SPSA, a gradient-free, model-agnostic black-box attack on LVLMs that lowers computational cost and improves adversarial transferability.

Findings

1. Achieved up to an 83.0% success rate on InstructBLIP

2. Generated adversarial examples with imperceptible perturbations

3. Demonstrated strong transferability of attacks across models

Abstract

Recent advancements in Large Vision-Language Models (LVLMs) have shown groundbreaking capabilities across diverse multimodal tasks. However, these models remain vulnerable to adversarial jailbreak attacks, in which adversaries craft subtle perturbations to bypass safety mechanisms and trigger harmful outputs. Existing white-box attack methods require full access to the model, incur high computational costs, and exhibit insufficient adversarial transferability, making them impractical for real-world, black-box settings. To address these limitations, we propose a black-box jailbreak attack on LVLMs via Zeroth-Order optimization using Simultaneous Perturbation Stochastic Approximation (ZO-SPSA). ZO-SPSA provides three key advantages: (i) gradient-free approximation through input-output interactions, without requiring model knowledge, (ii) model-agnostic optimization without a surrogate model and…
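The gradient-free approximation named in the abstract rests on the general SPSA idea: estimate a gradient from two loss evaluations along a single random direction. The sketch below illustrates that generic estimator only; it is not the authors' implementation. The function names (`spsa_gradient`, `spsa_minimize`, `toy_loss`), the step sizes, and the quadratic toy objective are all assumptions made for illustration, and a real attack would replace `toy_loss` with a score computed from the target model's outputs.

```python
import numpy as np


def spsa_gradient(loss_fn, x, c=1e-2, rng=None):
    """Estimate the gradient of loss_fn at x from two queries (generic SPSA)."""
    rng = np.random.default_rng() if rng is None else rng
    # Random Rademacher direction: every coordinate is +1 or -1.
    delta = rng.choice([-1.0, 1.0], size=x.shape)
    # Two black-box evaluations, one on each side of x along delta.
    loss_plus = loss_fn(x + c * delta)
    loss_minus = loss_fn(x - c * delta)
    # Because each delta_i is +1 or -1, dividing by delta_i equals multiplying by it.
    return (loss_plus - loss_minus) / (2.0 * c) * delta


def spsa_minimize(loss_fn, x0, steps=500, lr=0.05, c=1e-2, seed=0):
    """Plain descent driven only by SPSA estimates (hypothetical settings)."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = x - lr * spsa_gradient(loss_fn, x, c=c, rng=rng)
    return x


def toy_loss(x):
    """Stand-in objective (assumption): squared distance to the all-ones vector."""
    return float(np.sum((x - 1.0) ** 2))


if __name__ == "__main__":
    x_final = spsa_minimize(toy_loss, np.zeros(4))
    print("final loss:", toy_loss(x_final))
```

Each iteration costs two queries regardless of the input dimension, which is why estimators of this kind suit settings where only input-output access is available.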

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis