SGHA-Attack: Semantic-Guided Hierarchical Alignment for Transferable Targeted Attacks on Vision-Language Models

Haobo Wang; Weiqi Luo; Xiaojun Jia; Xiaochun Cao

arXiv:2602.01574·cs.CV·February 3, 2026

TL;DR

SGHA-Attack is a hierarchical, semantic-guided approach to targeted adversarial attacks on vision-language models. It uses multiple target references and intermediate-layer alignment to improve transferability across models and robustness to preprocessing and purification defenses.

Contribution

The paper proposes a hierarchical alignment framework that uses multiple target references and intermediate-layer features to improve the transferability of targeted attacks on VLMs.

Findings

01. Achieves stronger targeted transferability than prior methods.

02. Remains robust under preprocessing and purification defenses.

03. Utilizes a semantic-guided hierarchical alignment strategy.

Abstract

Large vision-language models (VLMs) are vulnerable to transfer-based adversarial perturbations, which let attackers optimize on surrogate models and manipulate the outputs of black-box VLMs. Prior targeted transfer attacks often overfit to the surrogate-specific embedding space by relying on a single reference and emphasizing final-layer alignment, which underutilizes intermediate semantics and degrades transfer across heterogeneous VLMs. To address this, we propose SGHA-Attack, a Semantic-Guided Hierarchical Alignment framework that adopts multiple target references and enforces intermediate-layer consistency. Concretely, we generate a visually grounded reference pool by sampling a frozen text-to-image model conditioned on the target prompt, and then select the Top-K most semantically relevant anchors under the surrogate to form a weighted mixture for stable optimization guidance.…
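The Top-K anchor selection described in the abstract can be illustrated with a small sketch. The abstract does not say how relevance is scored or how the mixture weights are computed, so everything specific here is an assumption: relevance is taken to be cosine similarity between each reference embedding and the target-prompt embedding, and weights are a softmax over the selected scores. The names `select_anchors`, `ref_embs`, `text_emb`, and `temperature` are hypothetical, and the embeddings are plain lists standing in for surrogate-encoder outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def select_anchors(ref_embs, text_emb, k=3, temperature=0.1):
    """Pick the k references most similar to text_emb and mix them.

    Assumptions (not taken from the paper): relevance is cosine
    similarity, and mixture weights are a temperature-scaled softmax
    over the selected scores.
    Returns (indices, weights, mixture_embedding).
    """
    scores = [cosine(e, text_emb) for e in ref_embs]
    # Indices of the k highest-scoring references, best first.
    top = sorted(range(len(ref_embs)), key=lambda i: scores[i], reverse=True)[:k]
    # Numerically stable softmax over the selected scores.
    best = max(scores[i] for i in top)
    exps = [math.exp((scores[i] - best) / temperature) for i in top]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted mixture of the selected reference embeddings.
    dim = len(text_emb)
    mixture = [
        sum(w * ref_embs[i][d] for w, i in zip(weights, top))
        for d in range(dim)
    ]
    return top, weights, mixture


# Toy 2-D embeddings standing in for surrogate image features.
refs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]]
target = [1.0, 0.1]
indices, weights, mixture = select_anchors(refs, target, k=2)
```

In this sketch a lower `temperature` concentrates the weight on the single best-matching reference, while a higher one spreads it more evenly across the K anchors.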

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis