Frequency-Domain Regularized Adversarial Alignment for Transferable Attacks against Closed-Source MLLMs

Leitao Yuan; Qinghua Mao; Daizong Liu; Kun Wang; Wenjie Wang; Yan Teng; Jing Shao; Dongrui Liu

arXiv:2605.21541·cs.CR·May 22, 2026

TL;DR

This paper introduces FRA-Attack, a frequency-domain regularization method that enhances transfer-based adversarial attacks on multimodal large language models by focusing on intrinsic visual features and model-agnostic gradient modulation.

Contribution

FRA-Attack employs a unified frequency-domain approach with high-pass and low-pass regularizations to improve attack transferability across diverse MLLMs.

Findings

01

FRA-Attack outperforms existing methods on 15 MLLMs.

02

FRA-Attack achieves state-of-the-art transferability on GPT-5.4, Claude-Opus-4.6, and Gemini-3-flash.

03

Frequency-domain regularization effectively captures intrinsic visual focus.

Abstract

Multimodal large language models (MLLMs) remain vulnerable to transfer-based targeted attacks, where perturbations optimized on open-source surrogate encoders can generalize to closed-source MLLMs. A key challenge for improving adversarial transferability is to effectively capture the intrinsic visual focus shared across different models, such that perturbations align with transferable semantic cues rather than surrogate-specific behaviors. However, existing methods suffer from spatial-domain feature redundancy and surrogate-specific gradient signals, thereby hindering cross-model transferability. In this paper, we propose FRA-Attack, which addresses both challenges from a unified frequency-domain regularization perspective. For feature alignment, a high-pass DCT objective on patch features suppresses redundant global structures and concentrates the loss on the high-frequency band that…
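
To make the feature-alignment idea concrete, here is a minimal sketch of what a high-pass DCT objective on patch features could look like. It is not the authors' implementation. The abstract does not say which axis the DCT is taken over, how the cutoff is chosen, or which distance is used, so the sketch assumes a 1-D orthonormal DCT-II along the patch axis, a hypothetical integer `cutoff`, and a mean squared error. The names `dct_matrix` and `high_pass_dct_loss` are illustrative only.

```python
import numpy as np


def dct_matrix(n):
    """Return the n x n orthonormal DCT-II matrix (rows are frequencies)."""
    k = np.arange(n)[:, None]  # frequency index
    i = np.arange(n)[None, :]  # sample index
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale
    return m


def high_pass_dct_loss(src, tgt, cutoff):
    """Mean squared error between src and tgt restricted to high DCT bands.

    src, tgt: arrays of shape (num_patches, dim) holding patch features.
    cutoff:   hypothetical parameter; DCT coefficients with index below
              `cutoff` (the lowest frequencies along the patch axis) are
              dropped before the error is computed.
    """
    n = src.shape[0]
    # The DCT is linear, so transforming the difference is equivalent to
    # transforming both inputs and subtracting.
    diff = dct_matrix(n) @ (src - tgt)
    keep = np.arange(n) >= cutoff  # high-pass mask over frequencies
    return float(np.mean(diff[keep] ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    adv_feats = rng.normal(size=(16, 8))     # stand-in for surrogate patch features
    target_feats = rng.normal(size=(16, 8))  # stand-in for target patch features
    print(high_pass_dct_loss(adv_feats, target_feats, cutoff=4))
```

Under these assumptions, a difference that is constant across all patches lands entirely in the DC coefficient, so any `cutoff` of 1 or more ignores it. That is one way to read "suppresses redundant global structures".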
