Scaling Laws for Black box Adversarial Attacks

Chuan Liu; Huanran Chen; Yichi Zhang; Jun Zhu; Yinpeng Dong

arXiv:2411.16782·cs.LG·December 19, 2025

TL;DR

This paper uncovers a universal log-linear scaling law for black-box adversarial attack success rates, demonstrating that increasing ensemble size significantly enhances attack effectiveness across various models and defenses.

Contribution

The paper presents the first large-scale empirical study of ensemble-based black-box attacks and reveals a scaling law for them, supported by theoretical analysis and extensive experiments.

Findings

01. Attack success rate scales linearly with the logarithm of the ensemble size.

02. Scaling improves transferability across classifiers, defenses, and MLLMs.

03. The attack achieves over 80% success on proprietary models such as GPT-4o.
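As a rough illustration of what a log-linear law means in practice, the sketch below fits ASR ≈ a + b·ln(T) by least squares and extrapolates to a larger ensemble. The data points, the coefficients `true_a` and `true_b`, and the helper names `fit_log_linear` and `predict_asr` are hypothetical placeholders chosen for the example; they are not measurements or code from the paper.

```python
import numpy as np


def fit_log_linear(ensemble_sizes, success_rates):
    """Least-squares fit of ASR ~ a + b * ln(T); returns (a, b)."""
    log_t = np.log(np.asarray(ensemble_sizes, dtype=float))
    asr = np.asarray(success_rates, dtype=float)
    # polyfit returns coefficients from highest degree to lowest.
    b, a = np.polyfit(log_t, asr, deg=1)
    return a, b


def predict_asr(a, b, ensemble_size):
    """Evaluate the fitted law, clipped to [0, 1] because ASR is a rate."""
    return float(np.clip(a + b * np.log(ensemble_size), 0.0, 1.0))


# Synthetic, noise-free placeholder data (NOT results from the paper).
true_a, true_b = 0.20, 0.10
sizes = [1, 2, 4, 8, 16, 32, 64]
rates = [true_a + true_b * np.log(t) for t in sizes]

a, b = fit_log_linear(sizes, rates)
predicted_128 = predict_asr(a, b, 128)
print(f"a={a:.3f}, b={b:.3f}, predicted ASR at T=128: {predicted_128:.3f}")
```

Under such a law, each doubling of T adds the same increment b·ln(2) to the success rate. Since a success rate cannot exceed 1, a fit of this kind can only describe a range of ensemble sizes below saturation, which is why the prediction is clipped.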

Abstract

Adversarial examples exhibit cross-model transferability, enabling threatening black-box attacks on commercial models. Model ensembling, which attacks multiple surrogate models simultaneously, is a known strategy for improving this transferability. However, prior studies typically use small, fixed ensembles, leaving open the question of whether scaling the number of surrogate models can further improve black-box attacks. In this work, we conduct the first large-scale empirical study of this question. By resolving gradient conflict with advanced optimizers, we discover a robust and universal log-linear scaling law through both theoretical analysis and empirical evaluation: the Attack Success Rate (ASR) scales linearly with the logarithm of the ensemble size $T$. We rigorously verify this law across standard classifiers, SOTA defenses, and MLLMs, and find that scaling distills…

Taxonomy

Topics: Cryptographic Implementations and Security · Physical Unclonable Functions (PUFs) and Hardware Security · Adversarial Robustness in Machine Learning

Methods: ADaptive gradient method with the OPTimal convergence rate