TL;DR
This paper introduces a novel adversarial attack method called STM that uses style transfer to generate diverse, cross-domain images, significantly enhancing the transferability of adversarial examples across different neural network models.
Contribution
The paper proposes a new attack technique that improves adversarial transferability by augmenting inputs with style-transferred images from different domains while preserving their semantic content.
Findings
STM outperforms state-of-the-art input transformation attacks on ImageNet datasets.
The method enhances transferability on both normally trained and adversarially trained models.
Using style transfer increases the diversity and effectiveness of adversarial examples.
Abstract
Deep neural networks are vulnerable to adversarial examples crafted by applying human-imperceptible perturbations on clean inputs. Although many attack methods can achieve high success rates in the white-box setting, they also exhibit weak transferability in the black-box setting. Recently, various methods have been proposed to improve adversarial transferability, in which the input transformation is one of the most effective methods. In this work, we notice that existing input transformation-based works mainly adopt the transformed data in the same domain for augmentation. Inspired by domain generalization, we aim to further improve the transferability using the data augmented from different domains. Specifically, a style transfer network can alter the distribution of low-level visual features in an image while preserving semantic content for humans. Hence, we propose a novel attack…
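The general idea of an input transformation-based attack that augments with restyled inputs can be illustrated with a small sketch. This is not the authors' implementation. It assumes a toy linear classifier in place of a deep network, and it replaces the style transfer network with a hypothetical stand-in, `restyle`, that randomly rescales and shifts each channel to alter low-level statistics. The names `loss_grad`, `restyle`, and `attack`, and the values of `eps`, `steps`, and `n_styles`, are illustrative assumptions rather than values from the paper.

```python
import math
import random


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def loss_grad(W, x, label):
    """Gradient of the cross-entropy loss w.r.t. x for a toy model logits = W x."""
    logits = [sum(w * v for w, v in zip(row, x)) for row in W]
    p = softmax(logits)
    p[label] -= 1.0  # dL/dlogits = softmax - one_hot
    return [sum(p[k] * W[k][i] for k in range(len(W))) for i in range(len(x))]


def restyle(x, channels, rng):
    """Hypothetical stand-in for style transfer: random per-channel scale and shift."""
    size = len(x) // channels
    scales = [rng.uniform(0.5, 1.5) for _ in range(channels)]
    shifts = [rng.uniform(-0.2, 0.2) for _ in range(channels)]
    styled = [scales[i // size] * v + shifts[i // size] for i, v in enumerate(x)]
    return styled, scales


def attack(W, x, label, channels=3, eps=0.1, steps=10, n_styles=5, seed=0):
    """Iterative sign-gradient attack that averages gradients over restyled inputs."""
    rng = random.Random(seed)
    alpha = eps / steps
    size = len(x) // channels
    adv = list(x)
    for _ in range(steps):
        avg = [0.0] * len(x)
        for _ in range(n_styles):
            styled, scales = restyle(adv, channels, rng)
            g = loss_grad(W, styled, label)
            for i in range(len(x)):
                # Chain rule through the affine restyling: d(styled)/d(adv) = scale.
                avg[i] += scales[i // size] * g[i] / n_styles
        # Ascend the loss along the sign of the averaged gradient.
        adv = [a + alpha * ((g > 0) - (g < 0)) for a, g in zip(adv, avg)]
        # Project back into the eps-ball around x, then into the valid range [0, 1].
        adv = [min(max(a, o - eps), o + eps) for a, o in zip(adv, x)]
        adv = [min(max(a, 0.0), 1.0) for a in adv]
    return adv
```

Averaging the gradient over several restyled copies is meant to keep the perturbation from overfitting to the low-level statistics that a single surrogate model sees. In a realistic setting the affine stand-in would be replaced by an actual style transfer network and the linear model by a trained classifier.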
