Random Position Adversarial Patch for Vision Transformers
Mingzhen Shao

TL;DR
This paper introduces G-Patch, an adversarial patch produced by a GAN-like generator that can launch targeted attacks on vision transformers from any position in the field of view. The authors report that it is effective in both digital and physical settings.
Contribution
The paper proposes a GAN-like method for generating position-agnostic adversarial patches for vision transformers, removing the requirement in earlier work that the attack patch align exactly with the patches the model uses for linear projection.
Findings
Effective universal attacks on vision transformers in digital scenarios
Robust physical-world attack performance under various conditions
Patch exhibits resistance to brightness, color, and noise variations
Abstract
Previous studies have shown the vulnerability of vision transformers to adversarial patches, but these studies all rely on a critical assumption: the attack patches must be perfectly aligned with the patches used for linear projection in vision transformers. This stringent requirement makes adversarial patches for vision transformers impractical to deploy in the physical world, even though such patches are effective against CNNs. This paper proposes a novel method for generating an adversarial patch (G-Patch) that overcomes the alignment constraint, allowing the patch to launch a targeted attack at any position within the field of view. Specifically, instead of directly optimizing the patch using gradients, we employ a GAN-like structure to generate the adversarial patch. Our experiments show the effectiveness of the adversarial patch in achieving universal attacks on vision transformers, both…
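To make the alignment constraint concrete, the following is a minimal sketch, not the paper's method. It pastes a patch at a uniformly random position and checks whether that position coincides with the grid a vision transformer uses to split its input. The sizes (224-pixel image, 16-pixel grid cells, 32-pixel adversarial patch) and all function names are assumptions chosen for illustration; the abstract does not state them.

```python
import random

# Assumed sizes for illustration only; the abstract does not specify them.
IMAGE_SIZE = 224   # hypothetical input resolution
VIT_PATCH = 16     # hypothetical size of the patches used for linear projection
ADV_PATCH = 32     # hypothetical side length of the adversarial patch


def sample_patch_position(image_size, patch_size, rng):
    """Pick a uniformly random top-left corner so the patch fits inside the image."""
    max_offset = image_size - patch_size
    return rng.randint(0, max_offset), rng.randint(0, max_offset)


def is_grid_aligned(top, left, vit_patch=VIT_PATCH):
    """True if the corner falls exactly on the transformer's patch grid."""
    return top % vit_patch == 0 and left % vit_patch == 0


def tokens_touched(top, left, patch_size, vit_patch=VIT_PATCH):
    """Count the grid cells (input tokens) that the pasted patch overlaps."""
    rows = (top + patch_size - 1) // vit_patch - top // vit_patch + 1
    cols = (left + patch_size - 1) // vit_patch - left // vit_patch + 1
    return rows * cols


def aligned_fraction(image_size, patch_size, vit_patch=VIT_PATCH):
    """Fraction of valid top-left corners that are grid-aligned."""
    max_offset = image_size - patch_size
    aligned_per_axis = max_offset // vit_patch + 1
    return (aligned_per_axis / (max_offset + 1)) ** 2


def paste_patch(image, patch, top, left):
    """Return a copy of a 2-D image (list of rows) with the patch pasted in."""
    out = [row[:] for row in image]
    for i, patch_row in enumerate(patch):
        out[top + i][left:left + len(patch_row)] = patch_row
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    top, left = sample_patch_position(IMAGE_SIZE, ADV_PATCH, rng)
    print("position:", (top, left), "aligned:", is_grid_aligned(top, left))
    print("tokens touched:", tokens_touched(top, left, ADV_PATCH))
    print("aligned fraction:", aligned_fraction(IMAGE_SIZE, ADV_PATCH))
```

Under these assumed sizes, only 169 of the 37,249 valid positions are grid-aligned, and an unaligned patch spreads over up to nine tokens instead of four. This illustrates why an attack that only works when aligned is hard to deploy physically, where the patch position cannot be controlled to the pixel.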
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Physical Unclonable Functions (PUFs) and Hardware Security · Digital Media Forensic Detection
