Filtered-ViT: A Robust Defense Against Multiple Adversarial Patch Attacks
Aja Khanal, Ahmed Faid, Apurva Narayan

TL;DR
Filtered-ViT is a vision transformer architecture that defends against multiple simultaneous adversarial patches as well as natural image artifacts, targeting robustness in safety-critical applications such as healthcare.
Contribution
Introduces Filtered-ViT with SMART-VMF, a spatially adaptive filtering mechanism, providing unified robustness against synthetic adversarial patches and real-world artifacts.
Findings
Achieves 79.8% clean accuracy and 46.3% robust accuracy on ImageNet under LaVAN multi-patch attacks (four simultaneous 1% patches).
Outperforms existing defenses in multi-patch robustness scenarios.
Effectively mitigates natural artifacts in medical imaging without loss of diagnostic detail.
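For a sense of scale, the reported attack setting uses four simultaneous patches that each cover 1% of the image. The short calculation below converts that budget into pixels. The 224 x 224 input resolution and the square patch shape are assumptions for illustration (a common ImageNet setup), not values taken from this summary, and `patch_side_pixels` is a hypothetical helper name.

```python
import math


def patch_side_pixels(image_side: int, area_fraction: float) -> int:
    """Side length in pixels (rounded down) of a square patch that covers
    `area_fraction` of a square image with side `image_side`."""
    return math.floor(image_side * math.sqrt(area_fraction))


# Assumed input resolution: 224 x 224 (not stated in the source).
IMAGE_SIDE = 224
PATCH_FRACTION = 0.01   # each patch covers 1% of the image area
NUM_PATCHES = 4         # four simultaneous patches

side = patch_side_pixels(IMAGE_SIDE, PATCH_FRACTION)
total_fraction = NUM_PATCHES * PATCH_FRACTION

print(f"each patch is roughly {side} x {side} pixels")
print(f"total corrupted area: {total_fraction:.0%} of the image")
```

Under these assumptions each patch is small relative to the image, which is why defenses tuned to find a single corrupted region can miss several scattered ones.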
Abstract
Deep learning vision systems are increasingly deployed in safety-critical domains such as healthcare, yet they remain vulnerable to small adversarial patches that can trigger misclassifications. Most existing defenses assume a single patch and fail when multiple localized disruptions occur, the type of scenario adversaries and real-world artifacts often exploit. We propose Filtered-ViT, a new vision transformer architecture that integrates SMART Vector Median Filtering (SMART-VMF), a spatially adaptive, multi-scale, robustness-aware mechanism that enables selective suppression of corrupted regions while preserving semantic detail. On ImageNet with LaVAN multi-patch attacks, Filtered-ViT achieves 79.8% clean accuracy and 46.3% robust accuracy under four simultaneous 1% patches, outperforming existing defenses. Beyond synthetic benchmarks, a real-world case study on radiographic medical…
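SMART-VMF builds on vector median filtering. As background only, here is a minimal NumPy sketch of the classical vector median filter: each pixel is replaced by the colour vector in its window that has the smallest summed distance to the others. This is not the authors' SMART-VMF. The spatially adaptive, multi-scale, and robustness-aware components are not described in this summary and are not reproduced here. The function name, the 3 x 3 default window, the L2 distance, and the edge padding are all assumptions for illustration.

```python
import numpy as np


def vector_median_filter(image: np.ndarray, window: int = 3) -> np.ndarray:
    """Classical vector median filter for an H x W x C image.

    Illustrative baseline only; it does not implement SMART-VMF.
    """
    if window % 2 != 1:
        raise ValueError("window must be odd")
    pad = window // 2
    img = image.astype(np.float64)
    # Replicate border pixels so every window is full-sized.
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    h, w, c = img.shape
    out = np.empty_like(img)
    for y in range(h):
        for x in range(w):
            # All colour vectors in the window: shape (window * window, C).
            vectors = padded[y:y + window, x:x + window].reshape(-1, c)
            # Pairwise L2 distances between the vectors in the window.
            diffs = vectors[:, None, :] - vectors[None, :, :]
            dists = np.sqrt((diffs ** 2).sum(axis=-1))
            # The vector median minimises the summed distance to the others.
            out[y, x] = vectors[dists.sum(axis=1).argmin()]
    return out.astype(image.dtype)


# Usage: a flat grey image with one corrupted pixel.
img = np.full((5, 5, 3), 0.5)
img[2, 2] = [1.0, 0.0, 0.0]
filtered = vector_median_filter(img, window=3)
```

Because the output at each position is always one of the input vectors from the window, the filter suppresses isolated outliers without inventing new colours. A fixed window applied uniformly also blurs fine detail, which is presumably the limitation that a spatially adaptive variant is meant to address.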
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Advanced Image Processing Techniques
