PatchGuard++: Efficient Provable Attack Detection against Adversarial Patches
Chong Xiang, Prateek Mittal

TL;DR
PatchGuard++ extends PatchGuard to provably detect adversarial patch attacks: it evaluates predictions on masked feature maps, improving both provable robust accuracy and clean accuracy on several datasets.
Contribution
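The paper extends PatchGuard from robust prediction to provable attack detection. It applies masks in the feature space, evaluates predictions on the masked feature maps, and extracts a pattern from those predictions, raising both provable robust accuracy and clean accuracy against localized patch attacks.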
Findings
Improves provable robustness on ImageNet, ImageNette, and CIFAR-10.
Enhances clean accuracy while maintaining attack detection capabilities.
Demonstrates significant robustness gains over prior defenses.
Abstract
An adversarial patch can arbitrarily manipulate image pixels within a restricted region to induce model misclassification. The threat of this localized attack has gained significant attention because the adversary can mount a physically-realizable attack by attaching patches to the victim object. Recent provably robust defenses generally follow the PatchGuard framework by using CNNs with small receptive fields and secure feature aggregation for robust model predictions. In this paper, we extend PatchGuard to PatchGuard++ for provably detecting the adversarial patch attack to boost both provable robust accuracy and clean accuracy. In PatchGuard++, we first use a CNN with small receptive fields for feature extraction so that the number of features corrupted by the adversarial patch is bounded. Next, we apply masks in the feature space and evaluate predictions on all possible masked…
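The masking step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the feature map is assumed to hold per-location class logits, aggregation is assumed to be a plain sum, and the decision rule (flag an attack whenever the masked predictions disagree) is an assumption, since the truncated abstract does not spell it out. The function names `corrupted_window`, `masked_predictions`, and `detect` are hypothetical, and the window bound is the usual receptive-field counting argument rather than a formula quoted from this page.

```python
import math
import numpy as np


def corrupted_window(patch_px: int, rf_px: int, stride_px: int) -> int:
    """Assumed bound on how many features along one axis a patch can touch.

    A patch of patch_px pixels overlaps at most ceil((patch_px + rf_px - 1) / stride_px)
    receptive fields of size rf_px spaced stride_px apart.
    """
    return math.ceil((patch_px + rf_px - 1) / stride_px)


def masked_predictions(features: np.ndarray, window: int) -> np.ndarray:
    """Return the predicted class for every position of a window x window mask.

    features: array of shape (H, W, num_classes) with local class logits (assumed).
    """
    h, w, _ = features.shape
    if window >= min(h, w):
        raise ValueError("mask window must be smaller than the feature map")
    total = features.sum(axis=(0, 1))  # class evidence with nothing masked
    preds = []
    for i in range(h - window + 1):
        for j in range(w - window + 1):
            # Remove the evidence under the mask, then classify on the rest.
            masked = features[i:i + window, j:j + window].sum(axis=(0, 1))
            preds.append(int(np.argmax(total - masked)))
    return np.array(preds)


def detect(features: np.ndarray, window: int):
    """Return (label, attack_flag). Assumed rule: any disagreement raises the flag."""
    preds = masked_predictions(features, window)
    if np.all(preds == preds[0]):
        return int(preds[0]), False
    return None, True


# Benign toy input: every location weakly votes for class 2.
benign = np.zeros((6, 6, 3))
benign[..., 2] = 1.0

# Attacked toy input: a 2x2 block of features strongly votes for class 0.
attacked = benign.copy()
attacked[0:2, 0:2, 0] = 100.0

benign_result = detect(benign, window=2)
attacked_result = detect(attacked, window=2)
```

In the toy attack, the mask that covers the corrupted block restores the prediction of class 2, while every other mask position still predicts class 0, so the masked predictions disagree and the input is flagged. The intuition behind provable detection is that, when the mask is at least as large as the bounded corrupted region, one mask position always removes the patch's influence entirely.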
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Security and Verification in Computing
