Robust width: A lightweight and certifiable adversarial defense
Jonathan Peck, Bart Goossens

TL;DR
This paper introduces a lightweight, certifiable adversarial defense based on the robust width property, providing theoretical guarantees and strong empirical performance against adversarial attacks on ImageNet.
Contribution
It proposes a novel defense method using the robust width property that requires no additional training and offers theoretical robustness guarantees.
Findings
Outperforms state-of-the-art defenses against black-box attacks at large perturbation budgets
Matches state-of-the-art white-box robustness without additional data or training
Effective against $L^\infty$ perturbations on ImageNet
Abstract
Deep neural networks are vulnerable to so-called adversarial examples: inputs which are intentionally constructed to cause the model to make incorrect predictions or classifications. Adversarial examples are often visually indistinguishable from natural data samples, making them hard to detect. As such, they pose significant threats to the reliability of deep learning systems. In this work, we study an adversarial defense based on the robust width property (RWP), which was recently introduced for compressed sensing. We show that a specific input purification scheme based on the RWP gives theoretical robustness guarantees for images that are approximately sparse. The defense is easy to implement and can be applied to any existing model without additional training or finetuning. We empirically validate the defense on ImageNet against perturbations at perturbation budgets…
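The abstract describes a purify-then-classify pipeline: clean the input using a sparsity assumption, then pass it to an unmodified model. A minimal sketch of that general idea follows. It is not the authors' RWP-based scheme, which this summary does not specify. As a stand-in it keeps only the largest coefficients of an orthonormal 2D DCT (hard thresholding). The names `dct_matrix`, `purify`, and `defended_predict`, the parameter `keep`, the toy image, and the budget `eps` are all hypothetical choices for illustration.

```python
import numpy as np


def dct_matrix(n):
    """Return the orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] = np.sqrt(1.0 / n)  # rescale the constant row for orthonormality
    return d


def purify(image, keep):
    """Keep the `keep` largest-magnitude 2D DCT coefficients of a square image.

    Stand-in purifier (assumption): hard thresholding in a DCT basis.
    Ties at the threshold may retain slightly more than `keep` coefficients.
    """
    n = image.shape[0]
    d = dct_matrix(n)
    coeffs = d @ image @ d.T  # forward 2D DCT
    magnitudes = np.abs(coeffs).ravel()
    if keep < magnitudes.size:
        threshold = np.partition(magnitudes, -keep)[-keep]  # keep-th largest
        coeffs = np.where(np.abs(coeffs) >= threshold, coeffs, 0.0)
    return d.T @ coeffs @ d  # inverse 2D DCT


def defended_predict(model, image, keep):
    """Purify the input, then call an unmodified model (any callable)."""
    return model(purify(image, keep))


# Toy demonstration on an exactly sparse 16 x 16 image.
rng = np.random.default_rng(0)
n, keep, eps = 16, 5, 0.03  # eps: illustrative L-infinity budget, not from the paper
d = dct_matrix(n)
coeffs = np.zeros((n, n))
coeffs.flat[rng.choice(n * n, size=keep, replace=False)] = 5.0
clean = d.T @ coeffs @ d

# Worst-case-magnitude perturbation: every pixel moves by exactly +/- eps.
perturbed = clean + eps * rng.choice([-1.0, 1.0], size=(n, n))
purified = purify(perturbed, keep)

err_before = np.linalg.norm(perturbed - clean)
err_after = np.linalg.norm(purified - clean)
print(f"L2 error before purification: {err_before:.4f}")
print(f"L2 error after purification:  {err_after:.4f}")
```

In this toy setting the clean coefficients are much larger than any coefficient of the perturbation, so thresholding recovers the correct support and discards most of the perturbation energy. Real images are only approximately sparse, so the benefit there depends on how much signal falls outside the retained coefficients.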
Taxonomy
Topics: Physical Unclonable Functions (PUFs) and Hardware Security
Methods: Balanced Selection
