TL;DR
ColorFool is a novel black-box adversarial attack that modifies image colors based on semantics, producing natural-looking perturbations that outperform existing methods in success rate, robustness, and transferability.
Contribution
The paper introduces a content-based color modification approach that exploits image semantics to mount effective black-box adversarial attacks.
Findings
Outperforms five state-of-the-art attacks in success rate.
Demonstrates robustness to defenses and transferability to unseen classifiers.
Effective on scene and object classification tasks.
Abstract
Adversarial attacks that generate small L_p-norm perturbations to mislead classifiers have limited success in black-box settings and with unseen classifiers. These attacks are also not robust to defenses that use denoising filters and to adversarial training procedures. Instead, adversarial attacks that generate unrestricted perturbations are more robust to defenses, are generally more successful in black-box settings and are more transferable to unseen classifiers. However, unrestricted perturbations may be noticeable to humans. In this paper, we propose a content-based black-box adversarial attack that generates unrestricted perturbations by exploiting image semantics to selectively modify colors within chosen ranges that are perceived as natural by humans. We show that the proposed approach, ColorFool, outperforms in terms of success rate, robustness to defense frameworks and…
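To make the idea in the abstract concrete, here is a minimal Python sketch of a region-aware color attack. It is not the authors' implementation. It assumes a per-pixel region map is already available (for example, from a semantic segmentation model), uses HSV hue shifts from the standard-library colorsys module as a stand-in for the color modification, and treats the classifier as a black box that returns only a label. The region names, the ranges in HUE_SHIFT_RANGES, and the functions recolor, color_attack, and toy_classify are hypothetical and chosen only for illustration.

```python
import colorsys
import random

# Hypothetical per-region hue-shift ranges, as fractions of the hue circle.
# These values are illustrative assumptions, not taken from the paper.
HUE_SHIFT_RANGES = {
    "sky": (-0.02, 0.02),         # sensitive region: keep the color nearly fixed
    "vegetation": (-0.05, 0.05),  # sensitive region: allow a small drift
    "other": (-0.5, 0.5),         # non-sensitive region: any hue is allowed
}


def recolor(image, regions, shifts):
    """Shift the hue of each pixel by the amount assigned to its region.

    image   : list of rows of (r, g, b) tuples with values in [0, 1]
    regions : list of rows of region names, same shape as image
    shifts  : dict mapping region name to a hue shift
    """
    out = []
    for row, region_row in zip(image, regions):
        new_row = []
        for (r, g, b), region in zip(row, region_row):
            h, s, v = colorsys.rgb_to_hsv(r, g, b)
            h = (h + shifts[region]) % 1.0  # hue is circular
            new_row.append(colorsys.hsv_to_rgb(h, s, v))
        out.append(new_row)
    return out


def color_attack(image, regions, classify, max_trials=100, seed=0):
    """Randomly recolor regions until the black-box label changes.

    Returns (adversarial_image, trials_used), or (None, max_trials) if no
    recoloring within the allowed ranges changed the predicted label.
    """
    rng = random.Random(seed)
    original_label = classify(image)  # only the label is observed
    for trial in range(1, max_trials + 1):
        shifts = {
            name: rng.uniform(lo, hi)
            for name, (lo, hi) in HUE_SHIFT_RANGES.items()
        }
        candidate = recolor(image, regions, shifts)
        if classify(candidate) != original_label:
            return candidate, trial
    return None, max_trials


def toy_classify(image):
    """Toy stand-in classifier: compares total red against total blue."""
    pixels = [px for row in image for px in row]
    red = sum(px[0] for px in pixels)
    blue = sum(px[2] for px in pixels)
    return "red" if red > blue else "blue"


if __name__ == "__main__":
    image = [[(1.0, 0.0, 0.0)] * 2 for _ in range(2)]  # 2x2 pure-red image
    regions = [["other"] * 2 for _ in range(2)]        # all non-sensitive
    adversarial, trials = color_attack(image, regions, toy_classify)
    print("label changed:", adversarial is not None, "after", trials, "trial(s)")
```

The sketch only queries the classifier for its predicted label, which matches the black-box setting the abstract describes. Narrow ranges for regions such as sky are one way to encode the constraint that modified colors should still look natural; a real attack would need ranges grounded in human perception and a real segmentation model.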
Videos
ColorFool: Semantic Adversarial Colorization (YouTube)
