Out-of-the-box: Black-box Causal Attacks on Object Detectors
Melane Navaratnarajah, David A. Kelly, Hana Chockler

TL;DR
BlackCAtt is a black-box attack method on object detectors that perturbs causal pixels to produce explainable, imperceptible adversarial examples. Its attacks are smaller and less perceptible than those of existing black-box techniques.
Contribution
Introduces BlackCAtt, a novel causal pixel-based black-box attack algorithm that enhances attack effectiveness and explainability on object detectors.
Findings
BlackCAtt achieves attack success rates comparable to or better than those of other black-box methods.
Targeting causal pixels results in smaller, less perceptible attacks.
BlackCAtt reduces the average $L_0$ distance from 0.987 to 0.072 while maintaining success.
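To make the $L_0$ figures above concrete, here is a minimal sketch of one common way to compute a normalized $L_0$ distance between an original image and its adversarial version. It assumes the reported values are the fraction of pixel positions that changed; the summary does not state the exact normalization, and the function name `normalized_l0` is hypothetical.

```python
import numpy as np

def normalized_l0(original: np.ndarray, adversarial: np.ndarray) -> float:
    """Fraction of pixel positions that differ between two (H, W, C) images.

    Assumption: a pixel counts as changed if any of its channels differs.
    """
    if original.shape != adversarial.shape:
        raise ValueError("images must have the same shape")
    changed = np.any(original != adversarial, axis=-1)  # (H, W) boolean map
    return float(changed.mean())

# Usage: change 5 of the 100 pixels in a 10x10 RGB image.
img = np.zeros((10, 10, 3), dtype=np.uint8)
adv = img.copy()
adv[0, :5, 0] = 255
distance = normalized_l0(img, adv)
```

Under this definition, a value near 1 means almost every pixel was modified, while a value near 0 means the perturbation touches only a small part of the image.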
Abstract
Adversarial perturbations are a useful way to expose vulnerabilities in object detectors. Existing perturbation methods are frequently white-box, architecture-specific and use a loss function. More importantly, while they are often successful, it is rarely clear why they work. Insights into the mechanism of this success would allow developers to understand and analyze these attacks, as well as fine-tune the model to prevent them. This paper presents BlackCAtt, a black-box algorithm and tool, which uses minimal, causally sufficient pixel sets to construct explainable, imperceptible, reproducible, architecture-agnostic attacks on object detectors. We evaluate BlackCAtt on standard benchmarks and compare it to other black-box adversarial attack methods. When BlackCAtt has access only to the position and label of a bounding box, it produces attacks that are comparable to or better than those…
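The general idea of restricting a black-box attack to a chosen pixel set can be illustrated with a small sketch. This is not the BlackCAtt algorithm: the abstract does not describe how the causal pixel sets are computed or how pixels are perturbed, so the mask here is simply given, the perturbation is plain uniform noise, and `toy_detector` and `masked_attack` are hypothetical stand-ins. The attacker only observes the detector's label and bounding box, never gradients or a loss.

```python
import numpy as np

def toy_detector(image: np.ndarray):
    """Hypothetical black-box detector: returns (label, box) or None.

    It reports an object when the mean brightness of a fixed region is high.
    """
    if image[2:6, 2:6].mean() > 0.5:
        return ("object", (2, 2, 6, 6))
    return None

def masked_attack(image, mask, detector, rng, eps=1.0, tries=200):
    """Perturb only pixels where mask is True until the detection changes.

    Returns the first perturbed image whose detector output differs from
    the output on the original image, or None if every try fails.
    """
    baseline = detector(image)
    for _ in range(tries):
        noise = rng.uniform(-eps, eps, size=image.shape)
        # Multiplying by the boolean mask zeroes the noise outside it.
        candidate = np.clip(image + noise * mask, 0.0, 1.0)
        if detector(candidate) != baseline:
            return candidate
    return None

# Usage: an 8x8 grayscale image with a bright square that is detected.
image = np.zeros((8, 8))
image[2:6, 2:6] = 0.6
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True  # assumed pixel set; a real attack would derive it
adv = masked_attack(image, mask, toy_detector, np.random.default_rng(0))
```

Because the noise is confined to the mask, every pixel outside it is left untouched, which is what keeps the $L_0$ distance of such an attack small.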
