Unrestricted Adversarial Examples
Tom B. Brown, Nicholas Carlini, Chiyuan Zhang, Catherine Olsson, Paul Christiano, Ian Goodfellow

TL;DR
This paper presents a new contest framework for evaluating machine learning model robustness against unconstrained adversarial attacks, emphasizing real-world worst-case scenarios beyond norm-bounded perturbations.
Contribution
It introduces a novel two-player contest with an unambiguous dataset to assess ML robustness against unconstrained adversaries, expanding evaluation beyond traditional norm-constrained methods.
Findings
Proposed a new unambiguous dataset for adversarial testing
Established a contest framework to evaluate robustness comprehensively
Encouraged development of defenses against unconstrained adversarial inputs
Abstract
We introduce a two-player contest for evaluating the safety and robustness of machine learning systems, with a large prize pool. Unlike most prior work in ML robustness, which studies norm-constrained adversaries, we shift our focus to unconstrained adversaries. Defenders submit machine learning models, and try to achieve high accuracy and coverage on non-adversarial data while making no confident mistakes on adversarial inputs. Attackers try to subvert defenses by finding arbitrary unambiguous inputs where the model assigns an incorrect label with high confidence. We propose a simple unambiguous dataset ("bird-or-bicycle") to use as part of this contest. We hope this contest will help to more comprehensively evaluate the worst-case adversarial risk of machine learning models.
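The defender's objective described in the abstract (high accuracy and coverage on clean data, no confident mistakes on adversarial inputs) can be illustrated with a small scoring sketch. This is not the contest's official scoring code. The threshold value, the function names, and the convention that a low-confidence prediction counts as an abstention are all assumptions made for illustration, as is computing accuracy only over the inputs the model chose to answer.

```python
from typing import List, Optional, Tuple

# Hypothetical value: predictions below this confidence are treated as abstentions.
CONFIDENCE_THRESHOLD = 0.8

# A prediction is a (label, confidence) pair, e.g. ("bird", 0.93).
Prediction = Tuple[str, float]


def decide(pred: Prediction, threshold: float = CONFIDENCE_THRESHOLD) -> Optional[str]:
    """Return the predicted label, or None if the model abstains."""
    label, confidence = pred
    return label if confidence >= threshold else None


def clean_metrics(preds: List[Prediction], labels: List[str],
                  threshold: float = CONFIDENCE_THRESHOLD) -> Tuple[float, float]:
    """Accuracy (over answered inputs) and coverage on non-adversarial data."""
    decisions = [decide(p, threshold) for p in preds]
    answered = [(d, y) for d, y in zip(decisions, labels) if d is not None]
    coverage = len(answered) / len(labels)
    accuracy = sum(d == y for d, y in answered) / len(answered) if answered else 0.0
    return accuracy, coverage


def confident_mistakes(preds: List[Prediction], labels: List[str],
                       threshold: float = CONFIDENCE_THRESHOLD) -> int:
    """Count adversarial inputs that are answered confidently and incorrectly."""
    return sum(1 for p, y in zip(preds, labels)
               if decide(p, threshold) not in (None, y))


# Made-up example data for the two-class bird-or-bicycle task.
clean_preds = [("bird", 0.95), ("bicycle", 0.90), ("bird", 0.60), ("bicycle", 0.85)]
clean_labels = ["bird", "bicycle", "bicycle", "bird"]
adv_preds = [("bird", 0.99), ("bicycle", 0.50)]
adv_labels = ["bicycle", "bicycle"]

accuracy, coverage = clean_metrics(clean_preds, clean_labels)
mistakes = confident_mistakes(adv_preds, adv_labels)
```

In this sketch an abstention never counts as a mistake, so a defender could trivially avoid confident errors by abstaining on everything. The coverage requirement on clean data is what rules that strategy out.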
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques
