Black-box Certification and Learning under Adversarial Perturbations
Hassan Ashtiani, Vinayak Pathak, Ruth Urner

TL;DR
This paper studies learning and certifying robust classifiers under adversarial perturbations when the classifier is accessible only as a black box, and connects query-efficient adversarial attacks to sample-efficient robust learning.
Contribution
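It analyzes a PAC-type framework of semi-supervised learning, establishing possibility and impossibility results for proper learning of VC-classes; introduces a black-box certification setting under a limited query budget; and relates the query complexity of black-box adversaries to sample-efficient robust learning.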
Findings
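Possibility and impossibility results for proper learning of VC-classes under adversarial perturbations, in a PAC-type semi-supervised framework.
A new black-box certification setting under a limited query budget, analyzed for various classes of predictors and perturbations.
The existence of an adversary with polynomial query complexity can imply the existence of a sample-efficient robust learner.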
Abstract
We formally study classification under adversarial perturbations, both from the perspective of a learner and from that of a third party who aims to certify the robustness of a given black-box classifier. We analyze a PAC-type framework of semi-supervised learning and identify possibility and impossibility results for proper learning of VC-classes in this setting. We further introduce a new setting of black-box certification under a limited query budget and analyze it for various classes of predictors and perturbations. We also consider the viewpoint of a black-box adversary that aims to find adversarial examples, showing that the existence of an adversary with polynomial query complexity can imply the existence of a sample-efficient robust learner.
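To make the black-box, query-limited setting concrete, here is a minimal sketch of an adversary that may only call a classifier's prediction function a fixed number of times. It is an illustration, not the paper's algorithm: the random-search strategy, the L-infinity perturbation ball, and the names `find_adversarial_example`, `predict`, `radius`, and `query_budget` are all assumptions introduced for this example.

```python
import random
from typing import Callable, List, Optional, Sequence

Point = Sequence[float]


def find_adversarial_example(
    predict: Callable[[Point], int],
    x: Point,
    radius: float,
    query_budget: int,
    rng: random.Random,
) -> Optional[List[float]]:
    """Random search for a label-flipping point in the L-infinity ball around x.

    Uses at most `query_budget` calls to `predict` (hypothetical black-box
    access). Returns a perturbed point whose predicted label differs from
    that of x, or None if none was found within the budget.
    """
    if query_budget < 1:
        return None
    original_label = predict(x)  # the first query labels the clean point
    for _ in range(query_budget - 1):
        # Sample each coordinate uniformly within +/- radius of x.
        candidate = [xi + rng.uniform(-radius, radius) for xi in x]
        if predict(candidate) != original_label:
            return candidate
    return None


def threshold(point: Point) -> int:
    """Toy black-box classifier (assumed): thresholds the first coordinate at 0."""
    return int(point[0] > 0.0)


if __name__ == "__main__":
    rng = random.Random(0)
    # A point near the decision boundary versus one far from it.
    print(find_adversarial_example(threshold, [0.05], 0.1, 200, rng))
    print(find_adversarial_example(threshold, [5.0], 0.1, 200, rng))
```

A return value of None only means that no adversarial example was found within the budget. It is not by itself a robustness certificate, which is one reason the query budget is a central quantity in this setting.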
Taxonomy
Topics: Machine Learning and Algorithms · Adversarial Robustness in Machine Learning
