Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation

Matthias Hein; Maksym Andriushchenko

arXiv:1705.08475·cs.LG·November 7, 2017·146 cites

TL;DR

This paper provides formal, instance-specific robustness guarantees for classifiers against adversarial attacks and introduces a new regularization method that enhances robustness without sacrificing accuracy.

Contribution

It introduces the first formal robustness guarantees for classifiers and proposes the Cross-Lipschitz regularization to improve robustness in kernel methods and neural networks.

Findings

01 Instance-specific lower bounds on the norm of the input manipulation required to change the classifier decision

02 Cross-Lipschitz regularization improves the robustness of kernel methods and neural networks

03 The robustness gains come without any loss in prediction performance

Abstract

Recent work has shown that state-of-the-art classifiers are quite brittle: a small adversarial change to an input that was originally classified correctly with high confidence leads to a wrong classification, again with high confidence. This raises concerns that such classifiers are vulnerable to attacks and calls into question their use in safety-critical systems. In this paper we give, for the first time, formal guarantees on the robustness of a classifier in the form of instance-specific lower bounds on the norm of the input manipulation required to change the classifier decision. Based on this analysis we propose the Cross-Lipschitz regularization functional. We show that using this form of regularization in kernel methods and in neural networks improves the robustness of the classifier without any loss in prediction performance.
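To make the idea of an instance-specific lower bound concrete, here is a minimal sketch for the simplest possible case, a linear multi-class classifier f_j(x) = w_j · x + b_j. This page does not reproduce the paper's general theorem or the exact definition of the Cross-Lipschitz functional, so everything below is an illustration under stated assumptions rather than the authors' implementation. For a linear model, the smallest l2 perturbation that can flip the prediction from class c to class j is the margin f_c(x) - f_j(x) divided by ||w_c - w_j||_2, which is the distance from x to the decision hyperplane between the two classes. The function names, the toy weights, and the pairwise gradient-difference penalty are hypothetical and chosen only for illustration.

```python
import numpy as np

def linear_robustness_bound(W, b, x, q=2):
    """Per-input robustness radius of a linear classifier f(x) = W x + b.

    Returns the predicted class c and the minimum over j != c of
    (f_c(x) - f_j(x)) / ||w_c - w_j||_q. No perturbation with a smaller
    norm (measured in the norm dual to q) can change the prediction.
    Assumes the rows of W are pairwise distinct.
    """
    scores = W @ x + b
    c = int(np.argmax(scores))
    bounds = []
    for j in range(W.shape[0]):
        if j == c:
            continue
        margin = scores[c] - scores[j]                  # how far class c leads class j
        grad_gap = np.linalg.norm(W[c] - W[j], ord=q)   # norm of the gradient difference
        bounds.append(margin / grad_gap)
    return c, min(bounds)

def pairwise_gradient_penalty(W):
    """Illustrative penalty (an assumption, not the paper's exact functional):
    mean squared l2 distance between the gradients of all pairs of class
    scores. For a linear model those gradients are simply the rows of W."""
    K = W.shape[0]
    diffs = W[:, None, :] - W[None, :, :]   # shape (K, K, d)
    return float(np.sum(diffs ** 2) / K**2)

# Toy example with three classes in two dimensions (made-up numbers).
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.zeros(3)
x = np.array([2.0, 1.0])

pred, radius = linear_robustness_bound(W, b, x)
penalty = pairwise_gradient_penalty(W)
print(pred, radius, penalty)
```

The sketch shows why penalizing differences between class-score gradients is a plausible route to robustness: for a fixed margin, a smaller gradient difference in the denominator gives a larger guaranteed radius. For nonlinear classifiers the gradient difference varies with the input, so a bound of this kind has to control it over a neighborhood of x rather than at a single point.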

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Physical Unclonable Functions (PUFs) and Hardware Security