Gradient Methods Provably Converge to Non-Robust Networks

Gal Vardi; Gilad Yehudai; Ohad Shamir

arXiv:2202.04347 · cs.LG · October 5, 2022 · 6 citations

TL;DR

This paper proves that, in certain natural settings, depth-2 ReLU networks trained with gradient flow are non-robust: they are susceptible to small adversarial $\ell_2$-perturbations because of the implicit bias towards margin maximization.

Contribution

The paper shows that the implicit bias of gradient flow towards margin maximization favors non-robust networks, even when robust networks that classify the training dataset correctly exist, which points to a fundamental source of adversarial vulnerability.

Findings

01. Gradient flow leads to non-robust networks in certain settings.

02. Networks satisfying max-margin KKT conditions are non-robust.

03. Implicit bias towards margin maximization causes vulnerability.
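For readers unfamiliar with the max-margin problem referred to in the findings, the block below writes it out in the form commonly used for homogeneous networks. It is a sketch under stated assumptions, not a quotation from the paper: the symbols $N_\theta$ (the network with parameters $\theta$), $(x_i, y_i)$ with $y_i \in \{-1, 1\}$ (the training set), and $\lambda_i$ (the multipliers) are notation introduced here, and the gradient is to be read as an element of the Clarke subdifferential wherever the ReLU is not differentiable.

```latex
% Max-margin problem in parameter space (assumed notation, see lead-in)
\min_{\theta} \; \tfrac{1}{2}\,\|\theta\|^{2}
\quad \text{s.t.} \quad y_i \, N_\theta(x_i) \ge 1 \quad \text{for all } i \in \{1,\dots,n\}.

% A point \theta satisfies the KKT conditions if there exist
% \lambda_1,\dots,\lambda_n such that:
\theta = \sum_{i=1}^{n} \lambda_i \, y_i \, \nabla_\theta N_\theta(x_i)
  \qquad \text{(stationarity)}
y_i \, N_\theta(x_i) \ge 1 \quad \text{for all } i
  \qquad \text{(primal feasibility)}
\lambda_i \ge 0 \quad \text{for all } i
  \qquad \text{(dual feasibility)}
\lambda_i \,\bigl( y_i \, N_\theta(x_i) - 1 \bigr) = 0 \quad \text{for all } i
  \qquad \text{(complementary slackness)}
```

Complementary slackness means that only training points lying exactly on the margin ($y_i N_\theta(x_i) = 1$) can have $\lambda_i > 0$, so the parameters are a combination of gradients at those points alone.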

Abstract

Despite a great deal of research, it is still unclear why neural networks are so susceptible to adversarial examples. In this work, we identify natural settings where depth-$2$ ReLU networks trained with gradient flow are provably non-robust (susceptible to small adversarial $\ell_2$-perturbations), even when robust networks that classify the training dataset correctly exist. Perhaps surprisingly, we show that the well-known implicit bias towards margin maximization induces bias towards non-robust networks, by proving that every network which satisfies the KKT conditions of the max-margin problem is non-robust.
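To make "susceptible to small adversarial $\ell_2$-perturbations" concrete, here is a minimal pure-Python sketch of a depth-2 ReLU network and a simple search for an $\ell_2$ step that flips its prediction. It is an illustration only and not the construction used in the paper: the function names, the toy weights, and the choice to search along the input-gradient direction are all assumptions made for this example.

```python
import math

def relu_net(x, W, b, v):
    """Depth-2 ReLU network: N(x) = sum_j v[j] * relu(<W[j], x> + b[j])."""
    return sum(
        vj * max(0.0, sum(wi * xi for wi, xi in zip(wj, x)) + bj)
        for wj, bj, vj in zip(W, b, v)
    )

def input_gradient(x, W, b, v):
    """Gradient of N with respect to x (inactive neurons contribute 0)."""
    g = [0.0] * len(x)
    for wj, bj, vj in zip(W, b, v):
        if sum(wi * xi for wi, xi in zip(wj, x)) + bj > 0:
            for i, wi in enumerate(wj):
                g[i] += vj * wi
    return g

def flip_radius_along_gradient(x, y, W, b, v, max_norm, steps=1000):
    """Scan l2 step sizes along -y * grad N(x) and return the first step
    size on the grid at which the sign of y * N flips, or None if no flip
    is found within max_norm. A heuristic search, not a certified bound."""
    g = input_gradient(x, W, b, v)
    norm = math.sqrt(sum(gi * gi for gi in g))
    if norm == 0.0:
        return None
    direction = [-y * gi / norm for gi in g]
    for k in range(1, steps + 1):
        r = max_norm * k / steps
        x_adv = [xi + r * di for xi, di in zip(x, direction)]
        if y * relu_net(x_adv, W, b, v) < 0:
            return r
    return None

# Hypothetical toy network with N(x) = relu(x_1) - relu(-x_1) = x_1.
W = [[1.0, 0.0], [-1.0, 0.0]]
b = [0.0, 0.0]
v = [1.0, -1.0]
x, y = [0.5, 0.0], 1

radius = flip_radius_along_gradient(x, y, W, b, v, max_norm=1.0)
```

In this toy case the point sits at distance 0.5 from the decision boundary, so the search reports the first grid step beyond that distance. A network is called non-robust when such a flipping perturbation is small relative to the norm of the inputs.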

Videos

Gradient Methods Provably Converge to Non-Robust Networks · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Machine Learning and Algorithms · COVID-19 diagnosis using AI