Efficiently Training Low-Curvature Neural Networks
Suraj Srinivas, Kyle Matoba, Himabindu Lakkaraju, Francois Fleuret

TL;DR
This paper introduces low-curvature neural networks (LCNNs), which are trained by minimizing a data-independent upper bound on curvature with the help of new architectural components. The result is improved robustness and more stable gradients, with predictive accuracy similar to that of standard models.
Contribution
The paper proposes a method for training low-curvature neural networks by minimizing a curvature bound, and introduces new layers such as centered-softplus and Lipschitz batch normalization.
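As a rough illustration of why a softplus-style activation suits curvature control, the sketch below assumes that centered-softplus is the standard softplus with parameter beta, shifted so that it passes through the origin; the summary above does not give the formula, so this definition and all function names are assumptions. Under that assumption the slope is a sigmoid (at most 1) and the second derivative is at most beta / 4, so beta directly caps the curvature of the layer.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def softplus(x, beta=1.0):
    """Stable softplus: (1 / beta) * log(1 + exp(beta * x))."""
    z = beta * x
    # log(1 + e^z) = max(z, 0) + log1p(e^{-|z|}) avoids overflow for large z.
    return (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / beta


def centered_softplus(x, beta=1.0):
    """Assumed form: softplus shifted so that the output at x = 0 is 0."""
    return softplus(x, beta) - math.log(2.0) / beta


def softplus_slope(x, beta=1.0):
    """First derivative of (centered) softplus: sigmoid(beta * x)."""
    return sigmoid(beta * x)


def softplus_curvature(x, beta=1.0):
    """Second derivative: beta * s * (1 - s), with s = sigmoid(beta * x)."""
    s = sigmoid(beta * x)
    return beta * s * (1.0 - s)
```

The shift by log(2) / beta changes neither derivative, so the slope and curvature bounds are those of the ordinary softplus. Smaller beta gives a flatter, lower-curvature activation.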
Findings
LCNNs exhibit significantly lower curvature than standard models.
LCNNs demonstrate increased adversarial robustness.
LCNNs maintain comparable predictive performance to traditional networks.
Abstract
The highly non-linear nature of deep neural networks makes them susceptible to adversarial examples and gives them unstable gradients, which hinders interpretability. However, existing methods to solve these issues, such as adversarial training, are expensive and often sacrifice predictive accuracy. In this work, we consider curvature, a mathematical quantity that encodes the degree of non-linearity. Using this, we demonstrate low-curvature neural networks (LCNNs) that obtain drastically lower curvature than standard models while exhibiting similar predictive performance, which leads to improved robustness and stable gradients, with only a marginally increased training time. To achieve this, we minimize a data-independent upper bound on the curvature of a neural network, which decomposes overall curvature in terms of curvatures and slopes of its constituent layers. To…
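The idea that overall curvature decomposes into per-layer curvatures and slopes can be seen in a one-dimensional toy case. The derivation below is only an illustration under the assumption of scalar, twice-differentiable layers g and h; it is not the paper's exact bound, which is not reproduced in this summary.

```latex
\begin{align}
f(x) &= g(h(x)), \\
f''(x) &= g''(h(x))\, h'(x)^2 + g'(h(x))\, h''(x), \\
|f''(x)| &\le \sup|g''| \cdot \bigl(\sup|h'|\bigr)^2 + \sup|g'| \cdot \sup|h''|.
\end{align}
```

The right-hand side involves only the maximum curvature and maximum slope of each layer, so it does not depend on any particular input x. This is the sense in which such a bound is data-independent.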
Taxonomy
Topics: Medical Imaging and Analysis · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning
Methods: Batch Normalization
