Efficiently Training Low-Curvature Neural Networks

Suraj Srinivas; Kyle Matoba; Himabindu Lakkaraju; Francois Fleuret

arXiv:2206.07144·cs.LG·January 11, 2023

TL;DR

This paper introduces low-curvature neural networks (LCNNs) that achieve enhanced robustness and gradient stability by minimizing a curvature bound, using novel architectural components, without sacrificing predictive accuracy.

Contribution

The paper proposes a method for training low-curvature neural networks by minimizing an upper bound on curvature, and introduces new layers such as centered-softplus and Lipschitz batch normalization.
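As a rough illustration of the centered-softplus idea, the sketch below assumes it means a softplus with sharpness parameter `beta`, shifted by `log(2)/beta` so that it passes through the origin. That definition, and the helper names `softplus`, `centered_softplus`, and `second_derivative`, are assumptions made for this example rather than code from the paper's repositories. It relies on one standard fact: the second derivative of a softplus with parameter `beta` is at most `beta/4`, reached at zero, so a smaller `beta` gives a flatter activation.

```python
import math

def softplus(x, beta=1.0):
    # Numerically stable (1/beta) * log(1 + exp(beta * x)).
    z = beta * x
    return (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / beta

def centered_softplus(x, beta=1.0):
    # Assumed definition: softplus shifted down by log(2)/beta,
    # so that the output at x = 0 is zero.
    return softplus(x, beta) - math.log(2.0) / beta

def second_derivative(fn, x, step=1e-3):
    # Central finite-difference estimate of fn''(x).
    return (fn(x + step) - 2.0 * fn(x) + fn(x - step)) / (step * step)

beta = 2.0
value_at_zero = centered_softplus(0.0, beta)
curvature_at_zero = second_derivative(lambda x: centered_softplus(x, beta), 0.0)
print(value_at_zero, curvature_at_zero, beta / 4.0)
```

Under these assumptions, the estimated curvature at zero should sit close to `beta / 4`, which is why shrinking `beta` is one plausible handle on the curvature of an activation layer.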

Findings

01 LCNNs exhibit significantly lower curvature than standard models.

02 LCNNs demonstrate increased adversarial robustness.

03 LCNNs maintain comparable predictive performance to traditional networks.

Abstract

The highly non-linear nature of deep neural networks makes them susceptible to adversarial examples and gives them unstable gradients, which hinders interpretability. However, existing methods for addressing these issues, such as adversarial training, are expensive and often sacrifice predictive accuracy. In this work, we consider curvature, a mathematical quantity that encodes the degree of non-linearity. Using this, we demonstrate low-curvature neural networks (LCNNs) that obtain drastically lower curvature than standard models while exhibiting similar predictive performance, which leads to improved robustness and stable gradients, with only a marginal increase in training time. To achieve this, we minimize a data-independent upper bound on the curvature of a neural network, which decomposes overall curvature in terms of the curvatures and slopes of its constituent layers. To…
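The decomposition into layer curvatures and slopes has a simple one-dimensional analogue, sketched below. It is not the paper's bound, only the scalar chain rule: for f = g∘h, f'' = g''(h)·(h')² + g'(h)·h'', so |f''| ≤ max|g''|·(max|h'|)² + max|g'|·max|h''|. The layers `h` (a sine) and `g` (a softplus with an arbitrary `BETA`) are hypothetical choices made only so that every slope and curvature constant is known in closed form.

```python
import math

BETA = 3.0  # arbitrary sharpness chosen for this illustration

def h(x):
    # Inner "layer": slope |h'| <= 1 and curvature |h''| <= 1.
    return math.sin(x)

def g(u):
    # Outer "layer": softplus, slope |g'| <= 1 and curvature |g''| <= BETA / 4.
    return math.log1p(math.exp(BETA * u)) / BETA

def f(x):
    return g(h(x))

def second_derivative(fn, x, step=1e-3):
    # Central finite-difference estimate of fn''(x).
    return (fn(x + step) - 2.0 * fn(x) + fn(x - step)) / (step * step)

# Bound assembled only from per-layer constants; no data is involved.
bound = (BETA / 4.0) * 1.0 ** 2 + 1.0 * 1.0

# Largest estimated curvature of the composition on a grid over [-5, 5].
worst = max(abs(second_derivative(f, -5.0 + 0.01 * i)) for i in range(1001))
print(worst, bound)
```

The bound depends only on constants of the individual layers, which is the sense in which such a bound can be data-independent; the grid search is there only to check that the observed curvature stays below it.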

Taxonomy

Topics: Medical Imaging and Analysis · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning

Methods: Batch Normalization