Slope and generalization properties of neural networks

Anton Johansson; Niklas Engsner; Claes Strannegård; Petter Mostad

arXiv:2107.01473·stat.ML·July 6, 2021

TL;DR

This paper proposes controlling the slope of a neural network to improve generalization. It offers theoretical insights and empirical evidence that, in trained networks, the slope distribution is independent of layer width, that its mean depends only weakly on architecture, and that the slope varies smoothly.

Contribution

The paper proposes the slope as a measure for controlling neural network complexity, discusses its theoretical properties, and validates it empirically across different architectures.

Findings

01. Slope distribution is independent of layer width in trained networks.
02. Mean slope has weak dependence on architecture.
03. Slope varies smoothly and aligns with theoretical predictions.

Abstract

Neural networks are very successful tools in, for example, advanced classification. From a statistical point of view, fitting a neural network may be seen as a kind of regression, where we seek a function from the input space to a space of classification probabilities that follows the "general" shape of the data but avoids overfitting by not memorizing individual data points. In statistics, this can be done by controlling the geometric complexity of the regression function. We propose to do something similar when fitting neural networks by controlling the slope of the network. After defining the slope and discussing some of its theoretical properties, we go on to show empirically in examples, using ReLU networks, that the distribution of the slope of a well-trained neural network classifier is generally independent of the width of the layers in a fully connected network, and…
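The abstract as reproduced here does not include the paper's formal definition of the slope. As a rough illustration only, the sketch below assumes the slope at an input is the spectral norm (largest singular value) of the network's input-output Jacobian at that point. That definition, and the helper names `relu_jacobian`, `spectral_norm`, and `slope_at`, are assumptions made for this example and are not taken from the paper or its repository.

```python
import math
import random


def relu_jacobian(weights, biases, x):
    """Jacobian of a fully connected ReLU network at input x.

    weights[k] is a list of rows for layer k; biases[k] is a list.
    ReLU follows every layer except the last, which is linear.
    """
    n_in = len(x)
    # Start from the identity, i.e. d(x)/d(x).
    jac = [[1.0 if i == j else 0.0 for j in range(n_in)] for i in range(n_in)]
    h = list(x)
    last = len(weights) - 1
    for k, (W, b) in enumerate(zip(weights, biases)):
        pre = [sum(w * v for w, v in zip(row, h)) + bi for row, bi in zip(W, b)]
        # Chain rule for the affine map: new_jac = W @ jac.
        jac = [[sum(row[m] * jac[m][j] for m in range(len(h)))
                for j in range(n_in)] for row in W]
        if k < last:
            # ReLU derivative is 1 where the pre-activation is positive, else 0.
            jac = [r if p > 0 else [0.0] * n_in for r, p in zip(jac, pre)]
            h = [max(p, 0.0) for p in pre]
        else:
            h = pre
    return jac


def spectral_norm(mat, iters=200, seed=0):
    """Largest singular value, by power iteration on mat^T mat."""
    rng = random.Random(seed)
    n = len(mat[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        u = [sum(a * c for a, c in zip(row, v)) for row in mat]  # mat @ v
        w = [sum(mat[i][j] * u[i] for i in range(len(mat)))      # mat^T @ u
             for j in range(n)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            return 0.0  # zero matrix, or start vector in the null space
        v = [c / norm for c in w]
    u = [sum(a * c for a, c in zip(row, v)) for row in mat]
    return math.sqrt(sum(c * c for c in u))


def slope_at(weights, biases, x):
    """Assumed slope proxy: spectral norm of the Jacobian at x."""
    return spectral_norm(relu_jacobian(weights, biases, x))
```

Evaluating `slope_at` over a sample of inputs would give an empirical distribution of this proxy, which could then be compared across layer widths. Because a ReLU network is piecewise linear, the Jacobian is constant within each activation region, so the proxy changes only where an activation pattern changes.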

Code & Models

Repositories

antonFJohansson/slope_and_generalization (PyTorch, official)

Taxonomy

Topics: Neural Networks and Applications