MaxGain: Regularisation of Neural Networks by Constraining Activation Magnitudes

Henry Gouk; Bernhard Pfahringer; Eibe Frank; Michael Cree

arXiv:1804.05965 · stat.ML · July 3, 2018

TL;DR

MaxGain regularises a feed-forward neural network by constraining its maximum gain, an empirical analogue to the Lipschitz constant, through rescaling each layer's weight matrix after every parameter update. In experiments on benchmark datasets, it compares favourably with other common regularisation techniques.

Contribution

The paper proposes MaxGain, a regularisation technique that constrains the maximum gain of a feed-forward neural network. The maximum gain serves as an empirical analogue to the network's Lipschitz constant.
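To make the analogy concrete, the block below sets the standard definition of a Lipschitz constant next to one plausible form of the maximum gain. The Lipschitz inequality is the textbook definition. The gain expression, the symbol \hat{k}, the dataset X, and the choice of p-norm are assumptions made here for illustration, since this page does not state the paper's exact formula.

```latex
% Lipschitz constant k: a bound that must hold for every pair of inputs
\| f(x_1) - f(x_2) \|_p \le k \, \| x_1 - x_2 \|_p \quad \forall\, x_1, x_2

% Assumed form of the maximum gain: the largest output-to-input norm
% ratio actually observed on a finite set of inputs X
\hat{k} = \max_{x_i \in X,\; x_i \neq 0} \frac{\| f(x_i) \|_p}{\| x_i \|_p}
```

Under this reading, the first quantity is a worst case over the whole input space, while the second is measured only on data that has been seen, which is what makes it empirical.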

Findings

1. MaxGain improves generalisation on benchmark datasets.
2. Constraining maximum gain is effective as a regulariser.
3. Performance compares favourably with existing regularisation methods.

Abstract

Effective regularisation of neural networks is essential to combat overfitting due to the large number of parameters involved. We present an empirical analogue to the Lipschitz constant of a feed-forward neural network, which we refer to as the maximum gain. We hypothesise that constraining the gain of a network will have a regularising effect, similar to how constraining the Lipschitz constant of a network has been shown to improve generalisation. A simple algorithm is provided that involves rescaling the weight matrix of each layer after each parameter update. We conduct a series of studies on common benchmark datasets, and also a novel dataset that we introduce to enable easier significance testing for experiments using convolutional networks. Performance on these datasets compares favourably with other common regularisation techniques.
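The rescaling step mentioned in the abstract might look roughly like the following for a single linear layer. This is a minimal sketch under stated assumptions, not the authors' implementation: the gain is taken to be the ratio of output norm to input norm under the Euclidean norm, measured on one mini-batch, and the names `max_gain`, `rescale_to_max_gain`, and the limit `gamma` are hypothetical. Plain Python lists are used to keep the example dependency-free; a real training loop would use an array library.

```python
import math


def l2_norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(c * c for c in v))


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * c for w, c in zip(row, x)) for row in W]


def max_gain(W, batch):
    """Largest ratio ||W x|| / ||x|| over the non-zero inputs in the batch.

    Assumed definition of the gain; returns 0.0 if no input is non-zero.
    """
    gains = []
    for x in batch:
        norm_x = l2_norm(x)
        if norm_x > 0:
            gains.append(l2_norm(matvec(W, x)) / norm_x)
    return max(gains, default=0.0)


def rescale_to_max_gain(W, batch, gamma):
    """Shrink W so its measured gain on the batch does not exceed gamma.

    Weights are left unchanged when the gain is already within the limit.
    Intended to be called after each parameter update.
    """
    scale = max(1.0, max_gain(W, batch) / gamma)
    return [[w / scale for w in row] for row in W]


# Usage: a layer whose gain on this batch exceeds the hypothetical limit.
W = [[3.0, 0.0], [0.0, 1.0]]
batch = [[1.0, 0.0], [0.0, 1.0]]
W = rescale_to_max_gain(W, batch, gamma=1.5)
print(max_gain(W, batch))
```

Dividing the whole matrix by a single scalar preserves the direction of the weights and only reduces their magnitude, which mirrors how projection-style constraints on the Lipschitz constant are commonly applied.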
