Stochastic Function Norm Regularization of Deep Networks
Amal Rannen Triki, Matthew B. Blaschko

TL;DR
This paper introduces a novel regularization method for deep neural networks using the $L_2$ function norm, which improves performance in small data regimes by directly controlling the network function complexity.
Contribution
It proposes two methods for incorporating $L_2$ function norm regularization into stochastic backpropagation, analyzes their convergence, and reports better results than state-of-the-art methods in the low sample regime on MNIST and CIFAR10.
Findings
Outperforms state-of-the-art methods in low sample regimes
Shows significant improvement on MNIST and CIFAR10 datasets
Effective in low-dimensional manifold data scenarios
Abstract
Deep neural networks have had an enormous impact on image analysis. State-of-the-art training methods, based on weight decay and DropOut, result in impressive performance when a very large training set is available. However, they tend to have large problems overfitting to small data sets. Indeed, the available regularization methods deal with the complexity of the network function only indirectly. In this paper, we study the feasibility of directly using the function norm for regularization. Two methods to integrate this new regularization in the stochastic backpropagation are proposed. Moreover, the convergence of these new algorithms is studied. We finally show that they outperform the state-of-the-art methods in the low sample regime on benchmark datasets (MNIST and CIFAR10). The obtained results demonstrate very clear improvement, especially in the context of small sample…
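To make the idea of penalizing the function norm concrete, here is a minimal sketch in plain Python. It is not the authors' algorithm: the abstract does not describe the two proposed methods, so this only illustrates the generic idea of estimating a squared $L_2$ function norm, taken here to be $E_x[\|f(x)\|^2]$, by averaging over a batch of samples and adding it to a data-fit loss. The names `function_norm_sq_estimate`, `regularized_loss`, `toy_net`, and the weight `lam` are hypothetical, and the choice of sampling distribution is an assumption.

```python
def function_norm_sq_estimate(f, samples):
    """Monte Carlo estimate of E_x[ ||f(x)||^2 ] over the given samples."""
    total = 0.0
    for x in samples:
        y = f(x)  # output of the function for one input, as a list of floats
        total += sum(v * v for v in y)
    return total / len(samples)


def regularized_loss(data_loss, f, reg_samples, lam):
    """Data-fit loss plus lam times the estimated squared function norm."""
    return data_loss + lam * function_norm_sq_estimate(f, reg_samples)


# Toy stand-in for a network: a fixed linear map from R^2 to R^1
# (hypothetical, for illustration only).
def toy_net(x):
    return [2.0 * x[0] - 1.0 * x[1]]


# A small batch of inputs used only to estimate the penalty (assumed sampling scheme).
batch = [[1.0, 0.0], [0.0, 2.0]]
penalty = function_norm_sq_estimate(toy_net, batch)
total = regularized_loss(0.5, toy_net, batch, lam=0.1)
```

In a stochastic training loop, one would presumably redraw `batch` at every step, so that the penalty and its gradient are noisy estimates in the same way the data-fit term is. How the paper actually draws these samples and combines the gradients is not stated in the abstract.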
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications · Sparse and Compressive Sensing Techniques
Methods: Weight Decay
