Preprint: Norm Loss: An efficient yet effective regularization method for deep neural networks
Theodoros Georgiou, Sebastian Schmitt, Thomas Bäck, Wei Chen, Michael Lew

TL;DR
This paper introduces Norm Loss, a regularization technique for deep neural networks that encourages weights to have unit norm, improving training stability and performance with minimal computational cost.
Contribution
It proposes a novel weight regularization method based on the Oblique manifold, demonstrating its effectiveness and robustness across popular datasets and architectures.
Findings
Competitive performance on CIFAR-10, CIFAR-100, and ImageNet.
Less sensitivity to hyperparameter variations.
Minimal additional computational overhead.
Abstract
Convolutional neural network training can suffer from diverse issues like exploding or vanishing gradients, scaling-based weight space symmetry and covariate shift. In order to address these issues, researchers develop weight regularization methods and activation normalization methods. In this work we propose a weight soft-regularization method based on the Oblique manifold. The proposed method uses a loss function which pushes each weight vector to have a norm close to one, i.e. the weight matrix is smoothly steered toward the so-called Oblique manifold. We evaluate our method on the very popular CIFAR-10, CIFAR-100 and ImageNet 2012 datasets using two state-of-the-art architectures, namely the ResNet and wide-ResNet. Our method introduces negligible computational overhead and the results show that it is competitive with the state-of-the-art and in some cases superior to it.…
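To illustrate the idea of a loss that pushes each weight vector toward unit norm, here is a minimal sketch in plain Python. The abstract does not give the exact formula, so the penalty used here, the sum over weight vectors of (1 - ||w||_2)^2, is an assumption, as are the function names norm_loss, norm_loss_grad and regularization_step and the step size lr. Convolution filters are assumed to be flattened into one vector per output channel.

```python
import math


def norm_loss(weight_rows):
    """Assumed penalty: sum over weight vectors of (1 - ||w||_2)^2.

    weight_rows is a list of weight vectors, one per output unit
    (a convolution filter is assumed to be flattened to a 1-D list).
    """
    total = 0.0
    for w in weight_rows:
        norm = math.sqrt(sum(x * x for x in w))
        total += (1.0 - norm) ** 2
    return total


def norm_loss_grad(weight_rows, eps=1e-12):
    """Gradient of the assumed penalty: -2 * (1 - ||w||) * w / ||w||."""
    grads = []
    for w in weight_rows:
        norm = math.sqrt(sum(x * x for x in w))
        # eps guards against division by zero for an all-zero vector.
        scale = -2.0 * (1.0 - norm) / max(norm, eps)
        grads.append([scale * x for x in w])
    return grads


def regularization_step(weight_rows, lr=0.1):
    """One plain gradient-descent step on the penalty alone (hypothetical lr)."""
    grads = norm_loss_grad(weight_rows)
    return [
        [x - lr * g for x, g in zip(w, gw)]
        for w, gw in zip(weight_rows, grads)
    ]


if __name__ == "__main__":
    weights = [[3.0, 4.0], [0.6, 0.8]]  # row norms are 5 and 1
    print(norm_loss(weights))
    for _ in range(100):
        weights = regularization_step(weights)
    print([math.sqrt(sum(x * x for x in w)) for w in weights])
```

In training, a term of this kind would be added to the task loss with a weighting coefficient, so the weights are steered toward unit norm softly rather than projected onto it. The sketch applies the penalty on its own only to show that each row norm moves toward one while the direction of the vector is unchanged.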
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Medical Imaging and Analysis
Methods: Average Pooling · Residual Connection · Global Average Pooling · Batch Normalization · 1x1 Convolution · Activation Normalization · Bottleneck Residual Block · Convolution · Max Pooling
