# Mean Spectral Normalization of Deep Neural Networks for Embedded Automation

**Authors:** Anand Krishnamoorthy Subramanian, Nak Young Chong

arXiv: 1907.04003 · 2019-07-10

## TL;DR

This paper introduces Mean Spectral Normalization (MSN), a weight reparameterization for Deep Neural Networks that corrects the mean-drift effect observed in Spectral Normalization (SN). The authors report that it improves network performance, runs ~16% faster than Batch Normalization (BN), and has fewer trainable parameters, which makes it a candidate for embedded automation applications.

## Contribution

We propose MSN, a weight reparameterization that mitigates mean drift in spectral normalization, enhancing DNN performance and efficiency across various architectures and tasks.
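
For orientation, the two ingredients can be written down as follows. The first line is spectral normalization as it is commonly defined: the weight matrix $W$ is divided by its largest singular value $\sigma(W)$. The second line is only an assumption about how a mean-drift correction might look (centering the layer output over a mini-batch and adding a learned bias $\beta$); the symbols $x$, $y$, and $\beta$ are introduced here for illustration, and the exact MSN formulation is given in the full paper.

$$
\bar{W}_{\mathrm{SN}} = \frac{W}{\sigma(W)}, \qquad \sigma(W) = \max_{h \neq 0} \frac{\lVert W h \rVert_2}{\lVert h \rVert_2}
$$

$$
y = \bar{W}_{\mathrm{SN}}\, x \;-\; \mathbb{E}_{\text{batch}}\!\left[\bar{W}_{\mathrm{SN}}\, x\right] \;+\; \beta \qquad \text{(assumed form)}
$$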

## Key findings

- MSN runs ~16% faster than Batch Normalization (BN) in practice.
- MSN has fewer trainable parameters than BN.
- MSN is evaluated on small, medium, and large CNNs (a 3-layer CNN, VGG7, and DenseNet-BC) and on unsupervised image generation with GANs.

## Abstract

Deep Neural Networks (DNNs) have begun to thrive in the field of automation systems, owing to the recent advancements in standardizing various aspects such as architecture, optimization techniques, and regularization. In this paper, we take a step towards a better understanding of Spectral Normalization (SN) and its potential for standardizing regularization of a wider range of Deep Learning models, following an empirical approach. We conduct several experiments to study their training dynamics in comparison with the ubiquitous Batch Normalization (BN), and show that SN increases the gradient sparsity and controls the gradient variance. Furthermore, we show that SN suffers from a phenomenon we call the mean-drift effect, which degrades its performance. We then propose a weight reparameterization called Mean Spectral Normalization (MSN) to resolve the mean drift, thereby significantly improving the network's performance. Our model runs ~16% faster than BN in practice and has fewer trainable parameters. We also show the performance of MSN for small, medium, and large CNNs (a 3-layer CNN, VGG7, and DenseNet-BC, respectively) and for unsupervised image generation tasks using Generative Adversarial Networks (GANs), to evaluate its applicability to a broad range of embedded automation tasks.
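
To make the abstract's terms concrete, here is a minimal pure-Python sketch, not the authors' implementation. `spectral_norm` estimates the largest singular value of a weight matrix by power iteration, which is the usual way SN is computed. `mean_centered_forward` is a hypothetical illustration of removing mean drift by centering each output unit over a mini-batch and adding a bias `beta`. All function names, the centering step, and the toy values are assumptions made for this example; the paper's actual MSN definition is in the full text.

```python
import math
import random


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def l2_normalize(v, eps=1e-12):
    """Scale v to (approximately) unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / (norm + eps) for x in v]


def spectral_norm(W, n_iters=50, seed=0):
    """Estimate the largest singular value of W by power iteration."""
    rng = random.Random(seed)
    Wt = [list(col) for col in zip(*W)]  # transpose of W
    u = l2_normalize([rng.gauss(0.0, 1.0) for _ in W])
    for _ in range(n_iters):
        v = l2_normalize(matvec(Wt, u))  # right singular vector estimate
        u = l2_normalize(matvec(W, v))   # left singular vector estimate
    # sigma is approximated by u^T W v
    return sum(a * b for a, b in zip(u, matvec(W, v)))


def spectrally_normalize(W, **kwargs):
    """Return W divided by its estimated largest singular value."""
    sigma = spectral_norm(W, **kwargs)
    return [[w / sigma for w in row] for row in W]


def mean_centered_forward(W_sn, batch, beta):
    """Hypothetical MSN-style layer (assumption, not the paper's code):
    subtract each output unit's batch mean, then add a learned bias."""
    outputs = [matvec(W_sn, x) for x in batch]
    n = len(outputs)
    means = [sum(col) / n for col in zip(*outputs)]
    return [[y - m + b for y, m, b in zip(out, means, beta)]
            for out in outputs]


if __name__ == "__main__":
    W = [[3.0, 0.0], [0.0, 1.0]]          # toy weights
    W_sn = spectrally_normalize(W)
    batch = [[1.0, 2.0], [3.0, 4.0]]      # toy mini-batch
    print(mean_centered_forward(W_sn, batch, beta=[0.5, -0.5]))
```

In this sketch the only per-unit parameter added on top of the weights is `beta`, whereas BN learns both a scale and a shift per unit; whether that is the source of the parameter saving reported in the abstract is an assumption here. Library implementations such as `torch.nn.utils.spectral_norm` typically keep `u` between training steps and run a single power iteration per step rather than iterating to convergence.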

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.04003/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1907.04003/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/1907.04003/full.md

---
Source: https://tomesphere.com/paper/1907.04003