Scale Normalization

Henry Z. Lo; Kevin Amaral; Wei Ding

arXiv:1604.07796 · cs.NE · April 27, 2016

TL;DR

This paper investigates whether preserving scale, or isometry, between the layers of a deep neural network matters beyond initialization, and proposes methods that speed up training by maintaining it during learning.

Contribution

It introduces two methods for maintaining isometry during training, one exact and one stochastic, and reports preliminary experiments in which both speed up learning.

Findings

01 Preserving scale speeds up training.

02 Results suggest that isometry is important in the early stages of learning.

03 Maintaining isometry during training leads to faster learning.

Abstract

One of the difficulties of training deep neural networks is caused by improper scaling between layers. Scaling issues introduce exploding / vanishing gradient problems, and have typically been addressed by careful scale-preserving initialization. We investigate the value of preserving scale, or isometry, beyond the initial weights. We propose two methods of maintaining isometry, one exact and one stochastic. Preliminary experiments show that both determinant normalization and scale normalization effectively speed up learning. Results suggest that isometry is important in the beginning of learning, and that maintaining it leads to faster learning.
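
The abstract names an exact and a stochastic way of keeping a layer's scale fixed but does not spell out either procedure. The sketch below is only one plausible reading, not the authors' algorithm. It assumes the exact variant rescales a square weight matrix by a scalar so that its determinant has absolute value 1, and that the stochastic variant rescales the matrix so that random Gaussian inputs keep their length on average. The function names, the Gaussian probes, and the `num_samples` parameter are all hypothetical.

```python
import math
import random


def log_abs_det(w):
    """Return log|det(w)| for a square matrix given as a list of rows.

    Uses Gaussian elimination with partial pivoting on a copy of w.
    """
    n = len(w)
    a = [row[:] for row in w]
    total = 0.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        total += math.log(abs(a[col][col]))
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return total


def determinant_normalize(w):
    """Assumed 'exact' variant: scale w by a scalar so that |det(w)| = 1."""
    n = len(w)
    scale = math.exp(-log_abs_det(w) / n)
    return [[scale * v for v in row] for row in w]


def _matvec(w, x):
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


def _norm(x):
    return math.sqrt(sum(v * v for v in x))


def scale_normalize(w, rng, num_samples=64):
    """Assumed 'stochastic' variant: divide w by its average gain.

    The gain is the mean of ||w x|| / ||x|| over random Gaussian probes x.
    """
    n_in = len(w[0])
    ratios = []
    for _ in range(num_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(n_in)]
        ratios.append(_norm(_matvec(w, x)) / _norm(x))
    gain = sum(ratios) / len(ratios)
    return [[v / gain for v in row] for row in w]


# Usage: a badly scaled 4x4 layer, normalized both ways.
rng = random.Random(0)
w = [[rng.gauss(0.0, 3.0) for _ in range(4)] for _ in range(4)]
w_det = determinant_normalize(w)
w_scale = scale_normalize(w, rng)
```

In this reading, either function would be applied to each layer's weights after a gradient step, so that the scale set at initialization does not drift as training proceeds. The determinant version needs a square, non-singular matrix, while the sampled version also works for rectangular layers at the cost of some noise.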

Taxonomy

Topics: Neural Networks and Applications · Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis