Analysis on Gradient Propagation in Batch Normalized Residual Networks

Abhishek Panigrahi; Yueru Chen; C.-C. Jay Kuo

arXiv:1812.00342 · cs.LG · December 4, 2018 · 5 cites

TL;DR

This paper provides a mathematical analysis of how batch normalization influences gradient propagation in residual networks, demonstrating its role in preventing gradient vanishing or explosion during training.

Contribution

It offers a theoretical understanding of BN's effect on gradient variance in residual networks, highlighting its importance in stable training.

Findings

01. BN confines gradient variance across residual blocks

02. Prevents gradient vanishing/explosion in residual networks

03. Shows the relative importance of BN in residual branches

Abstract

We present a mathematical analysis of the effect of batch normalization (BN) on gradient backpropagation in residual network training. BN is believed to play a critical role in addressing the gradient vanishing/explosion problem. By analyzing the mean and variance behavior of the input in the forward pass and of the gradient in the backward pass through the BN and residual branches, we show that BN and the residual branches work together to confine the gradient variance to a certain range across residual blocks in backpropagation. As a result, the gradient vanishing/explosion problem is avoided. We also show the relative importance of batch normalization w.r.t. the residual branches in residual networks.
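
The claim about gradient variance can be probed numerically with a toy experiment. The sketch below is not the paper's setup or derivation: it assumes a fully connected residual block of the form x + relu(BN(x)) W with He-initialized weights, BN without learned scale or shift, and a random Gaussian gradient injected at the output. The function names (`bn_forward`, `bn_backward`, `gradient_variance_ratio`) and all hyperparameters are illustrative choices. It returns the ratio of the gradient variance at the network input to the gradient variance at the output, with and without BN in the residual branch.

```python
import numpy as np


def bn_forward(x, eps=1e-5):
    """Batch-normalize each feature over the batch axis (no scale/shift)."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    inv_std = 1.0 / np.sqrt(var + eps)
    xhat = (x - mu) * inv_std
    return xhat, inv_std


def bn_backward(dxhat, xhat, inv_std):
    """Gradient of the loss w.r.t. the BN input, given the gradient w.r.t. xhat."""
    n = dxhat.shape[0]
    return inv_std / n * (
        n * dxhat
        - dxhat.sum(axis=0)
        - xhat * (dxhat * xhat).sum(axis=0)
    )


def gradient_variance_ratio(use_bn, depth=20, width=128, batch=256, seed=0):
    """Return Var(grad at input) / Var(grad at output) for a toy residual net.

    Assumed block (illustrative only): x_next = x + relu(BN(x)) @ W,
    or x_next = x + relu(x) @ W when use_bn is False.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((batch, width))
    cache = []

    # Forward pass: store what the backward pass needs for each block.
    for _ in range(depth):
        w = rng.standard_normal((width, width)) * np.sqrt(2.0 / width)  # He init
        if use_bn:
            z, inv_std = bn_forward(x)
        else:
            z, inv_std = x, None
        mask = z > 0  # ReLU activation pattern
        x = x + (z * mask) @ w
        cache.append((w, z, inv_std, mask))

    # Backward pass: start from a random unit-variance gradient at the output.
    g = rng.standard_normal((batch, width))
    var_out = g.var()
    for w, z, inv_std, mask in reversed(cache):
        dz = (g @ w.T) * mask          # through the linear layer and ReLU
        if use_bn:
            dz = bn_backward(dz, z, inv_std)
        g = g + dz                     # identity shortcut plus residual branch
    return g.var() / var_out


if __name__ == "__main__":
    print("without BN:", gradient_variance_ratio(use_bn=False))
    print("with BN:   ", gradient_variance_ratio(use_bn=True))
```

Under these assumptions, the ratio without BN should grow rapidly with `depth`, while the ratio with BN should stay far smaller. The exact numbers depend on the seed, width, and batch size, so the output is only an illustration of the trend the abstract describes.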

Taxonomy

Topics: Geophysical Methods and Applications · Ultrasonics and Acoustic Wave Propagation · Rock Mechanics and Modeling

Methods: Batch Normalization