To Fold or Not to Fold: a Necessary and Sufficient Condition on Batch-Normalization Layers Folding
Edouard Yvinec, Arnaud Dapogny, Kevin Bailly

TL;DR
This paper establishes a necessary and sufficient condition for optimal batch-normalization folding in neural networks, providing an algorithm that significantly reduces inference time by more effectively removing BN layers.
Contribution
It introduces a new theoretical condition for BN folding and an optimal algorithm that outperforms existing methods in reducing inference time.
Findings
The proposed method outperforms existing BN folding approaches.
It significantly reduces inference time in deep neural networks.
The approach systematically identifies the maximum number of BN layers that can be removed.
Abstract
Batch-Normalization (BN) layers have become fundamental components in ever more complex deep neural network architectures. Such models require acceleration processes for deployment on edge devices. However, BN layers add computation bottlenecks because operations are processed sequentially. Thus, a key yet often overlooked component of the acceleration process is BN layer folding. In this paper, we demonstrate that current BN folding approaches are suboptimal in terms of how many layers can be removed. We therefore provide a necessary and sufficient condition for BN folding and a corresponding optimal algorithm. The proposed approach systematically outperforms existing baselines and dramatically reduces the inference time of deep neural networks.
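For readers unfamiliar with the term, the sketch below shows what folding means in the simplest textbook case: an inference-time BN layer that directly follows a linear layer can be absorbed into that layer's weights and bias, so the BN layer disappears from the graph. This is only an illustration under stated assumptions; it is not the paper's necessary and sufficient condition or its algorithm, which are not described on this page. The function name `fold_bn_into_linear`, the tensor shapes, and the value of `eps` are all hypothetical choices made for the example.

```python
import numpy as np


def fold_bn_into_linear(W, b, gamma, beta, mean, var, eps=1e-5):
    """Fold an inference-time BN layer into the linear layer that precedes it.

    Assumes the pair computes
        y = gamma * ((W @ x + b) - mean) / sqrt(var + eps) + beta
    with one BN statistic per output channel (row of W).
    """
    scale = gamma / np.sqrt(var + eps)       # per-output-channel multiplier
    W_folded = W * scale[:, None]            # rescale each output row of W
    b_folded = (b - mean) * scale + beta     # absorb the shift into the bias
    return W_folded, b_folded


# Illustrative random parameters (not taken from any real model).
rng = np.random.default_rng(0)
eps = 1e-5
W = rng.normal(size=(4, 3))                  # 3 inputs -> 4 outputs
b = rng.normal(size=4)
gamma = rng.normal(size=4)
beta = rng.normal(size=4)
mean = rng.normal(size=4)
var = rng.uniform(0.5, 2.0, size=4)          # running variance, strictly positive
x = rng.normal(size=(8, 3))                  # batch of 8 inputs

# Reference: linear layer followed by a separate BN layer.
reference = gamma * ((x @ W.T + b) - mean) / np.sqrt(var + eps) + beta

# Folded: a single linear layer with adjusted parameters.
W_f, b_f = fold_bn_into_linear(W, b, gamma, beta, mean, var, eps)
folded = x @ W_f.T + b_f

print(np.allclose(reference, folded))
```

The two outputs agree up to floating-point error because BN at inference time is an affine map per channel, and composing two affine maps yields another affine map. Which BN layers in a full network admit such a rewrite is the question the paper's condition addresses.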
Taxonomy
Topics: Neural Networks and Applications · Advanced Neural Network Applications · Advanced Memory and Neural Computing
