Till the Layers Collapse: Compressing a Deep Neural Network through the Lenses of Batch Normalization Layers
Zhu Liao, Nour Hezbri, Victor Quétu, Van-Tam Nguyen, Enzo Tartaglione

TL;DR
This paper introduces TLC (Till the Layers Collapse), a method that compresses deep neural networks by reducing their depth, using their batch normalization layers as the lens for deciding what to compress. The shallower networks need less computation and run with lower latency; the method is validated on Swin-T, MobileNet-V2, and RoBERTa.
Contribution
The paper presents a novel depth reduction technique called TLC that leverages batch normalization layers to effectively compress neural networks.
Findings
Reduces model depth and computational requirements
Validated on popular models such as Swin-T, MobileNet-V2, and RoBERTa
Lowers overall latency on both image classification and NLP tasks
Abstract
Today, deep neural networks are widely used because they can handle a variety of complex tasks. Their generality makes them very powerful tools in modern technology. However, deep neural networks are often overparameterized, and using these large models consumes substantial computational resources. In this paper, we introduce a method called Till the Layers Collapse (TLC), which compresses deep neural networks through the lenses of batch normalization layers. By reducing the depth of these networks, our method decreases their computational requirements and overall latency. We validate our method on popular models such as Swin-T, MobileNet-V2, and RoBERTa, across both image classification and natural language processing (NLP) tasks.
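The abstract does not spell out the algorithm, so the following is only a minimal sketch of why removing depth can be lossless in the simplest case. It is not the TLC procedure. It assumes a hypothetical block of the form linear layer, batch normalization (inference mode), linear layer, with no non-linearity in between. Under that assumption, the batch normalization can be folded into the first linear layer, and the two linear layers can then be merged into one. All function names here are illustrative.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def linear(w: Matrix, b: Vector, x: Vector) -> Vector:
    """Apply an affine layer: y = W x + b."""
    return [sum(wij * xj for wij, xj in zip(row, x)) + bi for row, bi in zip(w, b)]


def fold_batchnorm(w: Matrix, b: Vector, gamma: Vector, beta: Vector,
                   mean: Vector, var: Vector, eps: float = 1e-5) -> Tuple[Matrix, Vector]:
    """Fold inference-time batch norm into the preceding affine layer.

    BN(z) = gamma * (z - mean) / sqrt(var + eps) + beta with z = W x + b
    is itself affine in x, so it can be absorbed into W and b.
    """
    scale = [g / math.sqrt(v + eps) for g, v in zip(gamma, var)]
    w_folded = [[s * wij for wij in row] for s, row in zip(scale, w)]
    b_folded = [s * (bi - m) + be for s, bi, m, be in zip(scale, b, mean, beta)]
    return w_folded, b_folded


def collapse_linear(w1: Matrix, b1: Vector, w2: Matrix, b2: Vector) -> Tuple[Matrix, Vector]:
    """Merge two stacked affine layers (no activation between them) into one.

    W2 (W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2)
    """
    cols = len(w1[0])
    w = [[sum(w2[i][k] * w1[k][j] for k in range(len(w1))) for j in range(cols)]
         for i in range(len(w2))]
    b = [sum(w2[i][k] * b1[k] for k in range(len(b1))) + b2[i] for i in range(len(w2))]
    return w, b
```

With a non-linearity between the layers this merge is no longer exact, which is why depth reduction in practice needs a criterion for which layers can be linearized with little accuracy loss. According to the abstract, TLC derives that view from the batch normalization layers.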
Taxonomy
Topics: Neural Networks and Applications
Methods: Attention Is All You Need · Linear Layer · Dropout · Multi-Head Attention · Residual Connection · Adam · Layer Normalization · Weight Decay · Softmax
