Learning Strict Identity Mappings in Deep Residual Networks
Xin Yu, Zhiding Yu, Srikumar Ramalingam

TL;DR
This paper introduces epsilon-ResNet, a method that automatically discards redundant layers in deep residual networks, reducing parameters significantly with minimal performance loss across multiple visual datasets.
Contribution
Proposes epsilon-ResNet, a simple, training-based layer selection method that reduces network complexity without additional variables or extensive hyper-parameter tuning.
Findings
Achieves up to 80% reduction in parameters.
Maintains comparable performance on CIFAR-10, CIFAR-100, SVHN, and ImageNet.
Requires only a few extra rectified linear units.
Abstract
A family of super deep networks, referred to as residual networks or ResNet, achieved record-beating performance in various visual tasks such as image recognition, object detection, and semantic segmentation. The ability to train very deep networks naturally pushed researchers to use enormous resources to achieve the best performance. Consequently, in many applications super deep residual networks were employed for just a marginal improvement in performance. In this paper, we propose epsilon-ResNet, which allows us to automatically discard redundant layers, that is, layers that produce responses smaller than a threshold epsilon, with marginal or no loss in performance. The epsilon-ResNet architecture can be achieved using a few additional rectified linear units in the original ResNet. Our method uses neither additional variables nor numerous trials like other hyper-parameter…
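The gating idea in the abstract can be illustrated with a minimal sketch. It assumes a residual block computes x + F(x) and that the branch output F(x) is zeroed whenever every response has magnitude at most epsilon, so the block reduces to an identity mapping. The function names, the constant `big`, and this particular arrangement of four ReLUs are assumptions made for illustration, not details taken from the text above.

```python
def relu(v):
    """Rectified linear unit on a scalar."""
    return max(v, 0.0)


def epsilon_gate(responses, eps, big=1e6):
    """Return 0.0 if every |response| <= eps, else (approximately) 1.0.

    Built from four ReLU applications; `big` is a hypothetical large
    constant that turns the soft excess into a near-binary switch.
    """
    # ReLUs 1 and 2: how far each response lies outside [-eps, eps].
    total_excess = sum(relu(r - eps) + relu(-r - eps) for r in responses)
    # ReLU 3: equals 1 when nothing exceeds eps, 0 once the excess is >= 1/big.
    closed = relu(1.0 - big * total_excess)
    # ReLU 4: invert the switch, so 0 means "discard the branch".
    return relu(1.0 - big * closed)


def epsilon_residual_block(x, responses, eps):
    """Compute x + gate * F(x), where `responses` stands in for F(x)."""
    gate = epsilon_gate(responses, eps)
    return [xi + gate * ri for xi, ri in zip(x, responses)]


# Branch with only tiny responses: the block acts as a strict identity.
identity_out = epsilon_residual_block([1.0, 2.0], [0.01, -0.02], eps=0.1)
# Branch with one large response: the block behaves like a normal residual block.
active_out = epsilon_residual_block([1.0, 2.0], [0.5, -0.02], eps=0.1)
```

In this sketch, a block whose gate stays at zero contributes nothing beyond the skip connection, which is what would make it a candidate for removal after training. Excess values between 0 and 1/big give an intermediate gate, so the switch is only approximately binary.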
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
Methods: Average Pooling · Global Average Pooling · 1x1 Convolution · Batch Normalization · Bottleneck Residual Block · Max Pooling · Kaiming Initialization · Residual Connection · Convolution
