Elastic Neural Networks for Classification
Yi Zhou, Yue Bai, Shuvra S. Bhattacharyya, Heikki Huttunen

TL;DR
This paper introduces Elastic Neural Networks, a framework that inserts intermediate output branches to mitigate vanishing gradients, improve accuracy across various architectures, and allow adjustable computational complexity based on resource constraints.
Contribution
The paper proposes an elastic network framework that attaches intermediate outputs to a deep neural network, both to counter vanishing gradients during training and to make the computational cost of the resulting network adjustable.
Findings
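Improves accuracy on the CIFAR10 and CIFAR100 datasets for several well-known networks, though not for every network type.
Effective for both shallow networks (e.g., MobileNet) and deep convolutional networks (e.g., DenseNet).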
Computational complexity can be elastically adjusted based on resource needs.
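One way to picture the adjustable complexity is to run the network only up to the deepest intermediate output that fits a compute budget. The sketch below is an illustration under assumptions, not the authors' procedure: the function name `deepest_exit_within_budget`, the cost figures, and the selection rule are all hypothetical.

```python
def deepest_exit_within_budget(exit_costs, budget):
    """Return the index of the deepest output branch that fits the budget.

    exit_costs[i] is a hypothetical cumulative cost of running the network
    up to and including output branch i. Costs are assumed to increase
    with depth. Returns None if even the first branch is too expensive.
    """
    chosen = None
    for index, cost in enumerate(exit_costs):
        if cost <= budget:
            chosen = index
        else:
            break  # costs only grow with depth, so stop at the first miss
    return chosen


# Made-up cumulative costs for four output branches (illustrative units).
costs = [10, 25, 60, 140]
selected = deepest_exit_within_budget(costs, budget=70)
```

With these made-up numbers, a budget of 70 selects the third branch (index 2), while a budget below 10 selects none.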
Abstract
In this work, we propose a framework for improving the performance of any deep neural network that may suffer from vanishing gradients. To address the vanishing gradient issue, we study a framework in which we insert an intermediate output branch after each layer in the computational graph and use the corresponding prediction loss to feed gradients to the early layers. The framework, which we name Elastic network, is tested with several well-known networks on the CIFAR10 and CIFAR100 datasets, and the experimental results show that it improves accuracy on both shallow networks (e.g., MobileNet) and deep convolutional neural networks (e.g., DenseNet). We also identify the types of networks for which the framework does not improve performance and discuss the reasons. Finally, as a by-product, the computational complexity of the resulting networks can be…
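The training idea in the abstract, where each intermediate output contributes its own prediction loss, can be sketched as a sum of per-branch cross-entropy terms. This is a minimal pure-Python illustration, not the authors' implementation: the helper names, the equal default weights, and the toy logits are assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-likelihood of the true class under softmax(logits)."""
    return -math.log(softmax(logits)[label])


def elastic_loss(branch_logits, label, weights=None):
    """Weighted sum of the prediction losses of all output branches.

    branch_logits holds one logit vector per output, intermediate
    branches first and the final output last. Equal weights are an
    assumption made for this sketch.
    """
    if weights is None:
        weights = [1.0] * len(branch_logits)
    return sum(w * cross_entropy(z, label)
               for w, z in zip(weights, branch_logits))


# Toy example: two intermediate branches plus the final output, 3 classes.
outputs = [[0.2, 0.1, 0.0], [0.1, 1.0, 0.3], [0.0, 3.0, 0.5]]
total = elastic_loss(outputs, label=1)
```

Because every branch adds a loss term computed close to the layer it is attached to, early layers receive a gradient signal that does not have to travel back through the full depth of the network.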
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
