Mitigating severe over-parameterization in deep convolutional neural networks through forced feature abstraction and compression with an entropy-based heuristic

Nidhi Gowdra; Roopak Sinha; Stephen MacDonell; Wei Qi Yan

arXiv:2106.14190 · cs.CV · June 29, 2021

TL;DR

This paper introduces an entropy-based heuristic to limit CNN depth, reducing over-parameterization and training time while maintaining or improving accuracy across multiple datasets and architectures.

Contribution

The proposed EBCLE heuristic effectively constrains CNN depth based on data entropy, enhancing resource utilization and model efficiency without performance loss.

Findings

01 Reduces training time by up to 78.59%

02 Maintains or improves classification accuracy

03 Effective across multiple datasets and architectures

Abstract

Convolutional Neural Networks (CNNs) such as ResNet-50, DenseNet-40 and ResNeXt-56 are severely over-parameterized, which increases the computational resources required for model training; this cost scales exponentially with increments in model depth. In this paper, we propose an Entropy-Based Convolutional Layer Estimation (EBCLE) heuristic that is robust and simple, yet effective in resolving over-parameterization with regard to the network depth of CNN models. The EBCLE heuristic employs a priori knowledge of the entropic data distribution of input datasets to determine an upper bound for convolutional network depth, beyond which identity transformations are prevalent and contribute insignificantly to model performance. Restricting depth redundancies by forcing feature compression and abstraction restricts over-parameterization while…
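
The abstract refers to a priori knowledge of the entropic data distribution of the input dataset. The excerpt here does not state how that entropy is measured or how it is mapped to a depth bound, so the following is only a minimal sketch of one plausible input quantity: the Shannon entropy of an 8-bit image's intensity histogram, averaged over a dataset. The function names `shannon_entropy_bits` and `mean_dataset_entropy` are hypothetical, and the step from entropy to a layer count is deliberately left out because it is not given in this text.

```python
import numpy as np


def shannon_entropy_bits(image: np.ndarray, levels: int = 256) -> float:
    """Shannon entropy, in bits per pixel, of an image's intensity histogram.

    Assumption: pixel values are integers in [0, levels). This is a generic
    entropy estimate, not necessarily the measure used by EBCLE.
    """
    counts = np.bincount(image.ravel().astype(np.int64), minlength=levels)
    probs = counts[counts > 0] / counts.sum()  # drop empty bins to avoid log(0)
    return float(np.sum(probs * np.log2(1.0 / probs)))


def mean_dataset_entropy(images) -> float:
    """Average per-image entropy over an iterable of images (hypothetical helper)."""
    return float(np.mean([shannon_entropy_bits(img) for img in images]))


# Small illustrative inputs (synthetic, not from the paper).
rng = np.random.default_rng(0)
flat = np.full((32, 32), 128, dtype=np.uint8)                 # one intensity only
two_tone = np.zeros((4, 4), dtype=np.uint8)
two_tone[::2] = 255                                           # half 0, half 255
noisy = rng.integers(0, 256, size=(32, 32), dtype=np.uint8)   # near-uniform noise

print(shannon_entropy_bits(flat))      # constant image carries no information
print(shannon_entropy_bits(two_tone))  # two equally likely values
print(mean_dataset_entropy([flat, two_tone, noisy]))
```

A constant image yields zero entropy and a near-uniform noise image approaches the 8-bit maximum, which gives a feel for the range such a dataset statistic can take before any depth heuristic is applied.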

Taxonomy

Methods: Pointwise Convolution · Depthwise Convolution · Residual Connection · Grouped Convolution · Global Average Pooling · Sigmoid Activation · Depthwise Separable Convolution · Bottleneck Residual Block · Residual Block · Kaiming Initialization