Complexity-Aware Training of Deep Neural Networks for Optimal Structure Discovery
Valentin Frank Ingmar Guenter, Athanasios Sideris

TL;DR
This paper introduces a training-time pruning algorithm for deep neural networks that balances accuracy and complexity, automatically discovering optimal network structures without pre-training.
Contribution
The authors propose a novel stochastic optimization method for combined unit and layer pruning during training, with an interpretable parameter set and theoretical convergence guarantees.
Findings
Improved pruning ratios and test accuracy on CIFAR-10/100 and ImageNet.
Outperforms layer-only or unit-only pruning methods.
Competitive with combined pruning algorithms that require pre-trained networks.
Abstract
We propose a novel algorithm for combined unit and layer pruning of deep neural networks that operates during training and does not require a pre-trained network. Our algorithm optimally trades off learning accuracy and pruning levels while balancing layer vs. unit pruning and computational vs. parameter complexity using only three user-defined parameters, which are easy to interpret and tune. We formulate a stochastic optimization problem over the network weights and the parameters of variational Bernoulli distributions for binary random variables, taking values 0 or 1, that scale the units and layers of the network. Optimal network structures are found as the solution to this optimization problem. Pruning occurs when a variational parameter converges to 0, rendering the corresponding structure permanently inactive, thus saving computations both during training and…
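The gating mechanism described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it covers only unit gates on a single dense layer, omits the training objective and the layer-level gates, and the names `gated_forward`, `prune_units`, and `theta` are hypothetical choices made for illustration. It shows only that a unit whose Bernoulli parameter has reached 0 is never sampled as active, so it can be removed from the weight matrix.

```python
import numpy as np

def gated_forward(x, W, theta, rng):
    """Dense ReLU layer whose output units are scaled by Bernoulli gates.

    x:     (batch, d_in) inputs
    W:     (d_in, d_out) weights
    theta: (d_out,) gate probabilities in [0, 1] (hypothetical name)
    """
    # Sample one binary gate per unit: 1 with probability theta, else 0.
    gates = (rng.random(theta.shape) < theta).astype(x.dtype)
    # Scale each unit's activation by its gate (0 switches the unit off).
    return np.maximum(x @ W, 0.0) * gates, gates

def prune_units(W, theta, tol=0.0):
    """Drop units whose gate probability is at or below tol (assumed rule)."""
    keep = theta > tol
    return W[:, keep], theta[keep], keep

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3))
W = rng.standard_normal((3, 5))
# Illustrative values: units 1 and 3 have a gate probability of exactly 0.
theta = np.array([1.0, 0.0, 0.7, 0.0, 1.0])

out, gates = gated_forward(x, W, theta, rng)
W_small, theta_small, keep = prune_units(W, theta)
```

Because the columns for units 1 and 3 are removed from `W`, later forward passes skip their computation entirely, which is the source of the training-time savings the abstract mentions.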
Taxonomy
Topics: Industrial Vision Systems and Defect Detection · Infrastructure Maintenance and Monitoring · Fault Detection and Control Systems
Methods: Average Pooling · Pruning · Max Pooling · Kaiming Initialization · Global Average Pooling · Convolution
