Layer Adaptive Node Selection in Bayesian Neural Networks: Statistical Guarantees and Implementation Details
Sanket Jantre, Shrijita Bhattacharya, Tapabrata Maiti

TL;DR
This paper introduces a Bayesian method with spike-and-slab priors for automatic node selection in neural networks, providing statistical guarantees and improved computational efficiency over edge-based sparsity approaches.
Contribution
It develops a layer-adaptive node selection framework with theoretical guarantees, relaxing previous assumptions and demonstrating superior empirical performance.
Findings
Outperforms edge selection in computational efficiency
Achieves layer-wise optimal node recovery
Provides theoretical guarantees for variational posterior consistency
Abstract
Sparse deep neural networks have proven to be efficient for predictive model building in large-scale studies. Although several works have studied theoretical and numerical properties of sparse neural architectures, they have primarily focused on edge selection. Sparsity through edge selection might be intuitively appealing; however, it does not necessarily reduce the structural complexity of a network. Pruning excess nodes instead leads to a structurally sparse network with significant computational speedup during inference. To this end, we propose a Bayesian sparse solution using spike-and-slab Gaussian priors to allow for automatic node selection during training. The use of a spike-and-slab prior alleviates the need for an ad hoc thresholding rule for pruning. In addition, we adopt a variational Bayes approach to circumvent the computational challenges of traditional Markov Chain…
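To make the abstract's distinction between node selection and edge selection concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function names, the inclusion probability, the slab standard deviation, and the bias-free two-layer ReLU network are all assumptions chosen for illustration. It draws one layer's weights with a single Bernoulli indicator per node, so that a node's whole set of incoming weights is either in the Gaussian slab or exactly zero, and then shows that dropping the inactive nodes shrinks the weight matrices without changing the output.

```python
import numpy as np


def sample_node_spike_slab(n_in, n_out, inclusion_prob, slab_std, rng):
    """Draw one layer's weights with a node-wise spike-and-slab Gaussian prior.

    Illustrative parametrization (an assumption, not taken from the paper):
    each output node has one Bernoulli indicator; an active node's incoming
    weights are Gaussian, an inactive node's incoming weights are all zero.
    """
    active = rng.random(n_out) < inclusion_prob           # one indicator per node
    slab = rng.normal(0.0, slab_std, size=(n_out, n_in))  # Gaussian slab draws
    weights = slab * active[:, None]                      # spike: zero the whole row
    return weights, active


def prune_nodes(w_in, w_out, active):
    """Remove inactive hidden nodes: rows of w_in and matching columns of w_out."""
    return w_in[active, :], w_out[:, active]


def relu(a):
    return np.maximum(a, 0.0)


rng = np.random.default_rng(0)
w1, active = sample_node_spike_slab(
    n_in=8, n_out=16, inclusion_prob=0.25, slab_std=1.0, rng=rng
)
w2 = rng.normal(size=(4, 16))  # dense next layer, hypothetical sizes
w1_small, w2_small = prune_nodes(w1, w2, active)

# Without biases, an inactive node outputs relu(0) = 0, so removing it
# leaves the network output unchanged while the matrices get smaller.
x = rng.normal(size=8)
full_output = w2 @ relu(w1 @ x)
pruned_output = w2_small @ relu(w1_small @ x)
print(w1.shape, "->", w1_small.shape)
print(np.allclose(full_output, pruned_output))
```

Zeroing individual edges would leave `w1` at its original shape, so a dense matrix multiply would cost the same; removing whole rows and columns is what reduces the inference cost in this sketch.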
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
Methods: Pruning
