Adaptive Variational Bayesian Inference for Sparse Deep Neural Network
Jincheng Bai, Qifan Song, Guang Cheng

TL;DR
This paper develops an adaptive variational Bayesian inference method for sparse deep neural networks that automatically selects network structure, achieving near-optimal contraction rates even in high-dimensional settings.
Contribution
It introduces an adaptive variational inference procedure that automatically chooses network structure, removing the need for prior knowledge of optimal complexity.
Findings
Achieves near-optimal contraction rates for smooth functions.
Automatically adapts to the unknown optimal network structure.
Logarithmic dependence on input dimension in sparse DNN models.
Abstract
In this work, we focus on variational Bayesian inference on the sparse Deep Neural Network (DNN) modeled under a class of spike-and-slab priors. Given a pre-specified sparse DNN structure, the corresponding variational posterior contraction rate is characterized that reveals a trade-off between the variational error and the approximation error, which are both determined by the network structural complexity (i.e., depth, width and sparsity). However, the optimal network structure, which strikes the balance of the aforementioned trade-off and yields the best rate, is generally unknown in reality. Therefore, our work further develops an adaptive variational inference procedure that can automatically select a reasonably good (data-dependent) network structure that achieves the best contraction rate, without knowing the optimal network structure. In particular, when the true function…
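To make the phrase "spike-and-slab prior" concrete, the following is a minimal sketch of the generic textbook form of such a prior over a flat vector of network weights: each weight is exactly zero with some probability (the spike) and is otherwise drawn from a continuous distribution (the slab). The Gaussian slab, the function name `sample_spike_and_slab`, and the parameters `inclusion_prob` and `slab_std` are assumptions made for illustration; they are not taken from the paper, whose exact prior class is not given in the excerpt above.

```python
import numpy as np


def sample_spike_and_slab(n_weights, inclusion_prob, slab_std, rng):
    """Draw weights from a generic spike-and-slab prior (illustrative only).

    Each weight is set to exactly zero (the spike) with probability
    1 - inclusion_prob; otherwise it is drawn from a zero-mean Gaussian
    slab with standard deviation slab_std. The Gaussian slab is an
    assumption of this sketch, not a detail taken from the paper.
    """
    included = rng.random(n_weights) < inclusion_prob   # Bernoulli inclusion indicators
    slab = rng.normal(0.0, slab_std, size=n_weights)    # candidate slab draws
    weights = np.where(included, slab, 0.0)             # zero out excluded weights
    return weights, included


rng = np.random.default_rng(0)
weights, included = sample_spike_and_slab(10_000, 0.1, 1.0, rng)
n_active = int(np.count_nonzero(weights))  # number of weights that survive the spike
```

Under this kind of prior, the number of non-zero weights (`n_active` here) plays the role of the sparsity level that, together with depth and width, the abstract describes as the network's structural complexity.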
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Anomaly Detection Techniques and Applications · Generative Adversarial Networks and Image Synthesis
