Variational Bayesian Dropout with a Hierarchical Prior
Yuhang Liu, Wenyong Dong, Lei Zhang, Dong Gong, Qinfeng Shi

TL;DR
This paper introduces a hierarchical Bayesian approach to variational dropout, addressing the ill-posed inference problem caused by improper priors and improving regularization and network compression performance.
Contribution
The paper proposes a hierarchical prior for variational dropout, which makes posterior inference well-posed and yields better regularization than variational dropout with the improper log-uniform prior.
Findings
The hierarchical prior improves the regularization performance of variational dropout.
VBD outperforms existing dropout methods in the paper's experiments.
VBD is also effective for network compression.
Abstract
Variational dropout (VD) is a generalization of Gaussian dropout, which aims at inferring the posterior of network weights based on a log-uniform prior on them to learn these weights as well as dropout rate simultaneously. The log-uniform prior not only interprets the regularization capacity of Gaussian dropout in network training, but also underpins the inference of such posterior. However, the log-uniform prior is an improper prior (i.e., its integral is infinite) which causes the inference of posterior to be ill-posed, thus restricting the regularization performance of VD. To address this problem, we present a new generalization of Gaussian dropout, termed variational Bayesian dropout (VBD), which turns to exploit a hierarchical prior on the network weights and infer a new joint posterior. Specifically, we implement the hierarchical prior as a zero-mean Gaussian distribution with…
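To make the abstract's two central notions concrete, here is a minimal sketch in plain Python. It assumes the standard formulation of Gaussian dropout (multiplying a weight by noise drawn from N(1, alpha)) and the usual form of the log-uniform prior, p(|w|) proportional to 1/|w|. The function names `gaussian_dropout` and `log_uniform_mass` are hypothetical and chosen only for illustration; this is not the authors' implementation, and it does not attempt the hierarchical prior itself, since the abstract above is truncated before specifying it.

```python
import math
import random


def gaussian_dropout(w: float, alpha: float, rng: random.Random) -> float:
    """Perturb a weight with multiplicative Gaussian noise drawn from N(1, alpha).

    alpha plays the role of the dropout rate parameter; alpha = 0 leaves w unchanged.
    """
    noise = rng.gauss(1.0, math.sqrt(alpha))
    return w * noise


def log_uniform_mass(a: float, b: float) -> float:
    """Unnormalized mass that p(|w|) ~ 1/|w| assigns to the interval [a, b].

    The integral of 1/x from a to b is log(b / a).
    """
    if not (0.0 < a < b):
        raise ValueError("require 0 < a < b")
    return math.log(b / a)


# Widen the interval [10^-k, 10^k]: the mass grows like 2k * ln(10) without bound,
# so no finite normalizing constant exists. This is what "improper" means here.
masses = [log_uniform_mass(10.0 ** -k, 10.0 ** k) for k in (1, 2, 4, 8)]

# Noisy copies of a single weight; their average stays close to the original value.
rng = random.Random(0)
samples = [gaussian_dropout(2.0, 0.25, rng) for _ in range(10000)]
sample_mean = sum(samples) / len(samples)
```

Because the mass keeps growing as the interval widens, the prior cannot be normalized into a probability distribution, which is the property the abstract identifies as making posterior inference ill-posed.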
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning
Methods: Dropout
