Minimal Random Code Learning with Mean-KL Parameterization
Jihao Andreas Lin, Gergely Flamich, José Miguel Hernández-Lobato

TL;DR
This paper introduces a Mean-KL parameterization for compressing variational Bayesian neural networks with MIRACLE. Compared with the conventional mean-variance (Mean-Var) parameterization, it reports faster convergence, improved robustness, and more meaningful variational distributions.
Contribution
It proposes a Mean-KL parameterization for MIRACLE that simplifies training and enhances robustness and interpretability of compressed neural network weights.
Findings
Faster convergence in variational training.
More robust and meaningful weight distributions.
Improved compression performance with heavier tails.
Abstract
This paper studies the qualitative behavior and robustness of two variants of Minimal Random Code Learning (MIRACLE) used to compress variational Bayesian neural networks. MIRACLE implements a powerful, conditionally Gaussian variational approximation for the weight posterior $Q_{\mathbf{w}}$ and uses relative entropy coding to compress a weight sample from the posterior using a Gaussian coding distribution $P_{\mathbf{w}}$. To achieve the desired compression rate, $D_{\mathrm{KL}}[Q_{\mathbf{w}} \| P_{\mathbf{w}}]$ must be constrained, which requires a computationally expensive annealing procedure under the conventional mean-variance (Mean-Var) parameterization for $Q_{\mathbf{w}}$. Instead, we parameterize $Q_{\mathbf{w}}$ by its mean and KL divergence from $P_{\mathbf{w}}$ to constrain the compression cost to the desired value by construction. We demonstrate that variational…
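The idea of fixing the KL divergence "by construction" can be illustrated with a minimal sketch for a single scalar weight. It assumes a Gaussian variational distribution N(mu, sigma2) and a Gaussian coding distribution N(nu, rho2), and recovers the variance sigma2 from a chosen mean mu and a target KL value kappa (in nats). The function names and the defaults nu = 0, rho2 = 1 are hypothetical, and bisection is used here purely for simplicity (a closed form via the Lambert W function also exists); this is not presented as the authors' implementation.

```python
import math


def gaussian_kl(mu, sigma2, nu, rho2):
    """KL[N(mu, sigma2) || N(nu, rho2)] in nats for scalar Gaussians."""
    return 0.5 * (math.log(rho2 / sigma2) + (sigma2 + (mu - nu) ** 2) / rho2 - 1.0)


def variance_from_mean_kl(mu, kappa, nu=0.0, rho2=1.0, iters=200):
    """Return sigma2 <= rho2 with KL[N(mu, sigma2) || N(nu, rho2)] == kappa.

    Hypothetical helper: with s = sigma2 / rho2 and
    delta2 = (mu - nu)^2 / rho2, the KL equation reduces to
    s - log(s) = 2 * kappa + 1 - delta2, solved here by bisection.
    """
    delta2 = (mu - nu) ** 2 / rho2
    if kappa < 0.0 or delta2 > 2.0 * kappa:
        # The mean alone already costs delta2 / 2 nats, so it must fit the budget.
        raise ValueError("mean is too far from the coding distribution for this KL budget")
    c = 2.0 * kappa + 1.0 - delta2
    # f(s) = s - log(s) - c is decreasing on (0, 1], positive at exp(-c), <= 0 at 1.
    lo, hi = math.exp(-c), 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid - math.log(mid) > c:
            lo = mid  # f(mid) > 0, so the root lies to the right
        else:
            hi = mid
    return rho2 * 0.5 * (lo + hi)


# Usage: choose a mean and a KL budget, then derive the variance.
sigma2 = variance_from_mean_kl(mu=0.3, kappa=0.5)
achieved_kl = gaussian_kl(0.3, sigma2, 0.0, 1.0)
```

In this sketch the KL budget is satisfied for any feasible mean, so no penalty term or annealing schedule is needed to reach the target value. The equation has a second root with sigma2 > rho2; the sketch picks the narrower one, which is an assumption rather than something the abstract states.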
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning
