Discretely Relaxing Continuous Variables for tractable Variational Inference
Trefor W. Evans, Prasanth B. Nair

TL;DR
This paper introduces the DIRECT method for Bayesian variational inference with discrete latent variables, enabling exact ELBO gradient computation, efficient large-scale training, and fast inference on hardware-limited devices.
Contribution
The paper proposes DIRECT, an approach that exploits Kronecker matrix algebra to compute ELBO gradients exactly and to scale inference with discrete latent variables, avoiding the high-variance stochastic gradient estimators that earlier methods rely on.
Findings
Exact ELBO gradient computation eliminates high-variance estimators.
Training complexity is independent of dataset size, enabling large-scale inference.
Models using 4-bit quantized integers achieve accurate results with fast training.
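To make the findings on quantized weights and exact computation concrete, here is a minimal sketch of a single discrete latent variable restricted to 4-bit integers. The signed range, the `logits` values, and all variable names are illustrative assumptions, not taken from the paper. Because the support has only 16 values, moments under the variational distribution can be summed exactly, with no sampling noise.

```python
import numpy as np

# Assumed support: the 16 values of a signed 4-bit integer (the paper may use a different grid).
support = np.arange(-8, 8)

# Hypothetical variational parameters: unnormalized log-probabilities over the support.
logits = np.linspace(-1.0, 1.0, support.size)
probs = np.exp(logits - logits.max())
probs /= probs.sum()

# Exact moments: a finite sum over the support, so there is no estimator variance.
exact_mean = float(probs @ support)
exact_var = float(probs @ support**2 - exact_mean**2)

# Monte Carlo estimate of the same mean, shown only for contrast; it fluctuates with the seed.
rng = np.random.default_rng(0)
samples = rng.choice(support, size=1000, p=probs)
mc_mean = float(samples.mean())

print(exact_mean, exact_var, mc_mean)
```

Samples drawn this way are small integers, which is what makes low-precision storage and arithmetic possible at inference time.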
Abstract
We explore a new research direction in Bayesian variational inference with discrete latent variable priors where we exploit Kronecker matrix algebra for efficient and exact computations of the evidence lower bound (ELBO). The proposed "DIRECT" approach has several advantages over its predecessors: (i) it can exactly compute ELBO gradients (i.e., unbiased, zero-variance gradient estimates), eliminating the need for high-variance stochastic gradient estimators and enabling the use of quasi-Newton optimization methods; (ii) its training complexity is independent of the number of training points, permitting inference on large datasets; and (iii) its posterior samples consist of sparse and low-precision quantized integers which permit fast inference on hardware-limited devices. In addition, our DIRECT models can exactly compute statistical moments of the parameterized predictive posterior…
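The Kronecker identity that makes this kind of exact computation tractable can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: it assumes a fully factorized distribution over `b` discrete variables with `m` values each, and a function that is a product of per-variable terms. The helper names `kron_all` and `expectation_factored` are hypothetical. The identity (a ⊗ b)ᵀ(c ⊗ d) = (aᵀc)(bᵀd) lets an expectation over all m^b joint states be computed from b small inner products.

```python
from functools import reduce

import numpy as np


def kron_all(vectors):
    """Kronecker product of a list of 1-D arrays (length grows as m**b)."""
    return reduce(np.kron, vectors)


def expectation_factored(q_factors, f_factors):
    """Expectation of a product-form function under a factorized distribution.

    Uses (q1 ⊗ ... ⊗ qb)^T (f1 ⊗ ... ⊗ fb) = prod_i (qi^T fi),
    so the cost is O(b * m) instead of O(m**b).
    """
    return float(np.prod([q @ f for q, f in zip(q_factors, f_factors)]))


rng = np.random.default_rng(0)
b, m = 3, 4  # small enough that brute-force enumeration is still feasible

# Hypothetical per-variable probabilities and per-variable function values.
q_factors = [rng.dirichlet(np.ones(m)) for _ in range(b)]
f_factors = [rng.normal(size=m) for _ in range(b)]

# Brute force: build both length-m**b vectors explicitly and take their inner product.
brute = float(kron_all(q_factors) @ kron_all(f_factors))

# Factored: never forms the m**b vectors.
fast = expectation_factored(q_factors, f_factors)

print(brute, fast)
```

A function that is a sum of such product terms can be handled by applying the same identity to each term and adding the results, which keeps the computation exact while avoiding enumeration of the joint state space.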
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Bayesian Methods and Mixture Models · Machine Learning and Algorithms
