Mixtures of Sparse Autoregressive Networks
Marc Goessling, Yali Amit

TL;DR
This paper introduces a method for high-dimensional distribution estimation based on mixtures of sparse autoregressive networks. An L1 penalty on the conditional distributions and automatic parameter sharing between mixture components yield state-of-the-art or better results on several standard benchmark datasets.
Contribution
It combines sparsity, mixture modeling, and parameter sharing in autoregressive networks, giving a simple model that is fast to train and matches or exceeds the state of the art on several standard benchmark datasets.
Findings
Achieves state-of-the-art or better results on several standard benchmark datasets
Scales efficiently to very high dimensions
Provides exact likelihood evaluation with a simple distributed representation
Abstract
We consider high-dimensional distribution estimation through autoregressive networks. By combining the concepts of sparsity, mixtures and parameter sharing we obtain a simple model which is fast to train and which achieves state-of-the-art or better results on several standard benchmark datasets. Specifically, we use an L1-penalty to regularize the conditional distributions and introduce a procedure for automatic parameter sharing between mixture components. Moreover, we propose a simple distributed representation which permits exact likelihood evaluations since the latent variables are interleaved with the observable variables and can be easily integrated out. Our model achieves excellent generalization performance and scales well to extremely high dimensions.
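The exact-likelihood and L1-penalty ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes binary data, logistic conditionals, and a plain finite mixture (the abstract's interleaved latent variables and automatic parameter sharing are not modeled here). All names (`component_log_lik`, `mixture_log_lik`, `l1_penalized_objective`, `Ws`, `bs`, `log_pi`, `lam`) are hypothetical.

```python
import numpy as np


def log_sigmoid(z):
    # Numerically stable log(sigmoid(z)).
    return -np.logaddexp(0.0, -z)


def component_log_lik(X, W, b):
    """Log-likelihood of binary rows of X under one autoregressive component.

    Assumption: W is strictly lower triangular, so the logit for x_j
    depends only on x_1, ..., x_{j-1}.
    """
    logits = X @ W.T + b  # shape (n, d)
    ll = X * log_sigmoid(logits) + (1.0 - X) * log_sigmoid(-logits)
    return ll.sum(axis=1)


def mixture_log_lik(X, Ws, bs, log_pi):
    # Exact log p(x) = log sum_k pi_k * prod_j p(x_j | x_<j, k).
    comp = np.stack(
        [component_log_lik(X, W, b) for W, b in zip(Ws, bs)], axis=1
    )
    return np.logaddexp.reduce(comp + log_pi, axis=1)


def l1_penalized_objective(X, Ws, bs, log_pi, lam):
    # Mean negative log-likelihood plus an L1 penalty on the
    # autoregressive weights, which encourages sparse conditionals.
    nll = -mixture_log_lik(X, Ws, bs, log_pi).mean()
    return nll + lam * sum(np.abs(W).sum() for W in Ws)
```

Because every conditional is normalized and the mixture sums over a finite number of components, the probabilities of all binary vectors sum to one, which is what makes the likelihood exact in this sketch.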
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Domain Adaptation and Few-Shot Learning · Bayesian Methods and Mixture Models
