Reweighted Wake-Sleep
Jörg Bornschein, Yoshua Bengio

TL;DR
This paper introduces reweighted wake-sleep, an algorithm for training deep directed generative models that samples the latent variables multiple times from the inference network and weights the samples by importance sampling, yielding better gradient estimators and better likelihoods.
Contribution
It offers a novel interpretation of the wake-sleep algorithm through importance sampling and demonstrates that more powerful inference networks improve model performance.
Findings
Reweighted wake-sleep achieves higher likelihood estimates.
Using NADE as the inference network outperforms using sigmoidal belief networks.
Sampling the latent variables multiple times from the inference network gives better gradient estimators.
Abstract
Training deep directed graphical models with many hidden variables and performing inference remains a major challenge. Helmholtz machines and deep belief networks are such models, and the wake-sleep algorithm has been proposed to train them. The wake-sleep algorithm relies on training not just the directed generative model but also a conditional generative model (the inference network) that runs backward from visible to latent, estimating the posterior distribution of latent given visible. We propose a novel interpretation of the wake-sleep algorithm which suggests that better estimators of the gradient can be obtained by sampling latent variables multiple times from the inference network. This view is based on importance sampling as an estimator of the likelihood, with the approximate inference network as a proposal distribution. This interpretation is confirmed experimentally, showing…
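The importance-sampling view described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a toy one-layer model with binary units, a factorized Bernoulli prior p(h), a sigmoid decoder p(x|h), and a sigmoid inference network q(h|x) used as the proposal. The function names, the parameter tuple layout, and the sizes are all hypothetical. The estimator shown is the standard one, p(x) ≈ (1/K) Σ_k p(x, h_k) / q(h_k | x) with h_k sampled from q(h|x), computed in log space for numerical stability.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def log_bernoulli(v, p):
    # Log-probability of binary vector(s) v under independent Bernoulli(p),
    # summed over the last axis. eps guards against log(0).
    eps = 1e-12
    return np.sum(v * np.log(p + eps) + (1 - v) * np.log(1 - p + eps), axis=-1)

def log_likelihood_estimate(x, params, K, rng):
    """K-sample importance-sampling estimate of log p(x).

    params is a hypothetical tuple (W_q, b_q, prior_logits, W_p, b_p):
      q(h|x) = Bernoulli(sigmoid(x @ W_q + b_q))   # inference network (proposal)
      p(h)   = Bernoulli(sigmoid(prior_logits))    # generative prior
      p(x|h) = Bernoulli(sigmoid(h @ W_p + b_p))   # generative likelihood
    Returns the log-likelihood estimate and the normalized importance weights.
    """
    W_q, b_q, prior_logits, W_p, b_p = params

    # Draw K latent samples from the proposal q(h|x).
    q_probs = sigmoid(x @ W_q + b_q)                       # (n_hidden,)
    h = (rng.random((K, q_probs.shape[0])) < q_probs).astype(float)

    # Log importance weights: log p(x, h_k) - log q(h_k | x).
    log_q = log_bernoulli(h, q_probs)                      # (K,)
    log_prior = log_bernoulli(h, sigmoid(prior_logits))    # (K,)
    log_lik = log_bernoulli(x, sigmoid(h @ W_p + b_p))     # (K,)
    log_w = log_prior + log_lik - log_q

    # log of the mean weight, via the log-sum-exp trick.
    m = np.max(log_w)
    log_p_hat = m + np.log(np.mean(np.exp(log_w - m)))

    # Self-normalized weights (sum to 1), usable to weight per-sample gradients.
    w_norm = np.exp(log_w - m)
    w_norm /= w_norm.sum()
    return log_p_hat, w_norm

# Usage with small random parameters (illustrative values only).
rng = np.random.default_rng(0)
n_vis, n_hid = 4, 3
params = (
    0.1 * rng.standard_normal((n_vis, n_hid)), np.zeros(n_hid),
    np.zeros(n_hid),
    0.1 * rng.standard_normal((n_hid, n_vis)), np.zeros(n_vis),
)
x = np.array([1.0, 0.0, 1.0, 1.0])
log_p_hat, weights = log_likelihood_estimate(x, params, K=10, rng=rng)
print("estimated log p(x):", log_p_hat)
```

With K = 1 the estimate reduces to a single weight, so there is nothing to reweight. Larger K averages over more proposals from q(h|x), which is the sense in which sampling the latent variables multiple times is expected to help.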
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Topic Modeling · Model Reduction and Neural Networks
