Generative Text Modeling through Short Run Inference
Bo Pang, Erik Nijkamp, Tian Han, Ying Nian Wu

TL;DR
This paper introduces a novel inference method for latent variable text models using short run Langevin dynamics, which improves data modeling, prevents posterior collapse, and yields a well-structured latent space.
Contribution
It proposes performing inference with short run Langevin dynamics, a flexible approach that improves the quality of the latent space without requiring a separate inference network.
Findings
Models with short run dynamics outperform strong baselines.
No evidence of posterior collapse in the proposed method.
Latent space interpolation produces coherent sentences.
Abstract
Latent variable models for text, when trained successfully, accurately model the data distribution and capture global semantic and syntactic features of sentences. The prominent approach to train such models is variational autoencoders (VAE). It is nevertheless challenging to train and often results in a trivial local optimum where the latent variable is ignored and its posterior collapses into the prior, an issue known as posterior collapse. Various techniques have been proposed to mitigate this issue. Most of them focus on improving the inference model to yield latent codes of higher quality. The present work proposes a short run dynamics for inference. It is initialized from the prior distribution of the latent variable and then runs a small number (e.g., 20) of Langevin dynamics steps guided by its posterior distribution. The major advantage of our method is that it does not require…
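The inference procedure described in the abstract (initialize the latent code from the prior, then take a small number of Langevin steps guided by the posterior) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a toy linear-Gaussian decoder, x | z ~ N(Wz, sigma^2 I) with prior z ~ N(0, I), in place of the paper's text generator, so that the posterior gradient has a closed form. The function names, the matrix W, the noise level sigma, and the step size are all hypothetical; only the prior initialization and the step count of 20 come from the abstract.

```python
import numpy as np


def grad_log_posterior(z, x, W, sigma):
    """Gradient in z of log p(z) + log p(x | z) for the assumed toy model:
    z ~ N(0, I) and x | z ~ N(W z, sigma^2 I)."""
    return -z + W.T @ (x - W @ z) / sigma**2


def short_run_langevin(x, W, sigma, n_steps=20, step_size=0.3, rng=None):
    """Run a short Langevin chain that starts at a sample from the prior.

    Each step applies
        z <- z + (s^2 / 2) * grad_z log p(z | x) + s * eps,  eps ~ N(0, I),
    where s is the step size (an assumed value, not taken from the paper).
    """
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal(W.shape[1])  # initialize from the prior N(0, I)
    for _ in range(n_steps):
        eps = rng.standard_normal(z.shape)
        z = z + 0.5 * step_size**2 * grad_log_posterior(z, x, W, sigma) + step_size * eps
    return z


if __name__ == "__main__":
    # Hypothetical toy data: a 3-dimensional observation and a 2-dimensional latent.
    W = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
    x = np.array([1.0, -1.0, 0.5])
    z = short_run_langevin(x, W, sigma=0.5, rng=np.random.default_rng(0))
    print("inferred latent code:", z)
```

No encoder appears anywhere in the sketch: the chain only needs the gradient of the log joint density with respect to z. In a neural text model that gradient would presumably come from automatic differentiation through the generator rather than from a closed-form expression.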
Taxonomy
Topics: Topic Modeling · Generative Adversarial Networks and Image Synthesis · Natural Language Processing Techniques
