Generative Text Modeling through Short Run Inference

Bo Pang; Erik Nijkamp; Tian Han; Ying Nian Wu

arXiv:2106.02513·cs.LG·June 9, 2021

TL;DR

This paper introduces a novel inference method for latent variable text models using short run Langevin dynamics, which improves data modeling, prevents posterior collapse, and yields a well-structured latent space.

Contribution

It proposes a flexible, inference-only approach with short run dynamics that enhances latent space quality without needing a separate inference network.

Findings

01. Models with short run dynamics outperform strong baselines.

02. No evidence of posterior collapse in the proposed method.

03. Latent space interpolation produces coherent sentences.

Abstract

Latent variable models for text, when trained successfully, accurately model the data distribution and capture global semantic and syntactic features of sentences. The prominent approach to training such models is the variational autoencoder (VAE). VAEs are nevertheless challenging to train, and training often ends in a trivial local optimum where the latent variable is ignored and its posterior collapses into the prior, an issue known as posterior collapse. Various techniques have been proposed to mitigate this issue. Most of them focus on improving the inference model to yield latent codes of higher quality. The present work proposes short run dynamics for inference. The dynamics is initialized from the prior distribution of the latent variable and then runs a small number (e.g., 20) of Langevin dynamics steps guided by its posterior distribution. The major advantage of our method is that it does not require…
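
The inference procedure the abstract describes (start from a prior sample, then take about 20 Langevin steps guided by the posterior) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper's text generator is replaced here by an assumed one-dimensional linear Gaussian model, z ~ N(0, 1) and x | z ~ N(w z, sigma^2), chosen only because its posterior is known in closed form. The function names, the step size, and all parameter values are hypothetical.

```python
import random

def grad_log_joint(z, x, w, sigma):
    # d/dz [log p(z) + log p(x | z)] for the assumed toy model
    # z ~ N(0, 1), x | z ~ N(w * z, sigma^2).
    # This equals the gradient of log p(z | x), since p(x) does not depend on z.
    return -z + w * (x - w * z) / sigma ** 2

def short_run_inference(x, w=1.0, sigma=0.5, n_steps=20, step_size=0.3, rng=random):
    # Initialize the chain from the prior, as described in the abstract.
    z = rng.gauss(0.0, 1.0)
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        # One Langevin step: gradient drift plus Gaussian noise.
        z = z + 0.5 * step_size ** 2 * grad_log_joint(z, x, w, sigma) + step_size * noise
    return z

def posterior_mean(x, w=1.0, sigma=0.5):
    # Closed-form posterior mean of the toy model, used only as a sanity check.
    precision = 1.0 + w ** 2 / sigma ** 2
    return (w * x / sigma ** 2) / precision

# Run many independent short chains for one observation and average them.
rng = random.Random(0)
samples = [short_run_inference(2.0, rng=rng) for _ in range(2000)]
estimate = sum(samples) / len(samples)
```

In this toy setting the average of the short-run samples lands close to the exact posterior mean, even though each chain runs for only 20 steps. With a neural generator the gradient would typically come from automatic differentiation rather than a hand-written formula, and no closed-form posterior would be available for comparison.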

Code & Models

Repositories

bpucla/sri_text
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Generative Adversarial Networks and Image Synthesis · Natural Language Processing Techniques