Improved Variational Autoencoders for Text Modeling using Dilated Convolutions
Zichao Yang, Zhiting Hu, Ruslan Salakhutdinov, Taylor Berg-Kirkpatrick

TL;DR
This paper introduces dilated CNN decoders for variational autoencoders (VAEs) in text modeling. The change addresses the tendency of LSTM decoders to ignore the encoder's conditioning information, and the resulting models achieve better perplexity and labeling performance than LSTM-based VAEs.
Contribution
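It proposes a dilated CNN decoder architecture for VAEs in which the dilation configuration controls the effective context from previously generated words. This enables effective text generation and semi-supervised learning, with empirical improvements in perplexity and labeling.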
Findings
Dilated CNN decoders improve VAE text modeling performance.
There is a trade-off between the decoder's contextual capacity and the amount of encoding information it uses; the right decoder architecture balances the two.
VAE with dilated CNN outperforms LSTM-based VAEs on perplexity and labeling tasks.
Abstract
Recent work on generative modeling of text has found that variational auto-encoders (VAE) incorporating LSTM decoders perform worse than simpler LSTM language models (Bowman et al., 2015). This negative result is so far poorly understood, but has been attributed to the propensity of LSTM decoders to ignore conditioning information from the encoder. In this paper, we experiment with a new type of decoder for VAE: a dilated CNN. By changing the decoder's dilation architecture, we control the effective context from previously generated words. In experiments, we find that there is a trade-off between the contextual capacity of the decoder and the amount of encoding information used. We show that with the right decoder, VAE can outperform LSTM language models. We demonstrate perplexity gains on two datasets, representing the first positive experimental result on the use of VAE for generative…
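To make "effective context" concrete, the sketch below computes how many input positions a stack of dilated 1-D convolutions can see. It uses the standard receptive-field formula for stacked stride-1 convolutions. The kernel size and dilation lists are illustrative assumptions, not the configurations used in the paper, and `receptive_field` is a hypothetical helper name.

```python
def receptive_field(kernel_size, dilations):
    """Input positions visible to one output position of a stack of
    stride-1 dilated 1-D convolutions, one layer per entry in `dilations`.

    A layer with kernel size k and dilation d widens the span by (k - 1) * d.
    """
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Hypothetical decoder depths (assumed values, not taken from the paper).
shallow = receptive_field(kernel_size=3, dilations=[1, 2, 4])
deep = receptive_field(kernel_size=3, dilations=[1, 2, 4, 8, 16])

print(f"shallow stack sees {shallow} positions")
print(f"deep stack sees {deep} positions")
```

Under these assumptions, adding layers with larger dilations widens the context available to the decoder, which is the knob the abstract describes when it speaks of controlling the effective context from previously generated words.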
Taxonomy
Topics: Topic Modeling · Generative Adversarial Networks and Image Synthesis · Speech Recognition and Synthesis
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
