InitialGAN: A Language GAN with Completely Random Initialization
Da Ren, Qing Li

TL;DR
InitialGAN is a language GAN trained from completely random initialization. It addresses exposure bias without any pre-training and outperforms MLE-trained models in text generation quality.
Contribution
The paper presents InitialGAN, the first language GAN to outperform MLE without pre-training, using dropout sampling and fully normalized LSTM techniques.
Findings
InitialGAN surpasses MLE and other models in quality.
Proposes a new evaluation metric, Least Coverage Rate.
Demonstrates effective training from random initialization.
Abstract
Text generative models trained via Maximum Likelihood Estimation (MLE) suffer from the notorious exposure bias problem, and Generative Adversarial Networks (GANs) are shown to have potential to tackle this problem. Existing language GANs adopt estimators like REINFORCE or continuous relaxations to model word probabilities. The inherent limitations of such estimators lead current models to rely on pre-training techniques (MLE pre-training or pre-trained embeddings). Representation modeling methods which are free from those limitations, however, are seldom explored because of their poor performance in previous attempts. Our analyses reveal that invalid sampling methods and unhealthy gradients are the main contributors to such unsatisfactory performance. In this work, we present two techniques to tackle these problems: dropout sampling and fully normalized LSTM. Based on these two…
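The abstract mentions that existing language GANs model word probabilities with estimators such as REINFORCE. As a rough illustration of what such an estimator does (this is background only, not InitialGAN's method, which is described as avoiding these estimators), the sketch below compares a sampled REINFORCE gradient with the exact gradient of the expected reward for a single word choice. The four-word vocabulary, the logits, the reward values, and all function names are hypothetical and chosen only for the example.

```python
import math
import random


def softmax(logits):
    """Convert logits into a probability distribution over words."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def exact_gradient(logits, rewards):
    """Exact gradient of E[reward] w.r.t. each logit: p_k * (r_k - E[r])."""
    probs = softmax(logits)
    expected = sum(p * r for p, r in zip(probs, rewards))
    return [p * (r - expected) for p, r in zip(probs, rewards)]


def reinforce_gradient(logits, rewards, num_samples, rng):
    """Monte Carlo estimate: average of r(w) * d/dlogit_k log p(w)."""
    probs = softmax(logits)
    grad = [0.0] * len(logits)
    for _ in range(num_samples):
        # Sample one discrete word index from the current distribution.
        w = rng.choices(range(len(probs)), weights=probs)[0]
        for k in range(len(grad)):
            indicator = 1.0 if k == w else 0.0
            # d/dlogit_k log p(w) = 1[k == w] - p_k for a softmax.
            grad[k] += rewards[w] * (indicator - probs[k])
    return [g / num_samples for g in grad]


# Hypothetical toy setup: four candidate words, each with a made-up reward
# standing in for a discriminator score.
logits = [0.5, -0.2, 0.1, 0.0]
rewards = [1.0, 0.0, 0.3, 0.7]
rng = random.Random(0)

exact = exact_gradient(logits, rewards)
noisy = reinforce_gradient(logits, rewards, 10, rng)
better = reinforce_gradient(logits, rewards, 20000, rng)

print("exact :", [round(g, 4) for g in exact])
print("10    :", [round(g, 4) for g in noisy])
print("20000 :", [round(g, 4) for g in better])
```

With only a few samples the estimate can deviate noticeably from the exact gradient, and it tightens as the sample count grows. That sampling noise is one commonly cited limitation of score-function estimators for discrete text.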
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods
Methods: Sigmoid Activation · Tanh Activation · Dropout · REINFORCE · Long Short-Term Memory
