InitialGAN: A Language GAN with Completely Random Initialization

Da Ren; Qing Li

arXiv:2208.02531·cs.CL·July 19, 2023

Open Access

TL;DR

InitialGAN is a language GAN trained from completely random initialization. It addresses exposure bias without pre-training and outperforms traditional MLE methods in text generation quality.

Contribution

The paper presents InitialGAN, described as the first language GAN to outperform MLE without pre-training. It relies on two techniques: dropout sampling and a fully normalized LSTM.

Findings

01 InitialGAN surpasses MLE and other models in quality.

02 Proposes a new evaluation metric, Least Coverage Rate.

03 Demonstrates effective training from random initialization.

Abstract

Text generative models trained via Maximum Likelihood Estimation (MLE) suffer from the notorious exposure bias problem, and Generative Adversarial Networks (GANs) have been shown to have the potential to tackle this problem. Existing language GANs adopt estimators like REINFORCE or continuous relaxations to model word probabilities. The inherent limitations of such estimators lead current models to rely on pre-training techniques (MLE pre-training or pre-trained embeddings). Representation modeling methods, which are free from those limitations, are however seldom explored because of their poor performance in previous attempts. Our analyses reveal that invalid sampling methods and unhealthy gradients are the main contributors to such unsatisfactory performance. In this work, we present two techniques to tackle these problems: dropout sampling and fully normalized LSTM. Based on these two…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods

Methods: Sigmoid Activation · Tanh Activation · Dropout · REINFORCE · Long Short-Term Memory