A Semi-Supervised Text Generation Framework Combining a Deep Transformer and a GAN
Shengquan Wang

TL;DR
This paper presents a semi-supervised text generation framework that combines a deep Transformer with a GAN, leveraging Gumbel-Softmax for discrete token handling and improving text synthesis through data augmentation.
Contribution
It introduces a novel semi-supervised approach integrating a Transformer and GAN with Gumbel-Softmax, along with theoretical analysis of the min-max objective.
Findings
Enhanced text generation quality with semi-supervised learning
Effective use of GAN-generated samples for data augmentation
Theoretical validation of the min-max objective function
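For reference, the min-max objective named in the findings can be written in the standard GAN form below. This is a sketch under an assumption: the summary does not reproduce the paper's own derivation, so the notation (generator G, discriminator D, data distribution p_data, noise prior p_z, generator distribution p_g) is the conventional one and may differ from the paper's.

```latex
% Standard GAN value function (conventional notation, assumed here)
\min_{G}\max_{D} V(D,G)
  = \mathbb{E}_{x \sim p_{\mathrm{data}}}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_{z}}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]

% For a fixed generator, the optimal discriminator is
D^{*}(x) = \frac{p_{\mathrm{data}}(x)}{p_{\mathrm{data}}(x) + p_{g}(x)}

% Substituting D^{*} gives the generator's criterion
V(D^{*},G) = -\log 4 + 2\,\mathrm{JSD}\bigl(p_{\mathrm{data}} \,\|\, p_{g}\bigr)
```

Because the Jensen-Shannon divergence is non-negative and is zero only when the two distributions coincide, this form has its global minimum of -log 4 exactly when p_g = p_data.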
Abstract
This paper introduces a framework that connects a deep generative pre-trained Transformer language model with a generative adversarial network (GAN) for semi-supervised text generation. The proposed 24-layer model is first pre-trained without supervision on a large and diverse text corpus. A simple GAN architecture for synthetic text generation is then introduced, with Gumbel-Softmax applied to handle the discreteness of tokens. The paper also presents a semi-supervised approach in which real data is augmented with GAN samples and the merged dataset is used to fine-tune the Transformer. Detailed theoretical derivations are included, covering a proof for the min-max objective function and an extended discussion of the Gumbel-Softmax reparameterization trick.
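As an illustration of how Gumbel-Softmax handles token discreteness, the following minimal sketch draws a relaxed one-hot sample y_i = exp((l_i + g_i) / tau) / sum_j exp((l_j + g_j) / tau), where l are logits, g is Gumbel(0, 1) noise, and tau is a temperature. It is not the paper's implementation: the function name, the pure-Python formulation, and the example logits are assumptions made for illustration only.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature=1.0, rng=random):
    """Return one relaxed one-hot sample (a probability vector) over the logits.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise: g = -log(-log(u)) with u ~ Uniform(0, 1).
        u = rng.random()
        # Clamp u away from 0 and 1 so both logarithms stay finite.
        u = min(max(u, 1e-12), 1.0 - 1e-12)
        g = -math.log(-math.log(u))
        noisy.append((logit + g) / temperature)

    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(noisy)
    exps = [math.exp(x - peak) for x in noisy]
    total = sum(exps)
    return [e / total for e in exps]


# Example: three-token vocabulary with assumed logits.
rng = random.Random(0)
probs = gumbel_softmax_sample([2.0, 0.5, -1.0], temperature=0.5, rng=rng)
# The argmax of the relaxed sample gives a discrete token index.
token_id = max(range(len(probs)), key=probs.__getitem__)
```

Lower temperatures push the sample toward a one-hot vector, while higher temperatures make it smoother. In an autodiff framework the same expression is differentiable with respect to the logits, which is what allows discriminator gradients to reach the generator.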
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
