OptAGAN: Entropy-based finetuning on text VAE-GAN

Paolo Tirotta; Stefano Lodi

arXiv:2109.00239 · cs.CL · January 5, 2022

TL;DR

This paper introduces OptAGAN, a novel approach combining VAE-GANs with entropy-based reinforcement learning to improve text generation quality and diversity, leveraging pre-trained models BERT and GPT-2.

Contribution

It presents a new method for fine-tuning text VAE-GANs with entropy-based RL, enhancing text quality and diversity beyond existing models.

Findings

01. Significant improvement in text quality over state-of-the-art methods.
02. Enhanced diversity in generated texts due to entropy-based rewards.
03. Effective modeling of high-level sentence features and low-level word generation.

Abstract

Transfer learning through large pre-trained models has changed the landscape of current applications in natural language processing (NLP). Optimus, a recently released variational autoencoder (VAE) that combines two pre-trained models, BERT and GPT-2, has been shown to produce novel yet very human-looking text when combined with generative adversarial networks (GANs). The combination of Optimus and GANs avoids the troublesome application of GANs to the discrete domain of text and prevents the exposure bias of standard maximum likelihood methods. We combine the training of GANs in the latent space with the finetuning of the Optimus decoder for single word generation. This approach lets us model both the high-level features of the sentences and the low-level word-by-word generation. We finetune using reinforcement learning (RL) by exploiting the structure of GPT-2 and…
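
The abstract is cut off before it states the exact reward, so the following is only a minimal sketch of what an entropy-regularized policy-gradient (REINFORCE) loss for word-by-word generation can look like in general. It is not the paper's formulation: the function names, the scalar `reward`, and the weight `beta` are all hypothetical, and plain Python lists stand in for the decoder's per-step logits.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def entropy_regularized_loss(step_logits, sampled_tokens, reward, beta=0.01):
    """Generic REINFORCE loss for one sampled sequence, with an entropy bonus.

    step_logits    : list of per-step logit lists (hypothetical decoder output)
    sampled_tokens : index of the token sampled at each step
    reward         : scalar score for the whole sequence (assumed given)
    beta           : hypothetical weight of the entropy bonus
    """
    log_prob_sum = 0.0
    entropy_sum = 0.0
    for logits, token in zip(step_logits, sampled_tokens):
        probs = softmax(logits)
        log_prob_sum += math.log(probs[token])
        entropy_sum += entropy(probs)
    # Minimizing this raises the likelihood of rewarded samples while
    # keeping the per-step token distributions from collapsing.
    return -reward * log_prob_sum - beta * entropy_sum

# Usage: two decoding steps over a toy three-word vocabulary.
loss = entropy_regularized_loss(
    step_logits=[[2.0, 0.5, 0.1], [0.3, 0.3, 0.3]],
    sampled_tokens=[0, 2],
    reward=1.0,
    beta=0.1,
)
```

In this sketch, a larger `beta` weights the entropy term more heavily, which favors flatter token distributions and therefore more varied samples.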

Code & Models

Repositories

egojr/optagan (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Attention Is All You Need · Linear Layer · Cosine Annealing · Attention Dropout · Multi-Head Attention · Linear Warmup With Linear Decay · Linear Warmup With Cosine Annealing · WordPiece · Dense Connections · Discriminative Fine-Tuning