Why GANs are overkill for NLP

David Alvarez-Melis; Vikas Garg; Adam Tauman Kalai

arXiv:2205.09838 · cs.LG · May 23, 2022


Open Access

TL;DR

This paper offers a theoretical explanation for why GANs have seen less adoption in NLP than in computer vision: for sequential data such as text, maximizing likelihood is a more efficient way to minimize the same distinguishability criteria that adversarial models optimize.

Contribution

It introduces a novel theoretical framework demonstrating that likelihood maximization inherently minimizes distinguishability, explaining the limited success of GANs in NLP.

Findings

01

Likelihood maximization and distinguishability minimization are closely related.

02

Minimizing KL-divergence effectively reduces model distinguishability.

03

A new next-token distinguishability model enables a polynomial-time reduction from minimizing distinguishability to maximizing likelihood.
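
The link between the first two findings can be made concrete with two textbook facts. This is a sketch in assumed notation, not the paper's own statement: p is the data distribution, q the model, and f ranges over all distinguishers with outputs in [0, 1], whereas the paper considers restricted distinguisher families.

```latex
% KL-divergence splits into a term that does not depend on q and the expected log-likelihood of q:
\mathrm{KL}(p \,\|\, q)
  = \mathbb{E}_{x \sim p}[\log p(x)] - \mathbb{E}_{x \sim p}[\log q(x)]
\quad\Longrightarrow\quad
\arg\min_{q} \mathrm{KL}(p \,\|\, q) = \arg\max_{q} \mathbb{E}_{x \sim p}[\log q(x)].

% Pinsker's inequality bounds the advantage of any distinguisher f : X -> [0,1] by the KL-divergence:
\sup_{f} \bigl| \mathbb{E}_{x \sim p}[f(x)] - \mathbb{E}_{x \sim q}[f(x)] \bigr|
  = \mathrm{TV}(p, q)
  \le \sqrt{\tfrac{1}{2}\,\mathrm{KL}(p \,\|\, q)}.
```

Read together, the two lines say that raising expected log-likelihood lowers KL-divergence, which in turn caps how well any distinguisher can tell model samples from data.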

Abstract

This work offers a novel theoretical perspective on why, despite numerous attempts, adversarial approaches to generative modeling (e.g., GANs) have not been as popular for certain generation tasks, particularly sequential tasks such as Natural Language Generation, as they have in others, such as Computer Vision. In particular, on sequential data such as text, maximum-likelihood approaches are significantly more utilized than GANs. We show that, while it may seem that maximizing likelihood is inherently different than minimizing distinguishability, this distinction is largely artificial and only holds for limited models. We argue that minimizing KL-divergence (i.e., maximizing likelihood) is a more efficient approach to effectively minimizing the same distinguishability criteria that adversarial models seek to optimize. Reductions show that minimizing distinguishability can be seen as…
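
A small numerical illustration of the claim that lowering KL-divergence lowers distinguishability, under stated assumptions: the distributions `p` and `q0` and the helper `mixture` are hypothetical toy values chosen for this sketch, and total variation distance stands in for the advantage of the best unrestricted distinguisher. It is not the paper's construction.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two categorical distributions over the same support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def total_variation(p, q):
    """Total variation distance: the largest advantage any distinguisher can have."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

def mixture(p, q, lam):
    """Hypothetical helper: move the model q a fraction lam of the way toward p."""
    return [(1 - lam) * qi + lam * pi for pi, qi in zip(p, q)]

p = [0.5, 0.3, 0.2]   # toy "data" distribution over three tokens (assumed)
q0 = [0.2, 0.2, 0.6]  # toy initial model distribution (assumed)

results = []
for lam in (0.0, 0.5, 0.9):
    q = mixture(p, q0, lam)
    kl = kl_divergence(p, q)
    tv = total_variation(p, q)
    results.append((lam, kl, tv))
    # Pinsker's inequality guarantees tv <= sqrt(kl / 2).
    print(f"lam={lam:.1f}  KL={kl:.4f}  TV={tv:.4f}  bound={math.sqrt(kl / 2):.4f}")
```

As the model moves toward the data distribution, both columns shrink, and the total variation distance stays below the Pinsker bound at every step.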

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Adversarial Robustness in Machine Learning

Methods: Softmax