# Improved Training of Mixture-of-Experts Language GANs

**Authors:** Yekun Chai, Qiyue Yin, Junge Zhang

arXiv: 2302.11875 · 2023-02-24

## TL;DR

This paper improves language GAN training by combining a mixture-of-experts generator with Feature Statistics Alignment (FSA), and reports better quantitative results on synthetic and real text-generation benchmarks.

## Contribution

The paper empirically shows that a mixture-of-experts approach enhances the representation capacity of the generator, and it uses Feature Statistics Alignment (FSA) to give the generator finer-grained learning signals than the discriminator alone provides.

## Key findings

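- Mixture-of-experts enhances the representation capacity of the generator
- Feature Statistics Alignment (FSA) gives the generator finer-grained learning signals by pulling the mean feature statistics of fake data toward those of real samples
- Superior quantitative performance on synthetic and real benchmarks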
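
To make the first finding concrete, here is a minimal sketch of a generic mixture-of-experts output layer for a single generation step. It is not the paper's architecture, which this summary does not describe. The function name `moe_token_distribution`, the dense softmax gate, the weight shapes, and the choice to mix the experts' token distributions are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def moe_token_distribution(hidden, gate_w, expert_w):
    """Mix K expert token distributions with a learned gate (hypothetical layer).

    hidden:   (d,)      generator hidden state at one time step
    gate_w:   (d, K)    gating weights, one column per expert
    expert_w: (K, d, V) one output projection per expert
    returns:  (V,)      probability distribution over the vocabulary
    """
    gate = softmax(hidden @ gate_w)                           # (K,) expert weights
    expert_logits = np.einsum("d,kdv->kv", hidden, expert_w)  # (K, V)
    expert_probs = softmax(expert_logits, axis=-1)            # each row sums to 1
    return gate @ expert_probs                                # convex combination

# Toy sizes chosen only for the demonstration.
rng = np.random.default_rng(0)
d, K, V = 8, 4, 10
probs = moe_token_distribution(
    rng.normal(size=d),
    rng.normal(size=(d, K)),
    rng.normal(size=(K, d, V)),
)
```

Because the gate weights are non-negative and sum to one, the result is a convex combination of valid distributions and is itself a valid distribution over the vocabulary.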

## Abstract

Despite the dramatic success in image generation, Generative Adversarial Networks (GANs) still face great challenges in synthesizing sequences of discrete elements, in particular human language. The difficulty in generator training arises from the limited representation capacity and uninformative learning signals obtained from the discriminator. In this work, we (1) first empirically show that the mixture-of-experts approach is able to enhance the representation capacity of the generator for language GANs and (2) harness the Feature Statistics Alignment (FSA) paradigm to render fine-grained learning signals to advance the generator training. Specifically, FSA forces the mean statistics of the distribution of fake data to approach those of real samples as closely as possible in the finite-dimensional feature space. Empirical study on synthetic and real benchmarks shows superior performance in quantitative evaluation and demonstrates the effectiveness of our approach to adversarial text generation.
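
The abstract describes FSA as pulling the mean feature statistics of fake data toward those of real samples. A minimal sketch of that idea follows. It assumes the features are already extracted as fixed-size vectors and uses a squared Euclidean distance between batch means. The abstract states neither the distance nor the feature extractor, so both choices, and the name `mean_feature_alignment_loss`, are illustrative assumptions rather than the paper's definition.

```python
import numpy as np

def mean_feature_alignment_loss(real_feats, fake_feats):
    """Squared L2 distance between batch-mean features (assumed distance).

    real_feats: (n_real, d) features of real samples
    fake_feats: (n_fake, d) features of generated samples
    """
    mu_real = real_feats.mean(axis=0)  # (d,) mean feature of real data
    mu_fake = fake_feats.mean(axis=0)  # (d,) mean feature of fake data
    diff = mu_real - mu_fake
    return float(diff @ diff)

# Synthetic features standing in for discriminator activations (assumption).
rng = np.random.default_rng(0)
real = rng.normal(loc=0.0, size=(64, 16))
fake_near = rng.normal(loc=0.1, size=(64, 16))  # means close to the real data
fake_far = rng.normal(loc=2.0, size=(64, 16))   # means far from the real data

loss_same = mean_feature_alignment_loss(real, real)
loss_near = mean_feature_alignment_loss(real, fake_near)
loss_far = mean_feature_alignment_loss(real, fake_far)
```

In this toy setup the loss is zero when both batches are identical and grows as the fake feature means drift away from the real ones. That gives the generator a signal measured in feature space rather than a single real-versus-fake score.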

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2302.11875/full.md

## Figures

25 figures with captions in the complete paper: https://tomesphere.com/paper/2302.11875/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/2302.11875/full.md

---
Source: https://tomesphere.com/paper/2302.11875