# Data augmentation for low resource sentiment analysis using generative   adversarial networks

**Authors:** Rahul Gupta

arXiv: 1902.06818 · 2019-02-20

## TL;DR

This paper explores using GANs to generate synthetic data for low-resource sentiment analysis, aiming to improve classifier performance when real data is scarce.

## Contribution

It introduces a method for training GANs on limited sentiment data and evaluates the quality of generated data for data augmentation.

## Key findings

- Generated data improves sentiment classifier accuracy.
- GAN training techniques are effective with limited data.
- Visual analysis shows similarity between real and generated data.

## Abstract

Sentiment analysis is a task that may suffer from a lack of data in certain cases, as the datasets are often generated and annotated by humans. In cases where data is inadequate for training discriminative models, generate models may aid training via data augmentation. Generative Adversarial Networks (GANs) are one such model that has advanced the state of the art in several tasks, including as image and text generation. In this paper, I train GAN models on low resource datasets, then use them for the purpose of data augmentation towards improving sentiment classifier generalization. Given the constraints of limited data, I explore various techniques to train the GAN models. I also present an analysis of the quality of generated GAN data as more training data for the GAN is made available. In this analysis, the generated data is evaluated as a test set (against a model trained on real data points) as well as a training set to train classification models. Finally, I also conduct a visual analysis by projecting the generated and the real data into a two-dimensional space using the t-Distributed Stochastic Neighbor Embedding (t-SNE) method.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.06818/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1902.06818/full.md

## References

29 references — full list in the complete paper: https://tomesphere.com/paper/1902.06818/full.md

---
Source: https://tomesphere.com/paper/1902.06818