Understanding data augmentation for classification: when to warp?
Sebastien C. Wong, Adam Gatt, Victor Stamatescu, Mark D. McDonnell

TL;DR
This paper compares data augmentation techniques in data-space and feature-space for image classification, finding data-space augmentation more effective when plausible transformations are known.
Contribution
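It provides an empirical evaluation of data warping versus synthetic over-sampling for augmenting the training data of three convolutional classifiers on MNIST: a backpropagation-trained neural network, a support vector machine, and an extreme learning machine.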
Findings
Data-space augmentation yields better performance when plausible transformations are available.
Augmentation reduces overfitting across different classifiers.
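Feature-space augmentation (synthetic over-sampling) is generic and can be applied without knowing any transforms, but it provides a smaller benefit than data-space warping when plausible transforms are known.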
Abstract
In this paper we investigate the benefit of augmenting data with synthetically created samples when training a machine learning classifier. Two approaches for creating additional training samples are data warping, which generates additional samples through transformations applied in the data-space, and synthetic over-sampling, which creates additional samples in feature-space. We experimentally evaluate the benefits of data augmentation for a convolutional backpropagation-trained neural network, a convolutional support vector machine and a convolutional extreme learning machine classifier, using the standard MNIST handwritten digit dataset. We found that while it is possible to perform generic augmentation in feature-space, if plausible transforms for the data are known then augmentation in data-space provides a greater benefit for improving performance and reducing overfitting.
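To make the distinction between the two approaches concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the authors' implementation: a simple pixel translation stands in for data warping, and a SMOTE-style interpolation between nearest neighbours of the same class stands in for synthetic over-sampling. The function names `warp_shift` and `smote_like`, and the parameters `dx`, `dy`, `k`, and `n_new`, are hypothetical and do not come from the paper.

```python
import numpy as np

def warp_shift(image, dx, dy):
    """Data-space augmentation (illustrative): translate a 2-D image by
    (dx, dy) pixels and zero-fill the uncovered border."""
    h, w = image.shape
    out = np.zeros_like(image)
    # Overlapping region between the source image and the shifted copy.
    src_y, dst_y = slice(max(0, -dy), min(h, h - dy)), slice(max(0, dy), min(h, h + dy))
    src_x, dst_x = slice(max(0, -dx), min(w, w - dx)), slice(max(0, dx), min(w, w + dx))
    out[dst_y, dst_x] = image[src_y, src_x]
    return out

def smote_like(features, n_new, k=3, rng=None):
    """Feature-space augmentation (illustrative): create n_new samples by
    interpolating between a sample and one of its k nearest neighbours.
    Assumes all rows of `features` belong to the same class and k < len(features)."""
    rng = np.random.default_rng() if rng is None else rng
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    # Pairwise squared Euclidean distances; a sample is never its own neighbour.
    dist = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(dist, np.inf)
    neighbours = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)                    # seed samples
    pick = neighbours[base, rng.integers(0, k, size=n_new)]  # one neighbour each
    gap = rng.random((n_new, 1))                             # interpolation weight in [0, 1)
    return features[base] + gap * (features[pick] - features[base])
```

The warp needs domain knowledge (a shifted digit is still the same digit), whereas the interpolation only needs labelled feature vectors, which is why the abstract describes the feature-space approach as generic.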
