DivAug: Plug-in Automated Data Augmentation with Explicit Diversity Maximization

Zirui Liu; Haifeng Jin; Ting-Hsiang Wang; Kaixiong Zhou; Xia Hu

arXiv:2103.14545·cs.CV·August 13, 2021·1 cites

TL;DR

DivAug introduces an explicit diversity measure called Variance Diversity, theoretically links it to regularization benefits, and employs an unsupervised framework to enhance data augmentation, improving semi-supervised learning performance efficiently.

Contribution

Proposes Variance Diversity as a measurable and theoretically justified diversity metric, and develops DivAug, an unsupervised method to maximize it without search, boosting augmentation regularization effects.

Findings

1. Variance Diversity correlates with test accuracy gains.
2. DivAug achieves comparable performance to state-of-the-art methods.
3. Enhances semi-supervised learning with better efficiency.

Abstract

Human-designed data augmentation strategies have been replaced by automatically learned augmentation policies in the past two years. Specifically, recent work has empirically shown that the superior performance of automated data augmentation methods stems from increasing the diversity of augmented data \cite{autoaug, randaug}. However, two factors regarding the diversity of augmented data are still missing: 1) an explicit definition (and thus measurement) of diversity and 2) a quantifiable relationship between diversity and its regularization effects. To bridge this gap, we propose a diversity measure called Variance Diversity and theoretically show that the regularization effect of data augmentation is promised by Variance Diversity. We validate in experiments that the relative gain from automated data augmentation in test accuracy is highly correlated with Variance Diversity. An…
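
The abstract describes Variance Diversity as an explicit, measurable quantity but this page does not reproduce its formula. As a rough illustration only, the sketch below assumes a score of one plausible form: the total variance of a model's predicted class probabilities across several augmented copies of the same input. The function name `variance_diversity`, the input layout, and the formula itself are assumptions made for this example; the paper's actual definition may differ.

```python
import numpy as np


def variance_diversity(probs: np.ndarray) -> float:
    """Hypothetical diversity score (an assumption, not the paper's formula).

    probs has shape (num_views, num_classes); each row is the predicted class
    distribution for one augmented copy of the same input.
    """
    probs = np.asarray(probs, dtype=float)
    if probs.ndim != 2:
        raise ValueError("probs must have shape (num_views, num_classes)")
    # Population variance of each class probability across views, summed over
    # classes (the trace of the covariance matrix of the predictions).
    return float(probs.var(axis=0).sum())


# Toy predictions for three augmented views of one input (made-up numbers).
identical = np.array([[0.7, 0.3], [0.7, 0.3], [0.7, 0.3]])
spread = np.array([[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]])

print(variance_diversity(identical))
print(variance_diversity(spread))
```

Under this assumed score, augmentations that leave the prediction unchanged contribute essentially nothing, while augmentations that move the prediction around score higher, which is the intuition behind treating diversity as something to maximize.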

Code & Models

Repositories

warai-0toko/divaug (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Indoor and Outdoor Localization Technologies

Methods: RandAugment