Unsupervised Data Augmentation for Consistency Training

Qizhe Xie; Zihang Dai; Eduard Hovy; Minh-Thang Luong; Quoc V. Le

arXiv:1904.12848·cs.LG·November 6, 2020·1.6k cites

TL;DR

This paper introduces a semi-supervised learning method that replaces simple noising in consistency training with advanced data augmentation, such as RandAugment and back-translation, and reports substantial improvements on six language and three vision tasks when labeled data is scarce.

Contribution

The paper demonstrates that using sophisticated data augmentation methods in consistency training enhances semi-supervised learning effectiveness across multiple domains.

Findings

1. Achieves 4.20% error on IMDb with only 20 labeled examples.
2. Outperforms previous methods on CIFAR-10 with only 250 labeled examples.
3. Improves ImageNet results with limited labeled data.

Abstract

Semi-supervised learning lately has shown much promise in improving deep learning models when labeled data is scarce. Common among recent approaches is the use of consistency training on a large amount of unlabeled data to constrain model predictions to be invariant to input noise. In this work, we present a new perspective on how to effectively noise unlabeled examples and argue that the quality of noising, specifically those produced by advanced data augmentation methods, plays a crucial role in semi-supervised learning. By substituting simple noising operations with advanced data augmentation methods such as RandAugment and back-translation, our method brings substantial improvements across six language and three vision tasks under the same consistency training framework. On the IMDb text classification dataset, with only 20 labeled examples, our method achieves an error rate of…
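
The consistency-training idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract only says that predictions are constrained to be invariant to input noise, so the KL-divergence form of the penalty, the weighting factor `lam`, and all function names below are assumptions made for illustration. The augmentation step itself (for example RandAugment or back-translation) is assumed to have been applied already, so the functions take model logits for the original and augmented versions of each unlabeled example.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps))
               for pi, qi in zip(p, q))


def consistency_loss(logits_original, logits_augmented):
    """Mean KL between predictions on original and augmented unlabeled inputs.

    Assumption: the prediction on the original input acts as a fixed target;
    in an autodiff framework no gradient would flow through it.
    """
    losses = [kl_divergence(softmax(lo), softmax(la))
              for lo, la in zip(logits_original, logits_augmented)]
    return sum(losses) / len(losses)


def cross_entropy(logits, label):
    """Supervised loss for a single labeled example."""
    return -math.log(softmax(logits)[label])


def total_loss(labeled_logits, labels, unl_logits, unl_aug_logits, lam=1.0):
    """Supervised cross-entropy plus a weighted consistency term.

    `lam` is a hypothetical weighting hyperparameter for this sketch.
    """
    supervised = sum(cross_entropy(l, y)
                     for l, y in zip(labeled_logits, labels)) / len(labels)
    return supervised + lam * consistency_loss(unl_logits, unl_aug_logits)
```

When the model predicts the same distribution for an unlabeled example and its augmented version, the consistency term is zero and only the supervised loss remains; any disagreement between the two predictions adds a positive penalty.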

Code & Models

Models

keras-io/randaugment (model · 2 dl)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Topic Modeling

Methods: Linear Layer · Weight Decay · Average Pooling · Adam · Layer Normalization · Softmax · Attention Is All You Need · Dropout · Multi-Head Attention