AutoDO: Robust AutoAugment for Biased Data with Label Noise via Scalable Probabilistic Implicit Differentiation

Denis Gudovskiy; Luca Rigazio; Shun Ishizaka; Kazuki Kozuka; Sotaro Tsukizawa

arXiv:2103.05863·cs.CV·March 15, 2021

TL;DR

AutoDO introduces a robust automated dataset optimization framework that improves deep learning model generalization on biased and noisy data by explicitly estimating distribution-changing hyperparameters through scalable implicit differentiation.

Contribution

It reformulates AutoAugment as a dataset optimization task with explicit hyperparameter estimation, enhancing robustness to bias and label noise.

Findings

01. Up to 9.3% accuracy improvement on biased datasets.

02. Up to 36.6% gain for underrepresented classes.

03. Complexity scales linearly with dataset size, using Fisher information.

Abstract

AutoAugment has sparked an interest in automated augmentation methods for deep learning models. These methods estimate image transformation policies for train data that improve generalization to test data. While recent papers evolved in the direction of decreasing policy search complexity, we show that those methods are not robust when applied to biased and noisy data. To overcome these limitations, we reformulate AutoAugment as a generalized automated dataset optimization (AutoDO) task that minimizes the distribution shift between test data and distorted train dataset. In our AutoDO model, we explicitly estimate a set of per-point hyperparameters to flexibly change distribution of train data. In particular, we include hyperparameters for augmentation, loss weights, and soft-labels that are jointly estimated using implicit differentiation. We develop a theoretical probabilistic…
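
To make the idea of estimating per-point hyperparameters by implicit differentiation concrete, here is a minimal sketch on a toy problem. It is not the authors' implementation and does not reproduce the paper's scalable approximation. The quadratic losses, the scalar model `w`, and the function names `inner_solution`, `val_loss`, `hypergradient`, and `finite_difference` are all assumptions made for illustration. Each training point `x[i]` gets a loss weight `u[i]`, and the gradient of a validation loss with respect to those weights is obtained through the implicit function theorem.

```python
# Toy bilevel problem (illustrative only; NOT the AutoDO implementation).
# Inner problem: fit a scalar w to training points x_i with per-point weights u_i,
#   L_train(w, u) = sum_i u_i * 0.5 * (w - x_i)**2
# Outer problem: pick u so that the fitted w*(u) matches a validation target v,
#   L_val(w) = 0.5 * (w - v)**2


def inner_solution(u, x):
    """Closed-form minimizer of L_train over w (a weighted mean)."""
    return sum(ui * xi for ui, xi in zip(u, x)) / sum(u)


def val_loss(w, v):
    """Validation loss for a scalar model w and target v."""
    return 0.5 * (w - v) ** 2


def hypergradient(u, x, v):
    """dL_val/du_i via the implicit function theorem.

    dw*/du_i = -(d2L_train/dw2)^-1 * d2L_train/(dw du_i)
    """
    w = inner_solution(u, x)
    hessian = sum(u)          # d2L_train/dw2
    dval_dw = w - v           # dL_val/dw
    # The mixed partial d2L_train/(dw du_i) equals (w - x_i).
    return [-dval_dw * (w - xi) / hessian for xi in x]


def finite_difference(u, x, v, eps=1e-6):
    """Central-difference check of the hypergradient."""
    grads = []
    for i in range(len(u)):
        up = list(u)
        down = list(u)
        up[i] += eps
        down[i] -= eps
        diff = val_loss(inner_solution(up, x), v) - val_loss(inner_solution(down, x), v)
        grads.append(diff / (2 * eps))
    return grads


x = [0.0, 1.0, 10.0]   # the last training point is an outlier
u = [1.0, 1.0, 1.0]    # initial per-point loss weights
v = 0.5                # validation target

grad = hypergradient(u, x, v)
check = finite_difference(u, x, v)
```

In this toy setup the hypergradient for the outlier is positive and the hypergradients for the two inliers are negative, so a gradient-descent step on `u` lowers the outlier's weight. The same pattern (differentiate a validation objective through the optimum of a training objective) is what the abstract describes for augmentation, loss-weight, and soft-label hyperparameters, though the real method operates on deep networks where the Hessian cannot be inverted exactly.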

Code & Models

Repositories

gudovskiy/autodo (PyTorch, official)

Taxonomy

Methods

Tanh Activation · Sigmoid Activation · Long Short-Term Memory · AutoAugment