What Are Effective Labels for Augmented Data? Improving Calibration and Robustness with AutoLabel
Yao Qin, Xuezhi Wang, Balaji Lakshminarayanan, Ed H. Chi, Alex Beutel

TL;DR
This paper introduces AutoLabel, a method that automatically learns appropriate labels for augmented data to improve neural network calibration and robustness, especially under distributional shifts.
Contribution
AutoLabel is a novel, generic approach that learns label confidence for augmented data, enhancing calibration and accuracy across multiple augmentation techniques.
Findings
AutoLabel improves calibration and accuracy on CIFAR-10, CIFAR-100, and ImageNet.
It enhances robustness under distributional shift.
AutoLabel outperforms using fixed one-hot labels for augmented data.
Abstract
A wide breadth of research has devised data augmentation approaches that can improve both accuracy and generalization performance for neural networks. However, augmented data can end up being far from the clean training data and what is the appropriate label is less clear. Despite this, most existing work simply uses one-hot labels for augmented data. In this paper, we show re-using one-hot labels for highly distorted data might run the risk of adding noise and degrading accuracy and calibration. To mitigate this, we propose a generic method AutoLabel to automatically learn the confidence in the labels for augmented data, based on the transformation distance between the clean distribution and augmented distribution. AutoLabel is built on label smoothing and is guided by the calibration-performance over a hold-out validation set. We successfully apply AutoLabel to three different data…
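The idea in the abstract, lowering label confidence as augmented data moves further from the clean distribution, can be illustrated with a small label-smoothing sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the integer-style `distance` argument, and the linear decay schedule are hypothetical. According to the abstract, AutoLabel learns the confidence from calibration performance on a hold-out validation set instead of fixing it with a formula like the one below.

```python
def smoothed_label(true_class, num_classes, confidence):
    """Label-smoothing target: `confidence` mass on the true class,
    with the remaining mass spread uniformly over the other classes."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    if num_classes < 2:
        raise ValueError("need at least two classes")
    off_value = (1.0 - confidence) / (num_classes - 1)
    return [confidence if k == true_class else off_value
            for k in range(num_classes)]


def confidence_for_distance(distance, max_distance, min_confidence):
    """Hypothetical placeholder schedule (an assumption, not from the paper):
    confidence decays linearly from 1.0 at distance 0 down to
    `min_confidence` at `max_distance`."""
    frac = min(max(distance / max_distance, 0.0), 1.0)
    return 1.0 - frac * (1.0 - min_confidence)


# Clean data (distance 0) keeps a one-hot label; heavily augmented
# data (distance 4 out of 4) gets a softer target.
clean_target = smoothed_label(2, 10, confidence_for_distance(0, 4, 0.6))
distorted_target = smoothed_label(2, 10, confidence_for_distance(4, 4, 0.6))
```

In this sketch the clean example keeps its one-hot label, while the most distorted example places less mass on the true class and spreads the remainder over the other classes. A learned scheme would replace `confidence_for_distance` with values tuned per transformation distance.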
Taxonomy
Topics: Advanced Neural Network Applications · Image Enhancement Techniques · Machine Learning and Data Classification
Methods: AugMix · Label Smoothing
