The Effects of Regularization and Data Augmentation are Class Dependent
Randall Balestriero, Leon Bottou, Yann LeCun

TL;DR
This paper reveals that common regularization techniques like data augmentation and weight decay can unfairly reduce model performance on certain classes, highlighting the need for class-aware regularization methods.
Contribution
The study demonstrates that regularization methods can cause class-dependent performance drops, challenging the assumption that they uniformly improve model generalization.
Findings
Data augmentation can significantly decrease accuracy on specific classes.
Regularization techniques may introduce class bias in model performance.
Performance drops are observed even with uninformative regularizers like weight decay.
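The class-dependent effect described in these findings only becomes visible when accuracy is broken down by class rather than averaged over the whole test set. The following is a minimal sketch of that comparison, not the authors' evaluation code: the function names `per_class_accuracy` and `per_class_change` are hypothetical, and the labels and predictions are toy values invented for illustration.

```python
from collections import defaultdict


def per_class_accuracy(y_true, y_pred):
    """Return {class_label: accuracy}, computed over the samples of each class."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return {c: correct[c] / total[c] for c in total}


def overall_accuracy(y_true, y_pred):
    """Fraction of all samples that are classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_change(y_true, pred_baseline, pred_regularized):
    """Per-class accuracy of the regularized model minus that of the baseline."""
    base = per_class_accuracy(y_true, pred_baseline)
    reg = per_class_accuracy(y_true, pred_regularized)
    return {c: reg[c] - base[c] for c in base}


# Toy data (assumption: made up for illustration, not results from the paper).
y_true           = [0, 0, 0, 0, 0, 0, 1, 1]
pred_baseline    = [0, 0, 0, 0, 1, 1, 1, 1]  # model trained without the regularizer
pred_regularized = [0, 0, 0, 0, 0, 0, 1, 0]  # model trained with the regularizer

overall_base = overall_accuracy(y_true, pred_baseline)
overall_reg = overall_accuracy(y_true, pred_regularized)
change = per_class_change(y_true, pred_baseline, pred_regularized)

print(f"overall: {overall_base:.3f} -> {overall_reg:.3f}")
for label, delta in sorted(change.items()):
    print(f"class {label}: {delta:+.3f}")
```

In this toy case the average accuracy goes up while class 1 loses half of its accuracy, which is the kind of drop that model selection on average validation accuracy would not flag.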
Abstract
Regularization is a fundamental technique to prevent over-fitting and to improve generalization performances by constraining a model's complexity. Current Deep Networks heavily rely on regularizers such as Data-Augmentation (DA) or weight-decay, and employ structural risk minimization, i.e. cross-validation, to select the optimal regularization hyper-parameters. In this study, we demonstrate that techniques such as DA or weight decay produce a model with a reduced complexity that is unfair across classes. The optimal amount of DA or weight decay found from cross-validation leads to disastrous model performances on some classes, e.g. on Imagenet with a resnet50, the "barn spider" classification test accuracy falls from 68% to 46% only by introducing random crop DA during training. Even more surprising, such performance drop also appears when introducing uninformative regularization…
Taxonomy
Topics: Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
Methods: Weight Decay
