Theoretical Guarantees of Data Augmented Last Layer Retraining Methods
Monica Welfert, Nathan Stromberg, Lalitha Sankar

TL;DR
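This paper provides theoretical guarantees for linear last layer retraining combined with data augmentation (upweighting, downsampling, and mixup). It derives the optimal worst-group accuracy when the latent representations are modeled as Gaussian for each subpopulation, and verifies the results on synthetic and large publicly available datasets.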
Contribution
It offers a theoretical analysis of linear last layer retraining with data augmentation, presenting the optimal worst-group accuracy when the latent representations (the input to the last layer) are modeled as Gaussian for each subpopulation.
Findings
Optimal worst-group accuracy derived for Gaussian latent distributions.
Validation of theoretical results on synthetic and real datasets.
Upweighting, downsampling, and mixup are the augmentation strategies analyzed, each paired with linear last layer retraining to target worst-group accuracy.
Abstract
Ensuring fair predictions across many distinct subpopulations in the training data can be prohibitive for large models. Recently, simple linear last layer retraining strategies, in combination with data augmentation methods such as upweighting, downsampling and mixup, have been shown to achieve state-of-the-art performance for worst-group accuracy, which quantifies accuracy for the least prevalent subpopulation. For linear last layer retraining and the abovementioned augmentations, we present the optimal worst-group accuracy when modeling the distribution of the latent representations (input to the last layer) as Gaussian for each subpopulation. We evaluate and verify our results for both synthetic and large publicly available datasets.
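The setting described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes NumPy, draws hypothetical two-dimensional Gaussian latents for four groups (class crossed with a spurious domain), and uses a weighted least-squares linear classifier as a stand-in for the retrained last layer. All function names, group means, and group sizes are assumptions chosen for illustration, and mixup is omitted for brevity.

```python
import numpy as np

def worst_group_accuracy(y_true, y_pred, groups):
    """Minimum per-group accuracy over the groups present in `groups`."""
    return min(
        float(np.mean(y_pred[groups == g] == y_true[groups == g]))
        for g in np.unique(groups)
    )

def downsample(groups, rng):
    """Indices keeping an equal number of samples from every group."""
    n_min = min(np.sum(groups == g) for g in np.unique(groups))
    keep = [rng.choice(np.flatnonzero(groups == g), size=n_min, replace=False)
            for g in np.unique(groups)]
    return np.concatenate(keep)

def upweights(groups):
    """Per-sample weights inversely proportional to group size."""
    ids, counts = np.unique(groups, return_counts=True)
    size = dict(zip(ids, counts))
    return np.array([1.0 / size[g] for g in groups])

def fit_linear(X, y, w=None):
    """Weighted least-squares linear classifier (labels mapped to -1/+1)."""
    w = np.ones(len(y)) if w is None else w
    Xb = np.hstack([X, np.ones((len(X), 1))])  # append a bias column
    sw = np.sqrt(w)
    theta, *_ = np.linalg.lstsq(Xb * sw[:, None], (2 * y - 1) * sw, rcond=None)
    return theta

def predict(theta, X):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return (Xb @ theta > 0).astype(int)

def demo(seed=0):
    rng = np.random.default_rng(seed)
    # Hypothetical group means: first coordinate tracks the class,
    # second coordinate tracks the spurious domain. Group id = 2*class + domain.
    means = {0: [-1, -2], 1: [-1, 2], 2: [1, -2], 3: [1, 2]}

    def sample(sizes):
        X = np.vstack([rng.normal(means[g], 1.0, size=(n, 2))
                       for g, n in sizes.items()])
        groups = np.repeat(list(sizes), list(sizes.values()))
        return X, groups // 2, groups

    # Imbalanced training set (groups 1 and 2 are rare), balanced test set.
    X, y, g = sample({0: 1000, 1: 50, 2: 50, 3: 1000})
    Xt, yt, gt = sample({0: 500, 1: 500, 2: 500, 3: 500})

    idx = downsample(g, rng)
    models = {
        "unweighted": fit_linear(X, y),
        "downsampling": fit_linear(X[idx], y[idx]),
        "upweighting": fit_linear(X, y, upweights(g)),
    }
    return {name: worst_group_accuracy(yt, predict(th, Xt), gt)
            for name, th in models.items()}

if __name__ == "__main__":
    for name, wga in demo().items():
        print(f"{name}: worst-group accuracy = {wga:.3f}")
```

Calling `demo()` returns the worst-group accuracy on the balanced test set for the unweighted fit and for the two augmentations, which makes it easy to compare them under different seeds or group sizes. Least squares is used here only because it has a closed form; the paper's own loss and analysis may differ.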
Taxonomy
Topics: Simulation Techniques and Applications
