Improved Group Robustness via Classifier Retraining on Independent Splits

Thien Hang Nguyen; Hongyang R. Zhang; Huy Le Nguyen

arXiv:2204.09583 · cs.LG · August 1, 2023 · 1 citation

Open Access · 1 Repo

TL;DR

This paper introduces a simple classifier retraining method on independent data splits that improves worst-group performance in deep neural networks, outperforming existing methods like group DRO and JTT on benchmark tasks.

Contribution

The paper proposes a sample-splitting and classifier-retraining approach that improves group robustness, requires only a single hyperparameter to tune, and is supported by a generalization-bound analysis.
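The general idea of retraining a classifier on an independent split can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `frac_first` split proportion, the frozen-feature linear head, and the group-balanced reweighting (used here as a simpler stand-in for a robust objective such as group DRO) are all hypothetical choices made for the example.

```python
import numpy as np

def independent_split(n, frac_first, seed=0):
    """Randomly partition indices 0..n-1 into two disjoint parts.

    frac_first is an assumed name for the fraction of data in the first part.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    cut = int(round(frac_first * n))
    return perm[:cut], perm[cut:]

def group_balanced_weights(groups):
    """Per-example weights that sum to 1 and give each group equal total weight."""
    groups = np.asarray(groups)
    ids, counts = np.unique(groups, return_counts=True)
    per_example = 1.0 / (len(ids) * counts)
    lookup = dict(zip(ids.tolist(), per_example.tolist()))
    return np.array([lookup[g] for g in groups.tolist()])

def retrain_linear_head(feats, labels, weights, steps=500, lr=0.5):
    """Weighted logistic regression via gradient descent on frozen features.

    Stand-in for the retraining stage; the features are assumed to come from
    a model fitted only on the other split.
    """
    w = np.zeros(feats.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(feats @ w + b)))
        err = weights * (p - labels)  # weighted gradient of the log loss
        w -= lr * (feats.T @ err)
        b -= lr * err.sum()
    return w, b
```

In this sketch, a feature extractor would be fitted on the first part of the split, and only the linear head would be retrained on the second part, so the retraining data is independent of the data that shaped the features.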

Findings

01. Consistently outperforms group DRO and JTT on benchmark datasets.
02. Requires only a single hyperparameter for tuning.
03. Theoretically justified by a generalization-bound analysis.

Abstract

Deep neural networks trained by minimizing the average risk can achieve strong average performance. Still, their performance for a subgroup may degrade if the subgroup is underrepresented in the overall data population. Group distributionally robust optimization (Sagawa et al., 2020a), or group DRO in short, is a widely used baseline for learning models with strong worst-group performance. We note that this method requires group labels for every example at training time and can overfit to small groups, requiring strong regularization. Given a limited amount of group labels at training time, Just Train Twice (Liu et al., 2021), or JTT in short, is a two-stage method that infers a pseudo group label for every unlabeled example first, then applies group DRO based on the inferred group labels. The inference process is also sensitive to overfitting, sometimes involving additional…
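To make the gap between average and worst-group performance concrete, the short sketch below computes per-group accuracy and takes the minimum over groups. The function names and the toy arrays are illustrative assumptions and do not come from the paper or its repository.

```python
import numpy as np

def group_accuracies(preds, labels, groups):
    """Accuracy computed separately within each group."""
    preds, labels, groups = map(np.asarray, (preds, labels, groups))
    correct = preds == labels
    return {int(g): float(correct[groups == g].mean()) for g in np.unique(groups)}

def worst_group_accuracy(preds, labels, groups):
    """Minimum of the per-group accuracies."""
    return min(group_accuracies(preds, labels, groups).values())

# Toy example: group 1 is underrepresented and half of it is misclassified.
preds = [1, 1, 1, 1, 0, 1]
labels = [1, 1, 1, 1, 0, 0]
groups = [0, 0, 0, 0, 1, 1]
average = float(np.mean(np.asarray(preds) == np.asarray(labels)))
worst = worst_group_accuracy(preds, labels, groups)
```

Here the average accuracy is 5/6 while the worst-group accuracy is 0.5, which is the kind of degradation on an underrepresented subgroup that the abstract describes.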

Code & Models

Repositories

timmytonga/crois · PyTorch · Official

Taxonomy

Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning