No Subclass Left Behind: Fine-Grained Robustness in Coarse-Grained Classification Problems

Nimit S. Sohoni; Jared A. Dunnmon; Geoffrey Angus; Albert Gu; Christopher Ré

arXiv:2011.12945·cs.LG·April 12, 2022·45 cites

TL;DR

This paper introduces GEORGE, a method that measures and mitigates hidden stratification in coarse-grained classification without requiring subclass labels. It clusters the model's feature representations to estimate subclasses, then trains with a distributionally robust objective over those estimated subclasses.

Contribution

GEORGE is the first approach to measure and mitigate hidden stratification in coarse classification without subclass labels, using feature clustering and distributionally robust training.

Findings

01

Boosts worst-case subclass accuracy by up to 22 percentage points.

02

Effectively estimates subclasses via clustering in feature space.

03

Improves model robustness in safety-critical applications.
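
The "worst-case subclass accuracy" in the first finding can be illustrated with a small sketch. It assumes subclass ids are available for evaluation; the function names and the toy arrays below are hypothetical and are not taken from the paper or its repository.

```python
import numpy as np

def per_subclass_accuracy(y_true, y_pred, subclass):
    """Return {subclass_id: accuracy of the coarse-label prediction}."""
    y_true, y_pred, subclass = map(np.asarray, (y_true, y_pred, subclass))
    correct = y_true == y_pred
    return {int(s): float(correct[subclass == s].mean())
            for s in np.unique(subclass)}

def worst_case_subclass_accuracy(y_true, y_pred, subclass):
    """Minimum accuracy over all subclasses."""
    return min(per_subclass_accuracy(y_true, y_pred, subclass).values())

# Toy data (made up): two coarse classes, four subclasses.
y_true   = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y_pred   = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
subclass = [0, 0, 0, 1, 1, 2, 2, 2, 3, 3]

overall = float(np.mean(np.array(y_true) == np.array(y_pred)))
worst = worst_case_subclass_accuracy(y_true, y_pred, subclass)
print(f"overall accuracy: {overall:.2f}, worst subclass accuracy: {worst:.2f}")
```

In this toy case the overall accuracy looks high while one subclass is classified correctly only half the time, which is the gap that hidden stratification refers to.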

Abstract

In real-world classification tasks, each class often comprises multiple finer-grained "subclasses." As the subclass labels are frequently unavailable, models trained using only the coarser-grained class labels often exhibit highly variable performance across different subclasses. This phenomenon, known as hidden stratification, has important consequences for models deployed in safety-critical applications such as medicine. We propose GEORGE, a method to both measure and mitigate hidden stratification even when subclass labels are unknown. We first observe that unlabeled subclasses are often separable in the feature space of deep neural networks, and exploit this fact to estimate subclass labels for the training data via clustering techniques. We then use these approximate subclass labels as a form of noisy supervision in a distributionally robust optimization objective. We theoretically…
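
The two steps described in the abstract (estimating subclass labels by clustering features, then using them in a robust objective) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: plain k-means stands in for the unspecified "clustering techniques", the number of clusters per class is fixed by hand, the robust objective is taken to be the maximum of the per-group mean losses, and the features and losses are synthetic. All function names are hypothetical.

```python
import numpy as np

def kmeans(x, k, n_iter=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [x[0]]
    for _ in range(1, k):
        # Squared distance from each point to its nearest chosen center.
        d = np.min([np.sum((x - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(x[np.argmax(d)])
    centers = np.stack(centers).astype(float)
    for _ in range(n_iter):
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = x[assign == j].mean(axis=0)
    return assign

def estimate_subclasses(features, labels, k_per_class=2):
    """Cluster features within each coarse class; return global cluster ids."""
    pseudo = np.empty(len(labels), dtype=int)
    offset = 0
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        pseudo[idx] = kmeans(features[idx], k_per_class) + offset
        offset += k_per_class
    return pseudo

def worst_group_loss(per_example_loss, groups):
    """Largest mean loss over groups (assumed form of the robust objective)."""
    return max(float(per_example_loss[groups == g].mean())
               for g in np.unique(groups))

# Synthetic "features": four well-separated blobs, two per coarse class.
rng = np.random.default_rng(0)
blob_centers = np.array([[0, 0], [5, 0], [0, 5], [5, 5]], dtype=float)
true_sub = np.repeat(np.arange(4), 25)          # hidden subclass ids
features = blob_centers[true_sub] + 0.3 * rng.normal(size=(100, 2))
labels = true_sub // 2                          # coarse class labels

pseudo = estimate_subclasses(features, labels)

# Made-up per-example losses: the model does poorly on hidden subclass 1.
loss = np.where(true_sub == 1, 2.0, 0.5)
print("average loss:    ", float(loss.mean()))
print("worst-group loss:", worst_group_loss(loss, pseudo))
```

The average loss hides the poorly served subclass, while the worst-group loss computed over the estimated clusters exposes it. In a training loop, the per-example losses would come from the model and the worst-group quantity would be what gets minimized.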

Code & Models

Repositories

HazyResearch/hidden-stratification
PyTorch · Official

Videos

No Subclass Left Behind: Fine-Grained Robustness in Coarse-Grained Classification Problems · SlidesLive

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification