Understanding Why Generalized Reweighting Does Not Improve Over ERM

Runtian Zhai; Chen Dan; Zico Kolter; Pradeep Ravikumar

arXiv:2201.12293 · cs.LG · February 8, 2023 · 6 cites

TL;DR

This paper investigates why generalized reweighting (GRW) methods do not outperform empirical risk minimization (ERM) under distribution shift. It shows that when overparameterized models are trained with GRW, the resulting models are close to those obtained by ERM, so reweighting alone does not deliver robust generalization.

Contribution

The paper provides a theoretical analysis showing that broad classes of generalized reweighting algorithms yield models similar to ERM, explaining their limited effectiveness in distributional robustness.

Findings

01. GRW algorithms produce models close to ERM in overparameterized settings

02. Adding small regularization does not significantly improve robustness

03. GRW approaches are fundamentally limited in achieving distributionally robust generalization

Abstract

Empirical risk minimization (ERM) is known in practice to be non-robust to distributional shift where the training and the test distributions are different. A suite of approaches, such as importance weighting, and variants of distributionally robust optimization (DRO), have been proposed to solve this problem. But a line of recent work has empirically shown that these approaches do not significantly improve over ERM in real applications with distribution shift. The goal of this work is to obtain a comprehensive theoretical understanding of this intriguing phenomenon. We first posit the class of Generalized Reweighting (GRW) algorithms, as a broad category of approaches that iteratively update model parameters based on iterative reweighting of the training samples. We show that when overparameterized models are trained under GRW, the resulting models are close to that obtained by ERM. We…
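The central claim of the abstract can be illustrated on a toy problem. The following is a minimal sketch, not the authors' code: it assumes an overparameterized linear model (more features than samples), squared loss, zero initialization, and plain gradient descent. The softmax-over-losses weighting rule and the names `train`, `erm_weights`, and `softmax_weights` are hypothetical choices made only for this illustration. Under these assumptions both runs stay in the row space of the data and fit the training set exactly, so they arrive at the same minimum-norm interpolator.

```python
import numpy as np

def train(X, y, weight_fn, steps=10000):
    """Gradient descent on a per-sample weighted squared loss, starting from zero."""
    n, d = X.shape
    w = np.zeros(d)
    # Step size 1 / ||X||_2^2 keeps the update stable for any weights in [0, 1].
    lr = 1.0 / np.linalg.norm(X, 2) ** 2
    for _ in range(steps):
        residual = X @ w - y
        q = weight_fn(0.5 * residual ** 2)  # sample weights, recomputed every step
        w -= lr * X.T @ (q * residual)      # gradient of sum_i q_i * loss_i with q held fixed
    return w

def erm_weights(losses):
    # ERM: every sample keeps the same weight 1/n throughout training.
    return np.full(losses.shape, 1.0 / losses.size)

def softmax_weights(losses):
    # Illustrative GRW rule (an assumption): upweight samples with higher current loss.
    z = np.exp(losses - losses.max())
    return z / z.sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 50))  # 10 samples, 50 features: overparameterized
y = rng.normal(size=10)

w_erm = train(X, y, erm_weights)
w_grw = train(X, y, softmax_weights)
gap = np.linalg.norm(w_erm - w_grw)
print("distance between ERM and GRW solutions:", gap)
```

The reweighting changes the path taken during training but not the end point in this setting, which is one way to read the statement that GRW models end up close to the ERM model.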

Code & Models

Repositories

runtianz/grw-vs-erm (PyTorch, official)

Videos

Understanding Why Generalized Reweighting Does Not Improve Over ERM · SlidesLive

Taxonomy

Topics: Machine Learning and Algorithms · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning