Domain-Adjusted Regression or: ERM May Already Learn Features Sufficient for Out-of-Distribution Generalization
Elan Rosenfeld, Pradeep Ravikumar, Andrej Risteski

TL;DR
This paper challenges the view that deep networks fail to generalize out-of-distribution because they learn the wrong features. It argues that the features learned by ERM may already suffice and that the bottleneck is robust regression on those features, and it introduces DARE, a method for domain-adjusted prediction.
Contribution
The paper presents evidence that ERM already learns features sufficient for out-of-distribution generalization and introduces DARE, a convex objective for learning a robust, domain-adjusted linear predictor with theoretical guarantees.
Findings
Experiments suggest that ERM already learns features sufficient for OOD generalization; the bottleneck is robust regression, not feature learning.
DARE outperforms prior methods on finetuned features.
DARE has provable minimax optimality and convergence guarantees.
Abstract
A common explanation for the failure of deep networks to generalize out-of-distribution is that they fail to recover the "correct" features. We challenge this notion with a simple experiment which suggests that ERM already learns sufficient features and that the current bottleneck is not feature learning, but robust regression. Our findings also imply that given a small amount of data from the target distribution, retraining only the last linear layer will give excellent performance. We therefore argue that devising simpler methods for learning predictors on existing features is a promising direction for future research. Towards this end, we introduce Domain-Adjusted Regression (DARE), a convex objective for learning a linear predictor that is provably robust under a new model of distribution shift. Rather than learning one function, DARE performs a domain-specific adjustment to unify…
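The abstract's claim about retraining only the last linear layer can be illustrated with a small sketch. It is not the authors' experiment: the two-dimensional "frozen features", the helper names (`make_domain`, `fit_last_layer`, `accuracy`), and all constants are hypothetical, and a plain logistic-regression head stands in for the last layer. One feature is weakly predictive in both domains, while the other is strongly predictive but flips sign in the target domain.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_domain(n, spurious_sign, rng):
    """Synthetic stand-in for frozen features (an assumption, not the paper's data)."""
    feats, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        s = 1.0 if y == 1 else -1.0
        f0 = s + rng.gauss(0.0, 1.0)                        # stable but noisy feature
        f1 = spurious_sign * 2.0 * s + rng.gauss(0.0, 0.5)  # sign depends on the domain
        feats.append((f0, f1))
        labels.append(y)
    return feats, labels


def fit_last_layer(feats, labels, lr=0.1, steps=300):
    """Fit only a linear head (logistic regression) by full-batch gradient descent."""
    dim, n = len(feats[0]), len(feats)
    w, b = [0.0] * dim, 0.0
    for _ in range(steps):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(feats, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, feats, labels):
    correct = 0
    for x, y in zip(feats, labels):
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
        correct += int(pred == y)
    return correct / len(labels)


rng = random.Random(0)
src_x, src_y = make_domain(500, +1.0, rng)            # source (training) domain
tgt_small_x, tgt_small_y = make_domain(40, -1.0, rng)  # small labeled target sample
tgt_test_x, tgt_test_y = make_domain(500, -1.0, rng)   # held-out target data

# Head fit on the source domain, evaluated on the shifted target domain.
w_src, b_src = fit_last_layer(src_x, src_y)
acc_before = accuracy(w_src, b_src, tgt_test_x, tgt_test_y)

# Same features, head refit on the small target sample only.
w_tgt, b_tgt = fit_last_layer(tgt_small_x, tgt_small_y)
acc_after = accuracy(w_tgt, b_tgt, tgt_test_x, tgt_test_y)

print(f"target accuracy, source-trained head: {acc_before:.2f}")
print(f"target accuracy, head refit on target: {acc_after:.2f}")
```

In this toy setup the features never change; only the linear head does. With a real network, the analogous step would be to freeze the ERM-trained backbone, extract its penultimate-layer activations, and fit the head on those.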
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification
Methods: Linear Layer
