Rethinking Distribution Shifts: Empirical Analysis and Inductive Modeling for Tabular Data
Tianyu Wang, Jiashuo Liu, Peng Cui, Hongseok Namkoong

TL;DR
This paper empirically analyzes distribution shifts in tabular data. It finds that $Y|X$-shifts are the most common type in its testbed and that robust algorithms perform no better than vanilla methods, with implementation details having a significant effect on performance. It advocates data-driven, inductive modeling approaches.
Contribution
It provides an empirical testbed for distribution shifts in tabular data and highlights the importance of implementation choices over theoretical robustness assumptions.
Findings
$Y|X$-shifts are most common in the testbed.
Robust algorithms do not outperform vanilla methods.
Implementation details significantly impact performance.
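For readers unfamiliar with the terminology, the two shift types contrasted here follow from the standard factorization of the joint distribution. The notation below ($P$ for the source distribution, $Q$ for the target) is an assumption introduced for illustration and is not taken from the paper.

$$
P(X, Y) = P(X)\,P(Y \mid X), \qquad Q(X, Y) = Q(X)\,Q(Y \mid X)
$$

$$
\text{$X$ (covariate) shift: } P(X) \neq Q(X),\ P(Y \mid X) = Q(Y \mid X); \qquad \text{$Y|X$-shift: } P(Y \mid X) \neq Q(Y \mid X)
$$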
Abstract
Different distribution shifts require different interventions, and algorithms must be grounded in the specific shifts they address. However, methodological development for robust algorithms typically relies on structural assumptions that lack empirical validation. Advocating for an empirically grounded data-driven approach to algorithm development, we build an empirical testbed comprising natural shifts across 8 tabular datasets, 172 distribution pairs over 45 methods and 90,000 method configurations encompassing empirical risk minimization and distributionally robust optimization (DRO) methods. We find $Y|X$-shifts are most prevalent in our testbed, in stark contrast to the heavy focus on $X$ (covariate)-shifts in the ML literature, and that the performance of robust algorithms is no better than that of vanilla methods. To understand why, we conduct an in-depth empirical analysis of…
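To make the contrast between empirical risk minimization and DRO concrete, here is a minimal sketch of the two objectives evaluated on a fixed list of per-sample losses. It uses CVaR as one simple, well-known member of the DRO family; the function names, the `alpha` parameter, and the toy loss values are all illustrative assumptions and are not drawn from the paper's testbed or code.

```python
import math


def erm_objective(losses):
    """Empirical risk: the plain average of per-sample losses."""
    return sum(losses) / len(losses)


def cvar_objective(losses, alpha):
    """Discrete CVaR approximation: average of the worst alpha-fraction of losses.

    With alpha = 1 this reduces to the ERM objective; smaller alpha
    puts all the weight on the hardest samples.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    # Number of samples kept in the tail (at least one).
    k = max(1, math.ceil(alpha * len(losses)))
    worst = sorted(losses, reverse=True)[:k]
    return sum(worst) / k


# Toy per-sample losses (hypothetical values): three easy samples, one hard one.
losses = [0.1, 0.2, 0.3, 2.0]
erm_value = erm_objective(losses)
cvar_value = cvar_objective(losses, alpha=0.5)
print(erm_value, cvar_value)
```

The CVaR value is never smaller than the ERM value on the same losses, which is the sense in which such objectives are pessimistic: they optimize for a worst-case subpopulation rather than the average.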
Taxonomy
Topics: Insurance, Mortality, Demography, Risk Management · demographic modeling and climate adaptation · Big Data Technologies and Applications
