Benchmarking Distribution Shift in Tabular Data with TableShift

Josh Gardner; Zoran Popovic; Ludwig Schmidt

arXiv:2312.07577 · cs.LG · February 12, 2024 · 5 cites

TL;DR

TableShift is a benchmark for evaluating how robust tabular-data models are to distribution shift across diverse real-world domains, enabling systematic comparison of model performance under shift.

Contribution

The paper introduces TableShift, the first comprehensive benchmark for distribution shift in tabular data, including diverse tasks, data sources, and shifts, along with a large-scale evaluation of models.

Findings

01. Domain robustness methods can reduce shift gaps but lower in-distribution accuracy.

02. A linear relationship exists between in-distribution and out-of-distribution accuracy.

03. Shift gaps correlate with changes in label distribution.
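
The quantities named in these findings can be sketched in a few lines of Python. This is a minimal illustration, not the paper's evaluation code: the function names (`shift_gap`, `label_shift`, `fit_id_ood_line`) are hypothetical, the shift gap is assumed to mean in-distribution accuracy minus out-of-distribution accuracy, the label-distribution change is assumed to be the absolute difference in positive-label rate for a binary task, and the accuracies are made-up numbers rather than benchmark results.

```python
import numpy as np


def shift_gap(id_acc, ood_acc):
    """Assumed definition: in-distribution minus out-of-distribution accuracy."""
    return id_acc - ood_acc


def label_shift(y_id, y_ood):
    """Absolute change in positive-label rate between two binary label arrays."""
    return abs(float(np.mean(y_id)) - float(np.mean(y_ood)))


def fit_id_ood_line(id_accs, ood_accs):
    """Least-squares fit of ood = slope * id + intercept, plus Pearson r."""
    slope, intercept = np.polyfit(id_accs, ood_accs, deg=1)
    r = np.corrcoef(id_accs, ood_accs)[0, 1]
    return float(slope), float(intercept), float(r)


# Illustrative, made-up accuracies for five hypothetical models on one task.
id_accs = np.array([0.70, 0.75, 0.80, 0.85, 0.90])
ood_accs = 0.9 * id_accs - 0.05  # constructed to lie exactly on a line

slope, intercept, r = fit_id_ood_line(id_accs, ood_accs)
gaps = shift_gap(id_accs, ood_accs)

# Toy binary labels: 75% positive in-distribution, 25% positive out-of-distribution.
delta_py = label_shift(np.array([0, 1, 1, 1]), np.array([0, 0, 0, 1]))
```

With real results, `id_accs` and `ood_accs` would hold one entry per trained model, and `r` close to 1 would indicate the kind of linear trend described in finding 02. Comparing `gaps` against `delta_py` across tasks is one simple way to probe the correlation in finding 03.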

Abstract

Robustness to distribution shift has become a growing concern for text and image models as they transition from research subjects to deployment in the real world. However, high-quality benchmarks for distribution shift in tabular machine learning tasks are still lacking despite the widespread real-world use of tabular data and differences in the models used for tabular data in comparison to text and images. As a consequence, the robustness of tabular models to distribution shift is poorly understood. To address this issue, we introduce TableShift, a distribution shift benchmark for tabular data. TableShift contains 15 binary classification tasks in total, each with an associated shift, and includes a diverse set of data sources, prediction targets, and distribution shifts. The benchmark covers domains including finance, education, public policy, healthcare, and civic participation, and…

Code & Models

Repositories

mlfoundations/tableshift (PyTorch, official)

Videos

Benchmarking Distribution Shift in Tabular Data with TableShift (SlidesLive)

Taxonomy

Topics: Machine Learning in Healthcare · Artificial Intelligence in Healthcare

Methods: Sparse Evolutionary Training