ODD: Overlap-aware Estimation of Model Performance under Distribution Shift

Aayush Mishra; Anqi Liu

arXiv:2506.14978 · cs.LG · June 19, 2025


Open Access

TL;DR

This paper introduces ODD, a method for estimating a model's error under distribution shift that focuses the disagreement objective on the regions where the source and target domains do not overlap, giving more accurate error estimates while keeping the bounds reliable.

Contribution

The paper proposes Overlap-aware Disagreement Discrepancy (ODD), which refines the disagreement discrepancy (DIS^2) error bound by accounting for the region where the source and target domains overlap, so that the disagreement objective no longer competes in that region.

Findings

01

ODD outperforms DIS^2 in predicting target error.

02

ODD provides more reliable error bounds across benchmarks.

03

The method estimates the overlap between source and target domains using domain classifiers.
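The third finding can be illustrated with a minimal sketch. It assumes a domain classifier has already been trained to output the probability that a sample comes from the target domain, and it treats a target sample as lying in the overlap region when that probability falls below a cutoff. The function names, the cutoff value of 0.9, and the thresholding rule itself are hypothetical choices for illustration, not the procedure from the paper.

```python
from typing import List, Sequence


def overlap_mask(p_target: Sequence[float], threshold: float = 0.9) -> List[bool]:
    """Flag samples the domain classifier cannot confidently call target-only.

    p_target: hypothetical domain-classifier outputs, P(domain = target | x).
    threshold: assumed cutoff; samples below it are treated as overlapping.
    """
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")
    return [p < threshold for p in p_target]


def overlap_fraction(p_target: Sequence[float], threshold: float = 0.9) -> float:
    """Fraction of samples flagged as lying in the source/target overlap."""
    mask = overlap_mask(p_target, threshold)
    return sum(mask) / len(mask) if mask else 0.0


# Example: two of four target samples look like they could be source data.
probs = [0.50, 0.95, 0.60, 0.99]
print(overlap_mask(probs))
print(overlap_fraction(probs))
```

A hard threshold is only one option. A soft weighting by the classifier probability would avoid choosing a cutoff, at the cost of relying more heavily on the classifier's calibration.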

Abstract

Reliable and accurate estimation of the error of an ML model in unseen test domains is an important problem for safe intelligent systems. Prior work uses disagreement discrepancy (DIS^2) to derive practical error bounds under distribution shifts. It optimizes for a maximally disagreeing classifier on the target domain to bound the error of a given source classifier. Although this approach offers a reliable and competitively accurate estimate of the target error, we identify a problem in this approach which causes the disagreement discrepancy objective to compete in the overlapping region between source and target domains. With an intuitive assumption that the target disagreement should be no more than the source disagreement in the overlapping region due to high enough support, we devise Overlap-aware Disagreement Discrepancy (ODD). Maximizing ODD only requires disagreement in the…
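The disagreement-based estimate described in the abstract can be sketched as follows. The sketch assumes the discrepancy is the target disagreement rate minus the source disagreement rate between the given classifier and a second "critic" classifier, and that the target error estimate is the source error plus that discrepancy. It omits any finite-sample correction term and does not implement ODD itself. All function names are hypothetical, and the critic's predictions are taken as given rather than optimized.

```python
from typing import Sequence


def disagreement_rate(preds_h: Sequence[int], preds_critic: Sequence[int]) -> float:
    """Fraction of samples on which the two classifiers predict different labels."""
    if len(preds_h) != len(preds_critic) or len(preds_h) == 0:
        raise ValueError("prediction lists must be non-empty and equally long")
    return sum(a != b for a, b in zip(preds_h, preds_critic)) / len(preds_h)


def disagreement_discrepancy(src_h, src_critic, tgt_h, tgt_critic) -> float:
    """Assumed form: target disagreement minus source disagreement."""
    return disagreement_rate(tgt_h, tgt_critic) - disagreement_rate(src_h, src_critic)


def target_error_estimate(src_error: float, discrepancy: float) -> float:
    """Source error plus discrepancy, clipped to the valid range [0, 1]."""
    return min(1.0, max(0.0, src_error + discrepancy))


# Toy predictions: the critic disagrees on 1 of 4 source and 2 of 4 target samples.
src_h, src_critic = [0, 1, 1, 0], [0, 1, 1, 1]
tgt_h, tgt_critic = [0, 1, 1, 0], [1, 0, 1, 0]

disc = disagreement_discrepancy(src_h, src_critic, tgt_h, tgt_critic)
print(disc)                              # 0.25
print(target_error_estimate(0.1, disc))  # approximately 0.35
```

In the approach the abstract describes, the critic is found by optimization so that it disagrees as much as possible on the target domain. Here it is fixed so that the arithmetic stays easy to follow.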


Taxonomy

Topics: Traffic Prediction and Management Techniques