Test-Time Adaptation Induces Stronger Accuracy and Agreement-on-the-Line
Eungyeup Kim, Mingjie Sun, Christina Baek, Aditi Raghunathan, J. Zico Kolter

TL;DR
Test-time adaptation significantly enhances the correlation between in-distribution and out-of-distribution accuracy and agreement, enabling more reliable OOD performance estimation and model selection without labeled data.
Contribution
This paper demonstrates that recent TTA methods strengthen accuracy-on-the-line and agreement-on-the-line trends, even under challenging distribution shifts, and explains the underlying theoretical conditions.
Findings
TTA improves OOD performance and strengthens ACL and AGL trends.
TTA collapses complex distribution shifts into ones that can be expressed by a single scaling variable in feature space.
Combining TTA with AGL allows high-precision OOD performance estimation.
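The label-free estimation idea behind the last finding can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes that accuracies and agreement rates are compared on a probit (inverse normal CDF) scale, and that the line fitted to ID versus OOD agreement between model pairs is reused to map each model's labeled ID accuracy to an estimated OOD accuracy. The function names (probit, agreement, estimate_ood_accuracy) and the synthetic numbers are hypothetical.

```python
from statistics import NormalDist

import numpy as np

_STD_NORMAL = NormalDist()  # standard normal, used for the probit transform


def probit(p):
    """Inverse standard-normal CDF, elementwise; values must lie strictly in (0, 1)."""
    return np.array([_STD_NORMAL.inv_cdf(float(x)) for x in np.atleast_1d(p)])


def inv_probit(z):
    """Standard-normal CDF, elementwise; maps probit-scale values back to rates."""
    return np.array([_STD_NORMAL.cdf(float(x)) for x in np.atleast_1d(z)])


def agreement(preds_a, preds_b):
    """Fraction of inputs on which two models predict the same class (no labels needed)."""
    return float(np.mean(np.asarray(preds_a) == np.asarray(preds_b)))


def estimate_ood_accuracy(id_agreement, ood_agreement, id_accuracy):
    """Fit a line to probit-scaled ID vs OOD agreement, then apply it to ID accuracy.

    id_agreement, ood_agreement: agreement rates of the same model pairs on ID and OOD data.
    id_accuracy: labeled ID accuracy of the models whose OOD accuracy is wanted.
    Assumption: the agreement line and the accuracy line share slope and intercept.
    """
    slope, intercept = np.polyfit(probit(id_agreement), probit(ood_agreement), deg=1)
    return inv_probit(slope * probit(id_accuracy) + intercept)


# Synthetic demo: agreement rates that lie exactly on a line in probit space.
id_agr = np.array([0.60, 0.70, 0.80, 0.90])
ood_agr = inv_probit(0.8 * probit(id_agr) - 0.5)
print(estimate_ood_accuracy(id_agr, ood_agr, [0.85, 0.92]))
```

The estimate is only as good as the linear fit, which is why stronger AGL trends after adaptation would matter: where agreement does not fall on a line, the extrapolated accuracy has no such grounding.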
Abstract
Recently, Miller et al. (2021) and Baek et al. (2022) empirically demonstrated strong linear correlations between in-distribution (ID) versus out-of-distribution (OOD) accuracy and agreement. These trends, coined accuracy-on-the-line (ACL) and agreement-on-the-line (AGL), enable OOD model selection and performance estimation without labeled data. However, these phenomena also break for certain shifts, such as CIFAR10-C Gaussian Noise, posing a critical bottleneck. In this paper, we make a key finding that recent test-time adaptation (TTA) methods not only improve OOD performance, but drastically strengthen the ACL and AGL trends in models, even in shifts where models showed very weak correlations before. To analyze this, we revisit the theoretical conditions from Miller et al. (2021) that outline the types of distribution shifts needed for perfect ACL in linear models. Surprisingly,…
