Benchmarking Test-Time Adaptation against Distribution Shifts in Image Classification
Yongcan Yu, Lijun Sheng, Ran He, Jian Liang

TL;DR
This paper introduces a comprehensive benchmark for evaluating 13 test-time adaptation methods across multiple image classification datasets and network architectures, addressing the need for fair and consistent comparison of robustness techniques under distribution shifts.
Contribution
The paper presents a unified PyTorch framework and systematic evaluation of diverse TTA methods on various datasets and backbones, filling a gap in standardized benchmarking.
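The idea of evaluating every method on every dataset and backbone under one protocol can be sketched as a small grid harness. This is a minimal illustration in plain Python, not the paper's PyTorch framework: the names `run_grid`, `error_rate`, and the toy methods, backbones, and dataset are all hypothetical stand-ins, with scalar features in place of images.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Example = Tuple[float, int]                  # (input, label); toy stand-in for (image, class)
Backbone = Callable[[float], float]          # maps an input to a feature
Method = Callable[[List[float]], List[int]]  # maps unlabeled features to predictions


def error_rate(preds: List[int], labels: List[int]) -> float:
    """Fraction of predictions that differ from the labels."""
    return sum(p != y for p, y in zip(preds, labels)) / len(labels)


def run_grid(
    methods: Dict[str, Method],
    backbones: Dict[str, Backbone],
    datasets: Dict[str, List[Example]],
) -> Dict[Tuple[str, str, str], float]:
    """Evaluate every (method, backbone, dataset) cell with the same protocol."""
    results = {}
    for m, b, d in product(methods, backbones, datasets):
        inputs = [x for x, _ in datasets[d]]
        labels = [y for _, y in datasets[d]]
        features = [backbones[b](x) for x in inputs]
        # Methods only ever see features, never labels.
        results[(m, b, d)] = error_rate(methods[m](features), labels)
    return results


# Hypothetical toy entries, only to exercise the harness.
methods = {
    "threshold": lambda feats: [int(f > 0) for f in feats],
    "always_zero": lambda feats: [0 for _ in feats],
}
backbones = {"identity": lambda x: x, "negate": lambda x: -x}
datasets = {"toy": [(-1.0, 0), (-2.0, 0), (1.0, 1), (2.0, 1)]}

results = run_grid(methods, backbones, datasets)
for cell, err in sorted(results.items()):
    print(cell, err)
```

Keeping the loop and the metric in one place means that differences between cells come from the method, backbone, or dataset rather than from the evaluation code, which is the kind of consistency a shared benchmark is meant to provide.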
Findings
Performance varies significantly across TTA methods.
The compatibility of TTA methods with different network backbones is analyzed.
The benchmark provides a reliable comparison platform for future research.
Abstract
Test-time adaptation (TTA) is a technique aimed at enhancing the generalization performance of models by leveraging unlabeled samples solely during prediction. Given the need for robustness in neural network systems when faced with distribution shifts, numerous TTA methods have recently been proposed. However, these methods are often evaluated under different settings, such as varying distribution shifts, backbones, and design scenarios, leading to a lack of consistent and fair benchmarks to validate their effectiveness. To address this issue, we present a benchmark that systematically evaluates 13 prominent TTA methods and their variants on five widely used image classification datasets: CIFAR-10-C, CIFAR-100-C, ImageNet-C, DomainNet, and Office-Home. These methods encompass a wide range of adaptation scenarios (e.g., online adaptation vs. offline adaptation, instance…
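To make "leveraging unlabeled samples solely during prediction" concrete, here is a toy sketch of online adaptation in plain Python. It is an assumption-laden illustration, not one of the benchmarked methods: a fixed one-dimensional classifier is applied to a shifted test stream, and the adapted variant re-centers each unlabeled batch on its own mean, loosely analogous to re-estimating normalization statistics at test time. All names and numbers are hypothetical.

```python
from statistics import mean
from typing import List


def predict(features: List[float]) -> List[int]:
    """Fixed source classifier: class 1 if the feature is positive, else class 0."""
    return [int(f > 0) for f in features]


def predict_with_adaptation(batch: List[float]) -> List[int]:
    """Re-center the unlabeled test batch on its own mean, then classify."""
    mu = mean(batch)
    return predict([f - mu for f in batch])


def accuracy(preds: List[int], labels: List[int]) -> float:
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Source-like features sit at -1 (class 0) and +1 (class 1).
# The test stream is shifted by +3, a toy stand-in for a distribution shift.
shift = 3.0
labels = [0, 1, 0, 1, 1, 0, 0, 1]
test_features = [(1.0 if y else -1.0) + shift for y in labels]

# Online protocol: process the stream batch by batch.
# Labels are used only for scoring afterwards, never for adaptation.
batch_size = 4
plain: List[int] = []
adapted: List[int] = []
for start in range(0, len(test_features), batch_size):
    batch = test_features[start:start + batch_size]
    plain.extend(predict(batch))
    adapted.extend(predict_with_adaptation(batch))

acc_plain = accuracy(plain, labels)
acc_adapted = accuracy(adapted, labels)
print(f"without adaptation: {acc_plain:.2f}, with adaptation: {acc_adapted:.2f}")
```

The re-centering only helps here because each batch is class-balanced, so the batch mean estimates the shift exactly. With imbalanced or very small batches the same trick can hurt, which is one reason results depend on the adaptation scenario being evaluated.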
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Neonatal and fetal brain pathology
