Active Test-Time Adaptation: Theoretical Analyses and An Algorithm
Shurui Gui, Xiner Li, Shuiwang Ji

TL;DR
This paper introduces active test-time adaptation (ATTA), integrating active learning into test-time adaptation to improve performance under domain shifts, supported by theoretical analysis and a practical algorithm.
Contribution
It proposes the novel ATTA problem setting, provides theoretical guarantees, and develops the SimATTA algorithm for effective real-time sample selection.
Findings
ATTA improves test performance over traditional TTA methods.
SimATTA achieves substantial performance gains while remaining efficient.
Theoretical analysis confirms the benefits of limited labeled test instances.
Abstract
Test-time adaptation (TTA) addresses distribution shifts for streaming test data in unsupervised settings. Currently, most TTA methods can only deal with minor shifts and rely heavily on heuristic and empirical studies. To advance TTA under domain shifts, we propose the novel problem setting of active test-time adaptation (ATTA) that integrates active learning within the fully TTA setting. We provide a learning theory analysis, demonstrating that incorporating limited labeled test instances enhances overall performances across test domains with a theoretical guarantee. We also present a sample entropy balancing for implementing ATTA while avoiding catastrophic forgetting (CF). We introduce a simple yet effective ATTA algorithm, known as SimATTA, using real-time sample selection techniques. Extensive experimental results confirm consistency with our theoretical analyses and show that…
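The ATTA setting described in the abstract can be illustrated with a minimal sketch: for each incoming test batch, the model scores its own uncertainty, asks an oracle to label a small budget of the most uncertain samples, and adapts on the labels collected so far. This is generic entropy-based uncertainty sampling, not the SimATTA algorithm itself, whose selection and balancing details are not given here. All names (`predictive_entropy`, `select_for_labeling`, `active_tta_stream`, `predict`, `oracle`, `update`, `threshold`, `budget_per_batch`) are hypothetical and chosen for illustration only.

```python
import math
from typing import Any, Callable, Iterable, List, Sequence, Tuple


def predictive_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_labeling(
    batch_probs: Sequence[Sequence[float]], threshold: float, budget: int
) -> List[int]:
    """Indices of at most `budget` samples whose entropy is >= `threshold`,
    ordered from most to least uncertain (ties broken by position)."""
    scored = [(predictive_entropy(p), i) for i, p in enumerate(batch_probs)]
    candidates = [(h, i) for h, i in scored if h >= threshold]
    candidates.sort(key=lambda t: (-t[0], t[1]))
    return [i for _, i in candidates[:budget]]


def active_tta_stream(
    batches: Iterable[Sequence[Any]],
    predict: Callable[[Any], Sequence[float]],   # assumed: returns class probabilities
    oracle: Callable[[Any], Any],                # assumed: returns the true label
    update: Callable[[List[Tuple[Any, Any]]], None],  # assumed: one adaptation step
    threshold: float,
    budget_per_batch: int,
) -> List[Tuple[Any, Any]]:
    """Process a test stream, querying labels only for uncertain samples."""
    labeled: List[Tuple[Any, Any]] = []
    for batch in batches:
        probs = [predict(x) for x in batch]
        picked = select_for_labeling(probs, threshold, budget_per_batch)
        new_pairs = [(batch[i], oracle(batch[i])) for i in picked]
        labeled.extend(new_pairs)
        if new_pairs:
            # Adapt on every label gathered so far, not only the newest ones;
            # this is one simple way to limit forgetting in this sketch.
            update(labeled)
    return labeled
```

In this sketch the labeling cost is bounded by `budget_per_batch` per batch, and confident batches trigger no oracle queries and no model update at all.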
Peer Reviews
Decision·ICLR 2024 poster
**Originality**: The paper introduces a novel setting, active test-time adaptation (ATTA), with theoretical guarantees for alleviating distribution shifts and mitigating catastrophic forgetting, along with extensive experiments on several benchmarks under domain generalization shifts. Additionally, the FAQ & Discussions section in the supplementary material is highly praiseworthy. **Quality**: The paper provides a thorough experimental evaluation of the SimATTA algorithm on four datasets (PACS, VLCS, Office
**Insufficient visualization**: Though the authors have provided detailed algorithms (Alg. 1 and Alg. 2) to show the proposed SimATTA algorithm, it is still hard to follow the whole picture quickly. Thus it could be better to provide a clear diagram to illustrate the framework of the SimATTA algorithm. **Insufficient justifications**: For example, regarding the **efficiency** and **applicability** of ATTA, some justifications are missing in this paper. First, as shown in Tab. 3, the time cost o
**Novelty and significance**: In my opinion, the empirical results, especially addressing RQ1, clearly set this paper apart from previous research, paving the way to overcome distribution shifts under the TTA setting. The proposed algorithm, SimATTA, significantly surpasses the existing TTA algorithms in terms of performance accuracy under distribution shifts. It also appears to exhibit greater resilience to catastrophic forgetting. Additionally, it is supported by a robust theoretical fr
I found no major weakness in this paper. One minor aspect I would like to highlight concerns the clarity of the experimental settings, such as domain-wise data stream, random stream, post-adaptation, and so on. It took me some time to grasp all of these distinct settings. Perhaps including a dedicated section to explain these settings would be more helpful.
1. The concept of actively adapting a model during test-time based on interactions with an oracle is innovative. This breaks away from the conventional train-test paradigm, paving the way for more dynamic and adaptive models.
2. The authors provide a theoretical foundation for the ATTA framework, making a compelling case for its viability and potential benefits.
3. The proposed method is model-agnostic, meaning it can be applied to a wide range of machine learning algorithms, from simple linea
1. The ATTA framework's effectiveness hinges on the availability and accuracy of an oracle. In real-world scenarios, obtaining such an oracle (especially a human expert) might be challenging, time-consuming, or expensive.
2. While the approach shows promise on benchmark datasets, its scalability to very large datasets or real-world applications remains untested. The computational overhead of deciding which instances to query and updating the model during test-time could be prohibitive in some s
Taxonomy
Topics: Real-time simulation and control systems · Advanced Vision and Imaging · Iterative Learning Control Systems
