Analytic Continual Test-Time Adaptation for Multi-Modality Corruption
Yufei Zhang, Yicheng Xu, Hongxin Wei, Zhiping Lin, Xiaofeng Zou, Cen Chen, Huiping Zhuang

TL;DR
This paper introduces MDAA, a novel method for multi-modal continual test-time adaptation that effectively handles domain shifts and multi-modal corruptions by mitigating catastrophic forgetting and dynamically fusing modalities.
Contribution
The paper proposes MDAA, combining analytic learning with dynamic late fusion, to improve multi-modal test-time adaptation under domain shifts and corruptions.
Findings
MDAA achieves state-of-the-art results on multi-modal TTA tasks.
The analytic classifiers effectively reduce catastrophic forgetting.
Dynamic late fusion improves reliability in multi-modal integration.
Abstract
Test-Time Adaptation (TTA) enables pre-trained models to bridge the gap between source and target datasets using unlabeled test data, addressing domain shifts caused by corruptions such as weather changes, noise, or sensor malfunctions at test time. Multi-Modal Continual Test-Time Adaptation (MM-CTTA), an extension of standard TTA, further allows models to handle multi-modal inputs and adapt to continuously evolving target domains. However, MM-CTTA faces critical challenges such as catastrophic forgetting and reliability bias, which are rarely addressed effectively under multi-modal corruption scenarios. In this paper, we propose a novel approach, Multi-modality Dynamic Analytic Adapter (MDAA), to tackle MM-CTTA tasks. MDAA introduces analytic learning, a closed-form training technique, through Analytic Classifiers (ACs) to mitigate catastrophic forgetting. Furthermore, we design the…
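To make "closed-form training" concrete, here is a minimal sketch of a generic recursive ridge-regression classifier, the usual building block of analytic learning. It is an illustration under stated assumptions, not MDAA's actual formulation: the class name `RecursiveRidgeClassifier`, the regularizer `gamma`, and the random features standing in for encoder outputs are all hypothetical. The point it shows is that updating on data chunk by chunk yields the same weights as solving ridge regression once on all the data, which is why this family of methods does not overwrite earlier knowledge.

```python
import numpy as np

class RecursiveRidgeClassifier:
    """Hypothetical illustration of an analytic (closed-form) classifier."""

    def __init__(self, feat_dim, num_classes, gamma=1.0):
        # R tracks the inverse of (X^T X + gamma * I) over all data seen so far.
        self.R = np.eye(feat_dim) / gamma
        self.W = np.zeros((feat_dim, num_classes))

    def update(self, X, Y):
        # Woodbury identity: fold the new chunk into R without revisiting old data.
        K = np.linalg.inv(np.eye(X.shape[0]) + X @ self.R @ X.T)
        self.R = self.R - self.R @ X.T @ K @ X @ self.R
        # Correct the weights by the residual on the new chunk.
        self.W = self.W + self.R @ X.T @ (Y - X @ self.W)

    def predict(self, X):
        return (X @ self.W).argmax(axis=1)


# Assumed toy data: random features stand in for frozen-encoder outputs.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))
Y = np.eye(3)[rng.integers(0, 3, size=60)]  # one-hot targets

clf = RecursiveRidgeClassifier(feat_dim=8, num_classes=3, gamma=1.0)
for start in range(0, 60, 20):  # three sequential chunks
    clf.update(X[start:start + 20], Y[start:start + 20])

# Joint ridge solution computed on all the data at once, for comparison.
W_batch = np.linalg.solve(X.T @ X + 1.0 * np.eye(8), X.T @ Y)
matches_batch = np.allclose(clf.W, W_batch)
```

In a test-time setting the targets `Y` would have to come from pseudo-labels rather than ground truth; this sketch does not model that step, nor the fusion of several modalities.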
Peer Reviews
Decision·Submitted to ICLR 2025
- Continual domain shift poses real challenges to existing deep learning models. Tackling continual domain shift could improve the robustness and generalization of deep learning models. - Existing test-time adaptation methods have not addressed the unique challenges of multi-modality corruptions. This paper explores an important but largely overlooked problem.
- The proposed test-time adaptation method simply optimizes a ridge regression problem on both source and target data, as per Eq. (12). The innovation is similar to adding a cross-entropy loss with pseudo labels on target data in regular classification tasks. It is unclear what the main technical challenges are in deploying test-time adaptation methods to multi-modality data. - Although the paper is motivated by the continual domain shift challenges, the proposed method does not specif
1. The paper has a clear motivation, clearly describes the existing challenges in the field, and proposes innovative solutions to address them. 2. The proposed method adapts efficiently: it can adapt without accessing the source data, saving computing resources and protecting data privacy. 3. The method can handle multi-modal inputs and has more practical application potential than single-modal methods. 4. The writing is coherent and logica
1. An analysis of the differences between the proposed method and existing methods, as well as of the limitations of the analysis method, is missing. 2. The authors propose three modules (ACs, DSM, and SPS). However, the ablation experiments omit the combinations ACs + SPS and DSM + SPS. 3. The running time of the proposed method is also an important part of verifying its superiority. Please supplement the running time of existing
1. The majority of work in the continual test-time adaptation area focuses on a single modality. While some works exist in the multi-modal space, they do not address the catastrophic forgetting issue well. MDAA introduces a recursive learning scheme tailored to this issue, which makes MDAA highly relevant for practical applications. 2. Extensive experimental results on multiple scenarios are provided.
1. The classifiers in MDAA are fitted using a least squares loss function. It is well-known that models trained with least squares loss can lack robustness to outliers and are often outperformed by models trained with cross-entropy loss. It would be useful to see a comparison of the source model performance when trained with cross-entropy loss versus least squares loss. 2. The formulation restricts adaptation only to the classification layer. It has been shown in [1] how adapting the encoders
Taxonomy
Topics: Non-Destructive Testing Techniques
Methods: Adapter
