Learning to Generate Gradients for Test-Time Adaptation via Test-Time Training Layers
Qi Deng, Shuaicheng Niu, Ronghao Zhang, Yaofo Chen, Runhao Zeng, Jian, Chen, Xiping Hu

TL;DR
This paper introduces a learned optimizer called Meta Gradient Generator (MGG) for test-time adaptation, which effectively utilizes historical gradient information to improve model robustness and speed in adapting to new data distributions.
Contribution
The paper proposes a novel learned optimizer with a gradient memory layer for test-time adaptation, outperforming prior methods in accuracy and efficiency.
Findings
Surpasses state-of-the-art on ImageNet-C, R, Sketch, and A.
Achieves 7.4% accuracy improvement over previous SOTA.
Provides faster adaptation with fewer data and iterations.
Abstract
Test-time adaptation (TTA) aims to fine-tune a trained model online using unlabeled testing data to adapt to new environments or out-of-distribution data, demonstrating broad application potential in real-world scenarios. However, in this optimization process, unsupervised learning objectives like entropy minimization frequently encounter noisy learning signals. These signals produce unreliable gradients, which hinder the model ability to converge to an optimal solution quickly and introduce significant instability into the optimization process. In this paper, we seek to resolve these issues from the perspective of optimizer design. Unlike prior TTA using manually designed optimizers like SGD, we employ a learning-to-optimize approach to automatically learn an optimizer, called Meta Gradient Generator (MGG). Specifically, we aim for MGG to effectively utilize historical gradient…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
Taxonomy
TopicsModel Reduction and Neural Networks · Domain Adaptation and Few-Shot Learning · Advanced Vision and Imaging
MethodsSPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Stochastic Gradient Descent
