GeneralizeFormer: Layer-Adaptive Model Generation across Test-Time Distribution Shifts
Sameer Ambekar, Zehao Xiao, Xiantong Zhen, Cees G. M. Snoek

TL;DR
GeneralizeFormer is a lightweight, transformer-based approach that dynamically generates layer parameters during inference to improve test-time domain generalization without fine-tuning, effectively handling multiple distribution shifts.
Contribution
The paper introduces a method that generates layer parameters on the fly with a meta-learned transformer, avoiding fine-tuning and improving adaptability to distribution shifts.
Findings
Outperforms existing methods on six domain generalization datasets.
Effectively handles multiple target distributions without fine-tuning.
Reduces computational and time cost by fixing the convolutional parameters.
Abstract
We consider the problem of test-time domain generalization, where a model is trained on several source domains and adjusted on target domains never seen during training. Different from the common methods that fine-tune the model or adjust the classifier parameters online, we propose to generate multiple layer parameters on the fly during inference by a lightweight meta-learned transformer, which we call GeneralizeFormer. The layer-wise parameters are generated per target batch without fine-tuning or online adjustment. By doing so, our method is more effective in dynamic scenarios with multiple target distributions and also avoids forgetting valuable source distribution characteristics. Moreover, by considering layer-wise gradients, the proposed method adapts itself to various distribution shifts. To reduce the computational and time cost, we fix the convolutional parameters…
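The idea of generating parameters per target batch, rather than updating stored ones, can be illustrated with a minimal sketch. This is not the paper's transformer. The generator below is a hand-written stand-in rule that rescales and recenters a single affine layer from target-batch statistics, and every name in it (`SOURCE_PARAMS`, `generate_layer_params`, `apply_layer`, `predict`) is hypothetical. It ignores layer-wise gradients entirely. What it does show is the stateless pattern the abstract describes: the source parameters are only read, and each batch gets its own freshly generated parameters.

```python
from statistics import fmean, pstdev

EPS = 1e-5

# Hypothetical source-trained affine parameters of one normalization layer.
# They stay frozen at test time: nothing below writes to this dict.
SOURCE_PARAMS = {"scale": [1.0, 1.0], "shift": [0.0, 0.0]}


def generate_layer_params(source_params, batch):
    """Stand-in for the meta-learned generator (NOT the paper's transformer).

    Takes the frozen source parameters and one target batch and returns a
    fresh parameter dict intended for that batch only.
    """
    columns = list(zip(*batch))  # one tuple of values per feature
    scale, shift = [], []
    for g, b, col in zip(source_params["scale"], source_params["shift"], columns):
        mean, std = fmean(col), pstdev(col)
        s = g / (std + EPS)         # rescale using the target-batch spread
        scale.append(s)
        shift.append(b - mean * s)  # recenter using the target-batch mean
    return {"scale": scale, "shift": shift}


def apply_layer(params, batch):
    """Apply the per-feature affine layer with the given parameters."""
    return [
        [s * x + b for s, b, x in zip(params["scale"], params["shift"], row)]
        for row in batch
    ]


def predict(batch):
    """Generate parameters for this batch, use them once, then discard them."""
    params = generate_layer_params(SOURCE_PARAMS, batch)
    return apply_layer(params, batch)


# Two target batches drawn from visibly different distributions.
batch_a = [[0.0, 10.0], [2.0, 14.0], [4.0, 18.0]]
batch_b = [[100.0, -1.0], [110.0, 0.0], [120.0, 1.0]]

out_a = predict(batch_a)
out_b = predict(batch_b)
```

Because `predict` never writes to `SOURCE_PARAMS`, processing `batch_b` has no effect on how `batch_a` would be handled afterwards. This is the property the abstract contrasts with online fine-tuning, where earlier batches change the model seen by later ones.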
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification · Topic Modeling
Methods: Batch Normalization
