Weight Averaging for Out-of-Distribution Generalization and Few-Shot Domain Adaptation
Shijian Xu

TL;DR
This work improves out-of-distribution generalization and few-shot domain adaptation by combining weight averaging (WA) with sharpness-aware minimization (SAM), and adds a gradient similarity regularizer that increases diversity among the averaged models.
Contribution
It proposes a method that combines weight averaging (WA) and sharpness-aware minimization (SAM) with a gradient similarity regularizer, with the goal of improving robustness and adaptation under distribution shift.
Findings
Combining WA and SAM improves out-of-distribution generalization.
The gradient similarity regularizer increases model diversity.
The method significantly boosts few-shot domain adaptation accuracy.
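The summary does not state the exact form of the gradient similarity regularizer. As a minimal sketch only, the block below assumes it is the cosine similarity between the flattened gradients of two models, added to the task loss with a weight `lam`, so that minimizing the total loss pushes the gradients apart. The names `cosine_similarity`, `regularized_loss`, and `lam` are hypothetical and do not come from the paper.

```python
import math


def cosine_similarity(g1, g2, eps=1e-12):
    """Cosine similarity between two flattened gradient vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(g1, g2))
    n1 = math.sqrt(sum(a * a for a in g1))
    n2 = math.sqrt(sum(b * b for b in g2))
    # eps guards against division by zero when a gradient vanishes.
    return dot / (n1 * n2 + eps)


def regularized_loss(task_loss, g_self, g_other, lam=0.1):
    """Task loss plus an assumed penalty on gradient similarity.

    Assumption: aligned gradients are penalized, so two models trained
    jointly are encouraged to follow different directions.
    """
    return task_loss + lam * cosine_similarity(g_self, g_other)
```

Under this assumption, two models with orthogonal gradients pay no penalty, while two models with parallel gradients pay the full `lam`.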
Abstract
Empirical risk minimization (ERM) is not robust to changes in the distribution of data. When the distribution of test data is different from that of training data, the problem is known as out-of-distribution generalization. Recently, two techniques have been developed for addressing out-of-distribution generalization in computer vision: weight averaging (WA) and sharpness-aware minimization (SAM). WA involves training multiple models with different hyperparameters and then averaging the weights of these models, which can significantly improve out-of-distribution generalization performance. SAM optimizes a neural network to find minima in flat regions, which have been proven to perform well under distribution shifts. While these techniques have made great progress, there is still room for improvement and further exploration. In this thesis, we propose increasing the model diversity in WA…
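The two building blocks named in the abstract can be sketched in a few lines. This is an illustration under stated assumptions, not the thesis's implementation: weights are plain lists of floats, WA is taken to be a uniform average over models, and the SAM step follows the common two-pass form (perturb the weights along the normalized gradient by a radius `rho`, then descend using the gradient at the perturbed point). The function names, `lr`, and `rho` values are hypothetical.

```python
import math


def average_weights(models):
    """Uniformly average a list of weight dicts (parameter name -> list of floats)."""
    n = len(models)
    first = models[0]
    return {
        name: [sum(m[name][i] for m in models) / n for i in range(len(first[name]))]
        for name in first
    }


def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One SAM-style update on a flat weight vector.

    grad_fn maps a weight vector to the gradient of the loss at that point.
    """
    g = grad_fn(w)
    # Fall back to 1.0 when the gradient is zero, which leaves w unperturbed.
    norm = math.sqrt(sum(gi * gi for gi in g)) or 1.0
    # Ascent step: move to the (approximate) worst-case point within radius rho.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent step: update the original weights with the gradient taken at w_adv.
    g_adv = grad_fn(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]
```

For example, with the toy loss `sum(w_i ** 2)` the gradient function is `lambda w: [2 * wi for wi in w]`, and several independently trained weight dicts can be passed to `average_weights` to obtain a single averaged model.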
Taxonomy
Topics: Advanced Image Processing Techniques · Seismic Imaging and Inversion Techniques · Advanced Vision and Imaging
Methods: Sharpness-Aware Minimization · Weight Averaging
