Weight Averaging for Out-of-Distribution Generalization and Few-Shot Domain Adaptation

Shijian Xu

arXiv:2501.08361 · cs.CV · January 16, 2025

TL;DR

This paper explores enhancing out-of-distribution generalization and few-shot domain adaptation by combining weight averaging and sharpness-aware minimization, introducing gradient similarity regularization to increase model diversity.

Contribution

It proposes a method that combines weight averaging (WA) and sharpness-aware minimization (SAM) with gradient similarity regularization to improve robustness and adaptation under distribution shift.

Findings

01. Combining WA and SAM improves out-of-distribution generalization.

02. Gradient similarity regularizer increases model diversity.

03. Method significantly boosts few-shot domain adaptation accuracy.
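
This page does not give the exact form of the gradient similarity regularizer, so the following is only an illustrative sketch under stated assumptions: it assumes the similarity is measured as the cosine between the flattened gradients of two models on the same batch, and that this value is added to the task loss with a weight `lam`. The names `cosine_similarity`, `regularized_loss`, and `lam` are hypothetical and not taken from the paper.

```python
import math

def cosine_similarity(g1, g2, eps=1e-12):
    """Cosine of the angle between two flattened gradient vectors."""
    dot = sum(a * b for a, b in zip(g1, g2))
    n1 = math.sqrt(sum(a * a for a in g1))
    n2 = math.sqrt(sum(b * b for b in g2))
    return dot / (n1 * n2 + eps)

def regularized_loss(task_loss, g_self, g_other, lam=0.1):
    """Task loss plus a penalty on gradient alignment with another model.

    Assumption: penalizing alignment pushes the models toward different
    update directions, which is one plausible way to encourage diversity.
    """
    return task_loss + lam * cosine_similarity(g_self, g_other)
```

Under this assumed form, two models with orthogonal gradients incur no penalty, while models with identical gradient directions incur the full penalty `lam`.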

Abstract

Empirical risk minimization (ERM) is not robust to changes in the distribution of data. When the distribution of test data is different from that of training data, the problem is known as out-of-distribution generalization. Recently, two techniques have been developed for addressing out-of-distribution generalization in computer vision: weight averaging (WA) and sharpness-aware minimization (SAM). WA involves training multiple models with different hyperparameters and then averaging the weights of these models, which can significantly improve out-of-distribution generalization performance. SAM optimizes a neural network to find minima in flat regions, which have been proven to perform well under distribution shifts. While these techniques have made great progress, there is still room for improvement and further exploration. In this thesis, we propose increasing the model diversity in WA…
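
The two techniques described in the abstract can be sketched in a few lines. This is a minimal illustration, not the thesis implementation: it assumes each model is a dict of flat parameter lists, uses uniform averaging for WA, and uses the standard SAM update (perturb the weights along the normalized gradient by a radius `rho`, then descend with the gradient taken at the perturbed point). The helper names, `lr`, and `rho` are hypothetical.

```python
import math

def average_weights(models):
    """Uniformly average parameter dicts that share the same keys and lengths."""
    n = len(models)
    return {
        key: [sum(m[key][i] for m in models) / n
              for i in range(len(models[0][key]))]
        for key in models[0]
    }

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One SAM update on a flat weight list `w`.

    grad_fn(w) must return the loss gradient at w as a list.
    """
    g = grad_fn(w)
    norm = math.sqrt(sum(x * x for x in g)) + 1e-12
    # Ascent step: move to the approximate worst-case point within radius rho.
    w_adv = [wi + rho * gi / norm for wi, gi in zip(w, g)]
    # Descent step: apply the gradient computed at the perturbed weights.
    g_adv = grad_fn(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]
```

For example, with the toy loss `0.5 * ||w||^2` (whose gradient is `w` itself), `sam_step([1.0, 0.0], lambda w: list(w))` moves the weights toward the origin, and `average_weights` can then be applied to several such independently trained parameter dicts.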

Taxonomy

Topics: Advanced Image Processing Techniques · Seismic Imaging and Inversion Techniques · Advanced Vision and Imaging

Methods: Sharpness-Aware Minimization · Weight Averaging