Exploring Flat Minima for Domain Generalization with Large Learning Rates
Jian Zhang, Lei Qi, Yinghuan Shi, Yang Gao

TL;DR
This paper introduces a novel training strategy using large learning rates and weight interpolation to better find flat minima, improving domain generalization performance in classification and segmentation tasks.
Contribution
It proposes the Lookahead training strategy with weight interpolation to effectively identify flat minima using large learning rates, enhancing domain generalization.
Findings
Achieves state-of-the-art results on DG benchmarks.
Demonstrates effectiveness of large learning rates with weight interpolation.
Provides new insights into flat minima identification for DG.
Abstract
Domain Generalization (DG) aims to generalize to arbitrary unseen domains. A promising approach to improve model generalization in DG is the identification of flat minima. One typical method for this task is SWAD, which involves averaging weights along the training trajectory. However, the success of weight averaging depends on the diversity of weights, which is limited when training with a small learning rate. Instead, we observe that leveraging a large learning rate can simultaneously promote weight diversity and facilitate the identification of flat regions in the loss landscape. However, employing a large learning rate suffers from the convergence problem, which cannot be resolved by simply averaging the training weights. To address this issue, we introduce a training strategy called Lookahead, which involves weight interpolation, instead of averaging, between fast and slow…
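The interpolation between fast and slow weights mentioned in the abstract can be illustrated with a minimal sketch. The abstract is truncated here, so everything below is an assumption based on the generic Lookahead formulation rather than the authors' implementation: the function names, the toy quadratic loss, and the hyperparameters `lr`, `k`, and `alpha` are hypothetical and chosen only for illustration.

```python
import random

def interpolate(slow, fast, alpha):
    """Move the slow weights a fraction alpha of the way toward the fast weights."""
    return [s + alpha * (f - s) for s, f in zip(slow, fast)]

def toy_grad(w, rng, noise=0.1):
    # Noisy gradient of the toy loss 0.5 * ||w||^2 (an assumption for illustration).
    return [wi + rng.gauss(0.0, noise) for wi in w]

def lookahead_train(w0, lr=0.5, k=5, alpha=0.5, outer_steps=20, seed=0):
    # Hypothetical training loop: lr, k, and alpha are illustrative values,
    # not settings reported for the paper.
    rng = random.Random(seed)
    slow = list(w0)
    for _ in range(outer_steps):
        fast = list(slow)                      # fast weights restart from the slow weights
        for _ in range(k):                     # k inner SGD steps with a large learning rate
            g = toy_grad(fast, rng)
            fast = [wi - lr * gi for wi, gi in zip(fast, g)]
        slow = interpolate(slow, fast, alpha)  # interpolate instead of averaging
    return slow

if __name__ == "__main__":
    print(lookahead_train([5.0, -5.0]))
```

In this sketch the fast weights explore with the large learning rate, while the slow weights only move part of the way toward them after each round of `k` steps. A plain running average of the fast weights would weight every visited point equally, whereas the interpolation step repeatedly pulls the exploration back to a single anchor point.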
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Cancer-related molecular mechanisms research
Methods: Lookahead
