Exploring Flat Minima for Domain Generalization with Large Learning Rates

Jian Zhang; Lei Qi; Yinghuan Shi; Yang Gao

arXiv:2309.06337·cs.CV·September 13, 2023

TL;DR

This paper introduces a novel training strategy using large learning rates and weight interpolation to better find flat minima, improving domain generalization performance in classification and segmentation tasks.

Contribution

It proposes the Lookahead training strategy with weight interpolation to effectively identify flat minima using large learning rates, enhancing domain generalization.

Findings

01. Achieves state-of-the-art results on DG benchmarks.

02. Demonstrates the effectiveness of large learning rates combined with weight interpolation.

03. Provides new insights into flat minima identification for DG.

Abstract

Domain Generalization (DG) aims to generalize to arbitrary unseen domains. A promising approach to improving model generalization in DG is the identification of flat minima. One typical method for this task is SWAD, which averages weights along the training trajectory. However, the success of weight averaging depends on the diversity of the weights, which is limited when training with a small learning rate. Instead, we observe that a large learning rate can simultaneously promote weight diversity and facilitate the identification of flat regions in the loss landscape. Training with a large learning rate, however, suffers from a convergence problem that cannot be resolved by simply averaging the training weights. To address this issue, we introduce a training strategy called Lookahead, which involves weight interpolation, instead of averaging, between fast and slow…
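
The abstract is truncated here, so the paper's exact update rule is not visible on this page. As a rough illustration of the idea it names, the sketch below follows the generic Lookahead scheme: fast weights take several steps with a large learning rate, and slow weights are then interpolated toward them rather than averaged. The function names, the toy quadratic loss, and the values of `fast_lr`, `alpha`, and `k` are assumptions chosen for illustration, not settings from the paper.

```python
def lookahead_sgd(grad_fn, w0, fast_lr=0.9, alpha=0.5, k=5, outer_steps=20):
    """Generic Lookahead loop (illustrative): k fast SGD steps, then interpolate."""
    slow = list(w0)
    for _ in range(outer_steps):
        # Fast weights restart from the current slow weights.
        fast = list(slow)
        for _ in range(k):
            g = grad_fn(fast)
            # Plain SGD step with a deliberately large learning rate.
            fast = [w - fast_lr * gi for w, gi in zip(fast, g)]
        # Interpolation, not averaging: move slow weights a fraction alpha
        # of the way toward the fast weights.
        slow = [s + alpha * (f - s) for s, f in zip(slow, fast)]
    return slow


def quadratic_grad(w):
    # Hypothetical toy loss 0.5 * (1.0 * w0^2 + 2.0 * w1^2); returns its gradient.
    return [1.0 * w[0], 2.0 * w[1]]


w_final = lookahead_sgd(quadratic_grad, [3.0, -2.0])
print(w_final)
```

With `alpha=1.0` the slow weights simply copy the fast weights, which reduces the loop to ordinary SGD. Smaller values of `alpha` damp the oscillation that a large `fast_lr` causes along the steeper direction of the toy loss.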

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Cancer-related molecular mechanisms research

Methods: Lookahead