Towards Unified and Effective Domain Generalization

Yiyuan Zhang; Kaixiong Gong; Xiaohan Ding; Kaipeng Zhang; Fangrui Lv; Kurt Keutzer; Xiangyu Yue

arXiv:2310.10008 · cs.CV · October 17, 2023 · 1 cites

TL;DR

UniDG is a unified, inference-time finetuning framework that improves out-of-distribution generalization of foundation models across various architectures by unsupervised learning and a penalty to prevent catastrophic forgetting.

Contribution

The paper introduces an inference-stage finetuning method that penalizes parameter updates, improving domain generalization without a separate iterative training phase.

Findings

01

Average accuracy improvement of +5.4% on DomainBed

02

Effective across 12 diverse visual backbones

03

Reduces catastrophic forgetting during finetuning

Abstract

We propose UniDG, a novel and Unified framework for Domain Generalization that is capable of significantly enhancing the out-of-distribution generalization performance of foundation models regardless of their architectures. The core idea of UniDG is to finetune models during the inference stage, which saves the cost of iterative training. Specifically, we encourage models to learn the distribution of test data in an unsupervised manner and impose a penalty regarding the updating step of model parameters. The penalty term can effectively reduce the catastrophic forgetting issue as we would like to maximally preserve the valuable knowledge in the original model. Empirically, across 12 visual backbones, including CNN-, MLP-, and Transformer-based models, ranging from 1.89M to 303M parameters, UniDG shows an average accuracy improvement of +5.4%…
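The idea in the abstract (unsupervised adaptation on test data, plus a penalty on how far parameters move) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the unsupervised signal is prediction-entropy minimization, the penalty is a squared L2 distance to the frozen source weights, the model is a toy linear classifier, and the names `adapt_step`, `objective`, `drift`, and `lam` are hypothetical.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def logits_for(weights, x):
    # Toy linear classifier: one weight row per class.
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def drift(weights, source_weights):
    # Squared L2 distance between adapted and source weights.
    return sum((w - w0) ** 2
               for row, row0 in zip(weights, source_weights)
               for w, w0 in zip(row, row0))

def objective(weights, source_weights, batch, lam):
    # Assumed objective: mean prediction entropy + lam * drift from source.
    mean_h = sum(entropy(softmax(logits_for(weights, x))) for x in batch) / len(batch)
    return mean_h + lam * drift(weights, source_weights)

def adapt_step(weights, source_weights, batch, lam=0.1, lr=0.1):
    # Gradient of the penalty term: 2 * lam * (w - w0).
    grads = [[2.0 * lam * (w - w0) for w, w0 in zip(row, row0)]
             for row, row0 in zip(weights, source_weights)]
    for x in batch:
        probs = softmax(logits_for(weights, x))
        h = entropy(probs)
        for j, p in enumerate(probs):
            # d(entropy)/d(logit_j) = -p_j * (log p_j + H)
            dz = -p * (math.log(p) + h) if p > 0.0 else 0.0
            for d, xi in enumerate(x):
                grads[j][d] += dz * xi / len(batch)
    # One plain gradient-descent update on unlabeled test data.
    return [[w - lr * g for w, g in zip(row, grow)]
            for row, grow in zip(weights, grads)]

# Hypothetical source model and unlabeled test batch.
source_weights = [[1.0, 0.0], [0.0, 1.0]]
test_batch = [[0.6, 0.4], [0.3, 0.7], [0.8, 0.1]]

weights = [row[:] for row in source_weights]
before = objective(weights, source_weights, test_batch, lam=0.1)
for _ in range(20):
    weights = adapt_step(weights, source_weights, test_batch, lam=0.1, lr=0.1)
after = objective(weights, source_weights, test_batch, lam=0.1)
print(after < before, drift(weights, source_weights))
```

In this sketch, a larger `lam` keeps the adapted weights closer to the source weights, which is the mechanism the abstract credits with reducing catastrophic forgetting; the paper's actual loss and update rule may differ.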

Peer Reviews

Decision · ICLR 2024 Conference Withdrawn Submission

Reviewer 01 · Rating 3 · reject, not good enough · Confidence 4

Strengths

- The catastrophic forgetting issue during TTA for domain generalization is well motivated.

Weaknesses

- The discussion of related work is insufficient. The related work section simply lists many related works but does not discuss the relation between the proposed method and the works mentioned.
- This paper is more likely a Test-Time Domain Adaptation work, so I think Test-Time Domain Adaptation is a more suitable framing for this paper than Domain Generalization.
- I don't believe it is the first time to discuss the catastrophic forgetting issue during TTA f

Reviewer 02 · Rating 8 · accept, good paper · Confidence 3

Strengths

- Proposes a tradeoff between freezing the encoder, which would lead to underfitting, and updating the decoder, which would lead to catastrophic forgetting.
- Consistently improves benchmark results.

Weaknesses

- The theoretical insight into why marginal generalization is important for generalization to unseen domains is explained well in the Appendix but is very unclear from the main text of the paper. I think this important aspect should be better discussed in the main text.
- The motivation for the Differentiable Memory Bank should also be written more clearly.

Reviewer 03 · Rating 3 · reject, not good enough · Confidence 5

Strengths

In other words, marginal generalization is proposed to update the encoder during Test-Time Adaptation (TTA), and a differentiable memory bank is proposed to refine features for DG. Experiments on five datasets, including VLCS, PACS, and OfficeHome, demonstrate superiority over SOTA methods across 12 different network architectures.

Weaknesses

The structure of the paper does not fit most ML readers' habits; in particular, the related work section should follow the introduction. That ordering would give a better logical flow for most ML conference papers.

Code & Models

Repositories

invictus717/UniDG
PyTorch · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications