Rethinking the Evaluation Protocol of Domain Generalization

Han Yu; Xingxuan Zhang; Renzhe Xu; Jiashuo Liu; Yue He; Peng Cui

arXiv:2305.15253 · cs.LG · March 26, 2024 · 1 citation

TL;DR

This paper critically examines current domain generalization evaluation protocols, identifies potential data leakage issues, and proposes modifications such as self-supervised pretraining and multiple test domains for more accurate assessment.

Contribution

It introduces a revised evaluation protocol for domain generalization that reduces test data leakage and provides new leaderboards for fairer comparison of algorithms.

Findings

01. Current protocols may leak test data information.

02. Self-supervised pretraining improves evaluation fairness.

03. New leaderboards facilitate better benchmarking.

Abstract

Domain generalization aims to address Out-of-Distribution (OOD) generalization by leveraging common knowledge learned from multiple training domains to generalize to unseen test domains. Accurately evaluating OOD generalization ability requires that test data information be unavailable. However, the current domain generalization evaluation protocol may still leak test data information. This paper examines the risk of such leakage from two aspects of the current protocol: supervised pretraining on ImageNet and oracle model selection. We propose two modifications to the protocol: employ self-supervised pretraining or train from scratch instead of the current supervised pretraining, and use multiple test domains. These changes would result in a more precise evaluation of OOD generalization ability.…
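To make the model-selection concern concrete, here is a minimal sketch that contrasts two ways of picking a checkpoint. It is not taken from the paper or its repository. It assumes that "oracle model selection" means choosing the checkpoint by its accuracy on the test domain itself, as opposed to choosing it by accuracy on held-out splits of the training domains. The `Checkpoint` class, the helper functions, the domain names, and all accuracy values are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Checkpoint:
    name: str
    val_acc: float              # accuracy on held-out splits of the TRAINING domains
    test_acc: Dict[str, float]  # accuracy on each unseen test domain


def select_by_training_val(ckpts: List[Checkpoint]) -> Checkpoint:
    """Pick a checkpoint without looking at any test-domain data."""
    return max(ckpts, key=lambda c: c.val_acc)


def select_by_oracle(ckpts: List[Checkpoint], test_domain: str) -> Checkpoint:
    """Pick a checkpoint using test-domain accuracy (assumed meaning of 'oracle')."""
    return max(ckpts, key=lambda c: c.test_acc[test_domain])


def mean_test_acc(ckpt: Checkpoint) -> float:
    """Average accuracy over all test domains of one fixed checkpoint."""
    return sum(ckpt.test_acc.values()) / len(ckpt.test_acc)


# Hypothetical numbers, invented for this sketch; they are not results from the paper.
checkpoints = [
    Checkpoint("ckpt_a", 0.90, {"domain_x": 0.60, "domain_y": 0.70}),
    Checkpoint("ckpt_b", 0.85, {"domain_x": 0.72, "domain_y": 0.62}),
    Checkpoint("ckpt_c", 0.88, {"domain_x": 0.65, "domain_y": 0.74}),
]

chosen = select_by_training_val(checkpoints)
oracle_x = select_by_oracle(checkpoints, "domain_x")

print("training-domain validation picks:", chosen.name, chosen.test_acc)
print("oracle on domain_x picks:", oracle_x.name, oracle_x.test_acc)
print("mean test accuracy of the validation-selected checkpoint:", mean_test_acc(chosen))
```

In this toy setup the oracle choice reports a higher number on `domain_x` than the validation-selected checkpoint, but only because it consulted the test domain. Reporting one fixed checkpoint across several test domains is one plausible reading of the "multiple test domains" modification; the paper itself should be consulted for the exact protocol.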

Code & Models

Repositories

h-yu16/domainbed-v2
PyTorch · Official

Taxonomy

Topics: Model-Driven Software Engineering Techniques

Methods: Test