A Survey on Evaluation of Out-of-Distribution Generalization
Han Yu, Jiashuo Liu, Xingxuan Zhang, Jiayun Wu, Peng Cui

TL;DR
This paper provides a comprehensive review of methods for evaluating out-of-distribution generalization in machine learning, highlighting current paradigms, challenges, and future research directions.
Contribution
It is the first to systematically categorize and analyze existing OOD evaluation methods and discuss their implications and future prospects.
Findings
Categorizes OOD evaluation into three paradigms: OOD performance testing, OOD performance prediction, and OOD intrinsic property characterization.
Highlights the importance of evaluating where models generalize well or poorly.
Discusses OOD evaluation in the context of pretrained models.
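As a rough illustration of the performance testing paradigm and of asking where a model generalizes poorly, the sketch below computes accuracy separately per test environment and reports the gap between an IID environment and the worst shifted one. It is not taken from the survey: the function names, the environment labels ("iid", "shifted"), and the toy labels and predictions are all assumptions made for illustration.

```python
from collections import defaultdict


def per_environment_accuracy(y_true, y_pred, envs):
    """Return {environment: accuracy} for labelled test examples.

    `envs` holds one (hypothetical) environment tag per example.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, env in zip(y_true, y_pred, envs):
        total[env] += 1
        correct[env] += int(truth == pred)
    return {env: correct[env] / total[env] for env in total}


def summarize(acc_by_env, iid_env):
    """Compare the IID environment against the worst shifted environment."""
    ood = {env: acc for env, acc in acc_by_env.items() if env != iid_env}
    worst_env = min(ood, key=ood.get)
    return {
        "iid_accuracy": acc_by_env[iid_env],
        "worst_ood_env": worst_env,
        "worst_ood_accuracy": ood[worst_env],
        "iid_ood_gap": acc_by_env[iid_env] - ood[worst_env],
    }


# Toy data, invented for this example: four IID and four shifted examples.
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
envs = ["iid"] * 4 + ["shifted"] * 4

acc_by_env = per_environment_accuracy(y_true, y_pred, envs)
summary = summarize(acc_by_env, iid_env="iid")
print(acc_by_env)
print(summary)
```

Reporting per-environment numbers rather than one pooled accuracy is what makes it possible to say which kinds of shift a model handles and which it does not; a single average would hide the weak environment.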
Abstract
Machine learning models, while progressively advanced, rely heavily on the IID assumption, which is often unfulfilled in practice due to inevitable distribution shifts. This renders them susceptible and untrustworthy for deployment in risk-sensitive applications. Such a significant problem has consequently spawned various branches of works dedicated to developing algorithms capable of Out-of-Distribution (OOD) generalization. Despite these efforts, much less attention has been paid to the evaluation of OOD generalization, which is also a complex and fundamental problem. Its goal is not only to assess whether a model's OOD generalization capability is strong or not, but also to evaluate where a model generalizes well or poorly. This entails characterizing the types of distribution shifts that a model can effectively address, and identifying the safe and risky input regions given a model.…
Taxonomy
Topics: Image and Signal Denoising Methods
