Unsupervised Domain Adaptation: A Reality Check
Kevin Musgrave, Serge Belongie, Ser-Nam Lim

TL;DR
This paper critically evaluates unsupervised domain adaptation (UDA). It finds that accuracy differences between algorithms are smaller than previously thought and that current validation methods correlate poorly with target-domain accuracy, which points to a need for better evaluation practices.
Contribution
The paper provides a large-scale empirical analysis showing that current validation methods are of limited effectiveness and that, in the oracle setting, accuracy differences among UDA algorithms are small.
Findings
In the oracle setting, accuracy differences among UDA algorithms are smaller than previously thought.
State-of-the-art validation methods are not well-correlated with target-domain accuracy.
The drop in accuracy caused by validation methods dwarfs the differences between UDA algorithms.
Abstract
Interest in unsupervised domain adaptation (UDA) has surged in recent years, resulting in a plethora of new algorithms. However, as is often the case in fast-moving fields, baseline algorithms are not tested to the extent that they should be. Furthermore, little attention has been paid to validation methods, i.e. the methods for estimating the accuracy of a model in the absence of target domain labels. This is despite the fact that validation methods are a crucial component of any UDA train/val pipeline. In this paper, we show via large-scale experimentation that 1) in the oracle setting, the difference in accuracy between UDA algorithms is smaller than previously thought, 2) state-of-the-art validation methods are not well-correlated with accuracy, and 3) differences between UDA algorithms are dwarfed by the drop in accuracy caused by validation methods.
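To make findings 2) and 3) concrete, here is a minimal sketch of how one might check a validation method against oracle accuracy. It assumes each model checkpoint has a validator score (computed without target labels) and a target-domain accuracy (available only in the oracle setting). The function names, the use of plain Spearman rank correlation, and the numbers are illustrative assumptions, not the paper's protocol or data.

```python
from typing import Sequence


def ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def selection_drop(scores: Sequence[float], accuracies: Sequence[float]) -> float:
    """Oracle-best accuracy minus the accuracy of the checkpoint
    that has the highest validator score."""
    chosen = max(range(len(scores)), key=lambda i: scores[i])
    return max(accuracies) - accuracies[chosen]


# Hypothetical checkpoints (made-up numbers, not results from the paper).
validator_scores = [0.10, 0.40, 0.35, 0.80, 0.60]
target_accuracy = [0.52, 0.61, 0.70, 0.58, 0.66]

print("rank correlation:", spearman(validator_scores, target_accuracy))
print("accuracy drop:", selection_drop(validator_scores, target_accuracy))
```

A low rank correlation means the validator orders checkpoints differently from their true accuracy, and the accuracy drop measures what is lost by trusting the validator's top pick instead of the oracle's.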
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Machine Learning and Data Classification
