Contrastive Identification of Covariate Shift in Image Data
Matthew L. Olson, Thuy-Vy Nguyen, Gaurav Dixit, Neale Ratzlaff, Weng-Keen Wong, and Minsuk Kahng

TL;DR
This paper presents a visual interface to help humans identify covariate shift in high-dimensional image datasets, demonstrating that a density ratio model with nearest-neighbor comparison is most effective.
Contribution
The paper introduces a novel visual interface for characterizing covariate shift in image data and evaluates different representations and workflows through a user study.
Findings
A density ratio model paired with a nearest-neighbor workflow best aids covariate shift detection.
Latent representations influence the effectiveness of covariate shift identification.
User study results guide design choices for visual analysis of high-dimensional data.
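The findings above refer to a density ratio model and a nearest-neighbor comparison of training and test data. As a rough illustration of the general idea only, and not the model or interface described in the paper, the sketch below estimates a local test-to-train density ratio by counting how many of a query point's nearest pooled neighbors come from each set. The function name `knn_density_ratio`, the add-one smoothing, the 2-D toy data, and the choice of `k` are all assumptions made for this example; real image data would first be mapped to a latent representation.

```python
import math
import random


def knn_density_ratio(train, test, query, k=15):
    """Rough local estimate of p_test(query) / p_train(query).

    Hypothetical helper for illustration: pools both sets, takes the k
    points nearest to `query`, and compares the fraction of each set that
    falls in that neighborhood.
    """
    pooled = [(math.dist(query, x), 0) for x in train]
    pooled += [(math.dist(query, x), 1) for x in test]
    pooled.sort(key=lambda pair: pair[0])
    neighbors = pooled[:k]

    n_test_near = sum(label for _, label in neighbors)
    n_train_near = len(neighbors) - n_test_near

    # Add-one smoothing (an assumption) keeps the ratio finite when one
    # set has no points in the neighborhood.
    test_frac = (n_test_near + 1) / len(test)
    train_frac = (n_train_near + 1) / len(train)
    return test_frac / train_frac


random.seed(0)

# Toy "training" data: a single cluster around the origin.
train = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(200)]

# Toy "test" data: half from the same cluster, half from a shifted cluster,
# mimicking covariate shift that is localized to one region.
test = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(100)]
test += [(random.gauss(3, 1), random.gauss(3, 1)) for _ in range(100)]

ratio_origin = knn_density_ratio(train, test, (0.0, 0.0))
ratio_shifted = knn_density_ratio(train, test, (3.0, 3.0))

print("ratio near shared cluster: ", round(ratio_origin, 2))
print("ratio near shifted cluster:", round(ratio_shifted, 2))
```

A ratio well above 1 flags a region where test data is over-represented relative to training data; inspecting the nearest training and test neighbors of such a point is one way a person could then judge what kind of shift is present.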
Abstract
Identifying covariate shift is crucial for making machine learning systems robust in the real world and for detecting training data biases that are not reflected in test data. However, detecting covariate shift is challenging, especially when the data consists of high-dimensional images, and when multiple types of localized covariate shift affect different subspaces of the data. Although automated techniques can be used to detect the existence of covariate shift, our goal is to help human users characterize the extent of covariate shift in large image datasets with interfaces that seamlessly integrate information obtained from the detection algorithms. In this paper, we design and evaluate a new visual interface that facilitates the comparison of the local distributions of training and test data. We conduct a quantitative user study on multi-attribute facial data to compare two…
