TL;DR
This paper ranks ready-to-use models in new, unlabeled target environments by evaluating them on a fully labeled proxy dataset that resembles the target domain, using the proxy rankings as surrogates and avoiding costly target annotations.
Contribution
It introduces a method to select or sample proxy datasets that preserve model performance rankings in new environments, validated on person re-identification tasks.
Findings
Proxy datasets effectively reflect true model rankings in new environments.
Proxy data selected or sampled to be more similar to the unlabeled target domain better preserves the models' relative performance.
The approach reduces the need for expensive annotations.
Abstract
Consider a scenario where we are supplied with a number of ready-to-use models trained on a certain source domain and hope to directly apply the most appropriate ones to different target domains based on the models' relative performance. Ideally we should annotate a validation set for model performance assessment on each new target environment, but such annotations are often very expensive. Under this circumstance, we introduce the problem of ranking models in unlabeled new environments. For this problem, we propose to adopt a proxy dataset that 1) is fully labeled and 2) well reflects the true model rankings in a given target environment, and use the performance rankings on the proxy sets as surrogates. We first select labeled datasets as the proxy. Specifically, datasets that are more similar to the unlabeled target domain are found to better preserve the relative performance…
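The surrogate-ranking idea in the abstract can be illustrated with a minimal sketch: rank the models by their scores on the labeled proxy set, then, when target labels are available for a sanity check, measure how well the proxy ranking agrees with the target ranking. Spearman rank correlation is used here only as one common agreement measure; this excerpt does not state which metric the paper uses. The function names, model names, and all scores below are hypothetical placeholders, not results from the paper.

```python
from typing import Dict, List


def average_ranks(scores: List[float]) -> List[float]:
    """Return ranks (1 = lowest score); tied scores share their mean rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    result = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x: List[float], y: List[float]) -> float:
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined when all scores are equal")
    return cov / (vx * vy) ** 0.5


def rank_models(scores: Dict[str, float]) -> List[str]:
    """Order model names from best to worst score."""
    return sorted(scores, key=lambda name: scores[name], reverse=True)


# Made-up scores for four hypothetical models (not from the paper).
proxy_scores = {"model_a": 0.62, "model_b": 0.71, "model_c": 0.55, "model_d": 0.68}
target_scores = {"model_a": 0.54, "model_b": 0.59, "model_c": 0.41, "model_d": 0.52}

names = sorted(proxy_scores)
proxy_ranking = rank_models(proxy_scores)
agreement = spearman(
    [proxy_scores[n] for n in names],
    [target_scores[n] for n in names],
)
print(proxy_ranking, round(agreement, 3))
```

In practice the target scores are exactly what is unavailable, so a check like this would only be run on held-out environments that do have labels, to judge how trustworthy a given proxy is before relying on it.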
