Single Domain Generalization for Few-Shot Counting via Universal Representation Matching
Xianing Chen, Si Huo, Borui Jiang, Hailin Hu, Xinghao Chen

TL;DR
This paper introduces URM, a novel few-shot counting model that leverages universal vision-language representations to enhance domain generalization, achieving state-of-the-art results in unseen scenarios.
Contribution
The paper presents the first single domain generalization approach for few-shot counting, utilizing universal representations to improve robustness across diverse domains.
Findings
URM outperforms existing methods on in-domain data.
URM demonstrates superior generalization to unseen domains.
Universal representations significantly boost domain robustness.
Abstract
Few-shot counting estimates the number of target objects in an image using only a few annotated exemplars. However, domain shift severely hinders existing methods from generalizing to unseen scenarios. This problem falls into the realm of single domain generalization, which remains unexplored in few-shot counting. To solve this problem, we begin by analyzing the main limitations of current methods, which typically follow a standard pipeline that extracts the object prototypes from exemplars and then matches them with image features to construct the correlation map. We argue that existing methods overlook the significance of learning highly generalized prototypes. Building on this insight, we propose the first single domain generalization few-shot counting model, Universal Representation Matching, termed URM. Our primary contribution is the discovery that incorporating universal vision-language…
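The standard pipeline mentioned in the abstract (prototype extraction followed by matching against image features) can be illustrated with a minimal sketch. This is not URM and not any specific published implementation: the function names `extract_prototype` and `correlation_map`, the average-pooling choice, the cosine-similarity matching, and the random feature map are all assumptions made for illustration only.

```python
import numpy as np

def extract_prototype(features, box):
    """Average-pool a (C, H, W) feature map inside an exemplar box.

    `box` is (x0, y0, x1, y1) in feature-map cells, end-exclusive.
    Average pooling is an assumed, simple choice of prototype extractor.
    """
    x0, y0, x1, y1 = box
    region = features[:, y0:y1, x0:x1]      # (C, h, w)
    return region.mean(axis=(1, 2))         # (C,)

def correlation_map(features, prototypes, eps=1e-8):
    """Cosine similarity between each prototype and every spatial location.

    Returns an array of shape (K, H, W) for K prototypes.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    flat = flat / (np.linalg.norm(flat, axis=0, keepdims=True) + eps)
    protos = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + eps)
    return (protos @ flat).reshape(len(prototypes), h, w)

# Hypothetical inputs: a random feature map stands in for backbone output.
rng = np.random.default_rng(0)
features = rng.normal(size=(16, 8, 8))
boxes = [(1, 1, 3, 3), (4, 5, 6, 7)]        # two assumed exemplar boxes
prototypes = np.stack([extract_prototype(features, b) for b in boxes])
corr = correlation_map(features, prototypes)
print(corr.shape)  # (2, 8, 8)
```

In a counting model, a map like `corr` would typically be passed to a decoder that regresses a density map whose sum is the predicted count; that stage is omitted here.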
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Video Surveillance and Tracking Methods · Human Pose and Action Recognition
