CountZES: Counting via Zero-Shot Exemplar Selection

Muhammad Ibraheem Siddiqui; Muhammad Haris Khan

arXiv:2512.16415 · cs.CV · February 4, 2026

Open Access

TL;DR

CountZES introduces a novel zero-shot object counting method that selects diverse, accurate exemplars through detection refinement, density-guided self-supervision, and feature clustering, outperforming existing approaches across datasets.

Contribution

The paper proposes CountZES, an inference-only zero-shot exemplar selection framework that enhances object counting accuracy by combining detection refinement, density-based exemplar discovery, and feature clustering.
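The contribution names feature clustering as one ingredient, but this page gives no algorithmic detail. The sketch below illustrates only the general idea of picking diverse exemplars by clustering candidate features. It is not the authors' method. The function name `select_diverse_exemplars`, the k-means procedure, and the farthest-point initialisation are all assumptions made for illustration.

```python
import numpy as np


def select_diverse_exemplars(features, k, n_iter=20):
    """Pick up to k candidate indices, one per k-means cluster.

    Hypothetical helper for illustration; not taken from the paper.
    `features` is an (N, D) array with one feature vector per candidate box.
    """
    feats = np.asarray(features, dtype=float)
    if feats.ndim != 2:
        raise ValueError("features must be a 2-D array of shape (N, D)")
    n = feats.shape[0]
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= number of candidates")

    # L2-normalise so Euclidean distance tracks cosine similarity.
    norms = np.linalg.norm(feats, axis=1, keepdims=True)
    feats = feats / np.clip(norms, 1e-12, None)

    # Farthest-point initialisation: deterministic, spreads the seeds apart.
    seeds = [0]
    dist = np.linalg.norm(feats - feats[0], axis=1)
    while len(seeds) < k:
        nxt = int(np.argmax(dist))
        seeds.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(feats - feats[nxt], axis=1))
    centroids = feats[seeds].copy()

    # Plain Lloyd iterations (assumed here; any clustering method would do).
    for _ in range(n_iter):
        d = np.linalg.norm(feats[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            members = feats[labels == c]
            if len(members):
                centroids[c] = members.mean(axis=0)

    # The exemplar for each cluster is the candidate closest to its centroid.
    d = np.linalg.norm(feats[:, None, :] - centroids[None, :, :], axis=2)
    # A set removes duplicates, so fewer than k indices may be returned.
    return sorted({int(i) for i in d.argmin(axis=0)})
```

Choosing the candidate nearest each centroid, rather than the centroid itself, keeps every exemplar tied to an actual image region. Running the function on features from three visually distinct groups of candidates should return one index from each group.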

Findings

01. Outperforms existing zero-shot counting methods on multiple datasets.

02. Generalizes effectively across different domains.

03. Achieves superior counting accuracy with diverse exemplar sets.

Abstract

Object counting in complex scenes is particularly challenging in the zero-shot (ZS) setting, where instances of unseen categories are counted using only a class name. Existing ZS counting methods that infer exemplars from text often rely on off-the-shelf open-vocabulary detectors (OVDs), which in dense scenes suffer from semantic noise, appearance variability, and frequent multi-instance proposals. Alternatively, random image-patch sampling is employed, which fails to accurately delineate object instances. To address these issues, we propose CountZES, an inference-only approach for object counting via ZS exemplar selection. CountZES discovers diverse exemplars through three synergistic stages: Detection-Anchored Exemplar (DAE), Density-Guided Exemplar (DGE), and Feature-Consensus Exemplar (FCE). DAE refines OVD detections to isolate precise single-instance exemplars. DGE introduces a…

Taxonomy

Topics: Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning