CoDi -- an exemplar-conditioned diffusion model for low-shot counting

Grega Šuštar; Jer Pelhan; Alan Lukežič; Matej Kristan

arXiv:2512.20153 · cs.CV · December 24, 2025

Open Access

TL;DR

CoDi is an exemplar-conditioned latent diffusion model for low-shot object counting. It improves both localization and counting accuracy over existing methods, particularly in dense scenes with small objects.

Contribution

CoDi is the first latent diffusion-based low-shot counter. Its exemplar conditioning module improves object localization and counting performance.

Findings

01 Outperforms state-of-the-art by 15% MAE on FSC benchmark

02 Sets new SOTA on MCAC benchmark with 44% MAE improvement

03 Achieves high-quality density maps enabling accurate object localization

Abstract

Low-shot object counting addresses estimating the number of previously unobserved objects in an image using only a few or no annotated test-time exemplars. Dense regions with small objects are a considerable challenge for modern low-shot counters. While total counts in such situations are typically well addressed by density-based counters, their usefulness is limited by poor localization capabilities. Localization is better addressed by point-detection-based counters, which build on query-based detectors. However, due to the limited number of pre-trained queries, they underperform on images with very large numbers of objects and resort to ad-hoc techniques such as upsampling and tiling. We propose CoDi, the first latent diffusion-based low-shot counter that produces high-quality density maps on which object locations can be determined by non-maxima suppression. Our core contribution is the new…
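The abstract states that object locations are read off the predicted density map by non-maxima suppression. The sketch below illustrates that generic post-processing step on a synthetic density map; it is not the authors' implementation. The function names (`toy_density`, `density_count`, `peak_nms`), the 3x3 window, and the 0.1 threshold are all assumptions chosen for illustration, and the toy map stands in for a model output.

```python
import numpy as np


def toy_density(shape, centers, sigma=1.0):
    """Hypothetical stand-in for a predicted density map:
    one unit-mass Gaussian blob per object center."""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    density = np.zeros(shape, dtype=float)
    for cy, cx in centers:
        blob = np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma**2))
        density += blob / blob.sum()  # each object contributes mass 1
    return density


def density_count(density):
    """The estimated count is the sum (integral) of the density map."""
    return float(np.sum(density))


def peak_nms(density, window=3, threshold=0.1):
    """Keep pixels that are the maximum of their window x window
    neighborhood and exceed `threshold` (both values are assumptions).
    Flat plateaus would yield several peaks; this sketch ignores that."""
    if window % 2 != 1:
        raise ValueError("window must be odd")
    density = np.asarray(density, dtype=float)
    radius = window // 2
    height, width = density.shape
    padded = np.pad(density, radius, mode="constant",
                    constant_values=-np.inf)
    # Running maximum over all shifted views of the padded map.
    local_max = np.full(density.shape, -np.inf)
    for dy in range(window):
        for dx in range(window):
            shifted = padded[dy:dy + height, dx:dx + width]
            local_max = np.maximum(local_max, shifted)
    keep = (density >= local_max) & (density > threshold)
    return np.argwhere(keep)  # array of (row, col) locations


centers = [(5, 5), (5, 14), (14, 9)]
density = toy_density((20, 20), centers)
peaks = peak_nms(density)
print(round(density_count(density)), peaks.tolist())
```

On this toy map the summed density recovers the number of blobs and the surviving peaks coincide with the blob centers. On real predictions the window size and threshold would need tuning to object scale and map sharpness.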

Taxonomy

Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Generative Adversarial Networks and Image Synthesis