Direct Measure Matching for Crowd Counting
Hui Lin, Xiaopeng Hong, Zhiheng Ma, Xing Wei, Yunfeng Qiu, Yaowei Wang, Yihong Gong

TL;DR
This paper introduces a novel measure-based crowd counting method that directly matches predicted density maps to scattered point annotations using a Sinkhorn divergence, improving accuracy and robustness.
Contribution
It proposes a measure matching framework with a Sinkhorn divergence loss and a self-supervised scale consistency mechanism for crowd counting.
Findings
Outperforms existing methods on four challenging datasets: ShanghaiTech, UCF-QNRF, JHU++, and NWPU.
Effectively handles scale variations with the Sinkhorn scale consistency loss.
Demonstrates robustness and improved accuracy in crowd density estimation.
Abstract
Traditional crowd counting approaches usually use a Gaussian assumption to generate pseudo density ground truth, which suffers from problems like inaccurate estimation of the Gaussian kernel sizes. In this paper, we propose a new measure-based counting approach to regress the predicted density maps to the scattered point-annotated ground truth directly. First, crowd counting is formulated as a measure matching problem. Second, we derive a semi-balanced form of Sinkhorn divergence, based on which a Sinkhorn counting loss is designed for measure matching. Third, we propose a self-supervised mechanism by devising a Sinkhorn scale consistency loss to resist scale changes. Finally, an efficient optimization method is provided to minimize the overall loss function. Extensive experiments on four challenging crowd counting datasets, namely ShanghaiTech, UCF-QNRF, JHU++, and NWPU, have validated the…
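To make the measure-matching idea concrete, here is a minimal NumPy sketch that treats a predicted density map and a set of annotated points as two discrete measures and compares them with entropic optimal transport computed by log-domain Sinkhorn iterations. It is an illustration under stated assumptions, not the paper's loss: it uses the standard balanced formulation with both measures normalized to unit mass, whereas the abstract describes a semi-balanced Sinkhorn divergence whose exact form is not given here. The function name `entropic_ot_cost`, the parameters `eps` and `n_iters`, and the squared Euclidean cost on normalized coordinates are all hypothetical choices.

```python
import numpy as np


def _logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def entropic_ot_cost(density, points, eps=0.05, n_iters=200):
    """Balanced entropic OT between a density map and point annotations.

    density: (H, W) non-negative array with positive total mass.
    points:  (N, 2) array of (row, col) pixel coordinates.
    Returns (transport_cost, plan). Assumption: both measures are
    normalized to sum to 1, so count information is discarded here.
    """
    h, w = density.shape
    scale = np.array([max(h - 1, 1), max(w - 1, 1)], dtype=float)

    # Pixel locations and annotation locations, normalized to [0, 1].
    rows, cols = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    pix = np.stack([rows.ravel(), cols.ravel()], axis=1) / scale
    pts = np.asarray(points, dtype=float) / scale

    # Source measure: normalized density on pixels with non-zero mass.
    a = density.ravel().astype(float)
    keep = a > 0
    a, pix = a[keep] / a[keep].sum(), pix[keep]
    # Target measure: equal mass on every annotated point.
    b = np.full(len(pts), 1.0 / len(pts))

    # Squared Euclidean ground cost between pixels and points.
    cost = ((pix[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1)

    log_a, log_b = np.log(a), np.log(b)
    f = np.zeros(len(a))
    g = np.zeros(len(b))
    for _ in range(n_iters):
        # Alternating updates of the dual potentials.
        f = -eps * _logsumexp((g[None, :] - cost) / eps + log_b[None, :], axis=1)
        g = -eps * _logsumexp((f[:, None] - cost) / eps + log_a[:, None], axis=0)

    # Transport plan recovered from the potentials.
    plan = np.exp((f[:, None] + g[None, :] - cost) / eps
                  + log_a[:, None] + log_b[None, :])
    return float((plan * cost).sum()), plan
```

A density map whose mass sits on the annotated heads should yield a lower cost than one whose mass is displaced, which is the signal a counting loss of this kind would backpropagate. Because the sketch normalizes both measures, a real counting loss would need an additional term or an unbalanced formulation to penalize errors in the total count.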
Taxonomy
Topics: Video Surveillance and Tracking Methods · Anomaly Detection Techniques and Applications · Human Mobility and Location-Based Analysis
