Crowd counting via scale-adaptive convolutional neural network
Lu Zhang, Miaojing Shi, Qiaobo Chen

TL;DR
This paper introduces SaCNN, a scale-adaptive CNN for crowd counting. It handles scale variation by combining feature maps from multiple layers of a single backbone, and it adds a relative count loss that improves accuracy in scenes with few pedestrians. The authors report that it outperforms existing methods.
Contribution
The paper proposes a scale-adaptive CNN built on a backbone of fixed small receptive fields, trained with a relative count loss alongside the density map loss, to improve crowd counting accuracy across diverse scenes.
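To illustrate why a relative count loss helps in sparse scenes, here is a minimal sketch. The exact form is not given in this summary, so the formula below (squared count error divided by the ground-truth count plus one, averaged over the batch with a factor of one half) is an assumption, and the function name `relative_count_loss` is hypothetical.

```python
def relative_count_loss(predicted_counts, true_counts):
    """Half the mean of squared count errors, each normalised by (true count + 1).

    Assumed form for illustration only; the +1 keeps the denominator
    non-zero for images that contain no pedestrians.
    """
    if len(predicted_counts) != len(true_counts):
        raise ValueError("predicted_counts and true_counts must have the same length")
    if not true_counts:
        raise ValueError("at least one sample is required")

    total = 0.0
    for predicted, true in zip(predicted_counts, true_counts):
        relative_error = (predicted - true) / (true + 1.0)
        total += relative_error ** 2
    return total / (2.0 * len(true_counts))


# The same absolute error of 2 people is penalised far more in a sparse scene.
sparse_loss = relative_count_loss([7.0], [5.0])      # 5 people, predicted 7
dense_loss = relative_count_loss([502.0], [500.0])   # 500 people, predicted 502
```

Under this assumed form, an error of two people in a five-person scene produces a much larger loss than the same error in a five-hundred-person scene, which is the behaviour one would want when sparse scenes are otherwise under-weighted.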
Findings
SaCNN outperforms state-of-the-art methods on multiple datasets.
The relative count loss improves performance in scenes with few pedestrians.
Extensive experiments validate the effectiveness of the proposed approach.
Abstract
The task of crowd counting is to automatically estimate the pedestrian number in crowd images. To cope with the scale and perspective changes that commonly exist in crowd images, state-of-the-art approaches employ multi-column CNN architectures to regress density maps of crowd images. Multiple columns have different receptive fields corresponding to pedestrians (heads) of different scales. We instead propose a scale-adaptive CNN (SaCNN) architecture with a backbone of fixed small receptive fields. We extract feature maps from multiple layers and adapt them to have the same output size; we combine them to produce the final density map. The number of people is computed by integrating the density map. We also introduce a relative count loss along with the density map loss to improve the network generalization on crowd scenes with few pedestrians, where most representative approaches…
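The abstract's statement that the number of people is obtained by integrating the density map can be sketched as follows. This is a toy illustration in plain Python, not the authors' pipeline: the ground-truth map is built by placing a normalised Gaussian at each annotated head, and the kernel size, sigma, and function names are all assumptions chosen for the example.

```python
import math


def gaussian_kernel(size, sigma):
    """Return a size x size Gaussian kernel whose entries sum to 1 (size must be odd)."""
    half = size // 2
    kernel = [
        [math.exp(-((x - half) ** 2 + (y - half) ** 2) / (2.0 * sigma ** 2))
         for x in range(size)]
        for y in range(size)
    ]
    total = sum(sum(row) for row in kernel)
    return [[value / total for value in row] for row in kernel]


def density_map(height, width, heads, size=5, sigma=1.0):
    """Build a density map from (row, col) head positions inside the image.

    Kernels clipped by the image border are renormalised so that every
    head still contributes a total mass of exactly 1.
    """
    kernel = gaussian_kernel(size, sigma)
    half = size // 2
    dmap = [[0.0] * width for _ in range(height)]
    for row, col in heads:
        cells = []
        for dy in range(size):
            for dx in range(size):
                r, c = row + dy - half, col + dx - half
                if 0 <= r < height and 0 <= c < width:
                    cells.append((r, c, kernel[dy][dx]))
        mass = sum(weight for _, _, weight in cells)
        for r, c, weight in cells:
            dmap[r][c] += weight / mass
    return dmap


def count_from_density(dmap):
    """Integrate (sum) the density map to obtain the estimated head count."""
    return sum(sum(row) for row in dmap)


heads = [(2, 3), (10, 10), (0, 0)]   # hypothetical annotations, one at a corner
dmap = density_map(16, 16, heads)
estimated_count = count_from_density(dmap)
```

Because each head contributes unit mass, summing the map recovers the number of annotated heads; a network trained to regress such maps yields a count in the same way.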
Taxonomy
Topics: Video Surveillance and Tracking Methods · Anomaly Detection Techniques and Applications · Human Pose and Action Recognition
