Stacked Pooling: Improving Crowd Counting by Boosting Scale Invariance
Siyu Huang, Xi Li, Zhi-Qi Cheng, Zhongfei Zhang, Alexander Hauptmann

TL;DR
This paper introduces stacked pooling, a simple yet effective method to enhance scale invariance in CNNs for crowd counting, leading to improved accuracy without extra parameters.
Contribution
It proposes stacked pooling as a computationally efficient alternative to multi-kernel pooling to boost scale invariance in crowd counting models.
Findings
Stacked pooling outperforms vanilla pooling in most benchmark tests.
The method improves crowd density estimation accuracy.
No additional parameters are introduced by the pooling modules.
Abstract
In this work, we explore the cross-scale similarity in the crowd counting scenario, in which regions of different scales often exhibit high visual similarity. This property is universal both within an image and across different images, indicating the importance of scale invariance for a crowd counting model. Motivated by this, we propose simple but effective variants of the pooling module, i.e., multi-kernel pooling and stacked pooling, to boost the scale invariance of convolutional neural networks (CNNs), which greatly benefits crowd density estimation and counting. Specifically, multi-kernel pooling comprises pooling kernels with multiple receptive fields to capture the responses at multi-scale local ranges. Stacked pooling is an equivalent form of multi-kernel pooling that considerably reduces the computing cost. Our proposed pooling modules do not introduce extra…
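The relationship between the two pooling variants can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The kernel sizes (2, 4, 8 for multi-kernel pooling; 2, 2, 3 with strides 2, 1, 1 for stacked pooling), the use of max pooling, the averaging of the branch outputs, and the border handling (windows anchored at the top-left and clipped at the edge) are all assumptions chosen for illustration; the function names are hypothetical.

```python
def max_pool2d(x, k, stride):
    """Max-pool a 2D list of numbers.

    Assumption: each window starts at (i*stride, j*stride) and is clipped
    at the bottom/right border, so the output size is ceil(size / stride).
    """
    h, w = len(x), len(x[0])
    out_h = -(-h // stride)  # ceiling division
    out_w = -(-w // stride)
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            r0, c0 = i * stride, j * stride
            window = [x[r][c]
                      for r in range(r0, min(r0 + k, h))
                      for c in range(c0, min(c0 + k, w))]
            row.append(max(window))
        out.append(row)
    return out


def average(maps):
    """Element-wise mean of equally sized 2D maps."""
    n = len(maps)
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(m[i][j] for m in maps) / n for j in range(cols)]
            for i in range(rows)]


def multi_kernel_pool(x, kernels=(2, 4, 8), stride=2):
    """Pool the input with several kernel sizes in parallel, then average.

    Kernel sizes are illustrative assumptions.
    """
    return average([max_pool2d(x, k, stride) for k in kernels])


def stacked_pool(x, kernels=(2, 2, 3), strides=(2, 1, 1)):
    """Pool sequentially with small kernels and average every stage's output.

    Only the first stage downsamples; the later stages use stride 1, so
    each stage widens the effective receptive field on the input
    (2, then 4, then 8 with these assumed settings).
    """
    maps, current = [], x
    for k, s in zip(kernels, strides):
        current = max_pool2d(current, k, s)
        maps.append(current)
    return average(maps)


# A small deterministic test grid (values are arbitrary).
grid = [[(7 * r + 3 * c) % 11 for c in range(8)] for r in range(8)]
same = multi_kernel_pool(grid) == stacked_pool(grid)
```

Under these assumptions the two functions return identical maps, because a max over a stack of small windows equals a max over the larger window they jointly cover. The stacked version reads 4 + 4 + 9 = 17 values per output position instead of 4 + 16 + 64 = 84, which is one way to see where the saving in computing cost comes from. Neither function has any learnable parameters.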
Taxonomy
Topics: Evacuation and Crowd Dynamics · Video Surveillance and Tracking Methods · Anomaly Detection Techniques and Applications
