Crowd Counting using Deep Recurrent Spatial-Aware Network
Lingbo Liu, Hongjun Wang, Guanbin Li, Wanli Ouyang, Liang Lin

TL;DR
This paper introduces a Deep Recurrent Spatial-Aware Network that adaptively handles scale and rotation variations in crowd images, significantly improving counting accuracy over existing methods.
Contribution
The proposed framework uniquely combines a recurrent spatial transformer and local refinement for adaptive crowd density estimation in unconstrained scenes.
Findings
Achieved a 12% improvement on the WorldExpo'10 dataset
Achieved a 22.8% improvement on the UCF_CC_50 dataset
Outperformed existing state-of-the-art methods
Abstract
Crowd counting from unconstrained scene images is a crucial task in many real-world applications like urban surveillance and management, but it is greatly challenged by the camera's perspective that causes huge appearance variations in people's scales and rotations. Conventional methods address such challenges by resorting to fixed multi-scale architectures that are often unable to cover the largely varied scales while ignoring the rotation variations. In this paper, we propose a unified neural network framework, named Deep Recurrent Spatial-Aware Network, which adaptively addresses the two issues in a learnable spatial transform module with a region-wise refinement process. Specifically, our framework incorporates a Recurrent Spatial-Aware Refinement (RSAR) module iteratively conducting two components: i) a Spatial Transformer Network that dynamically locates an attentional region from…
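The abstract describes a Spatial Transformer Network that locates an attentional region so that scale and rotation can be handled adaptively. The snippet below is a minimal NumPy sketch of the generic affine sampling step behind a spatial transformer, not the authors' implementation. The function names `affine_theta` and `sample_region`, and the parameterization by a single scale, rotation angle, and translation, are assumptions made for illustration. In a trained network these parameters would be predicted by a learned layer; here they are passed in by hand.

```python
import numpy as np

def affine_theta(scale, angle, tx, ty):
    """Build a 2x3 affine matrix (hypothetical helper).

    It maps output-grid coordinates to input coordinates, both
    normalized to [-1, 1], combining scale, rotation and translation.
    """
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[scale * c, -scale * s, tx],
                     [scale * s,  scale * c, ty]])

def sample_region(image, theta, out_h, out_w):
    """Bilinearly sample an out_h x out_w region of a 2-D array."""
    h, w = image.shape
    # Regular grid over the output region in normalized coordinates.
    ys, xs = np.meshgrid(np.linspace(-1, 1, out_h),
                         np.linspace(-1, 1, out_w), indexing="ij")
    grid = np.stack([xs.ravel(), ys.ravel(), np.ones(out_h * out_w)])
    src = theta @ grid  # 2 x N source coordinates, still normalized

    # Convert to pixel coordinates and clamp to the image border.
    x = np.clip((src[0] + 1) * (w - 1) / 2, 0, w - 1)
    y = np.clip((src[1] + 1) * (h - 1) / 2, 0, h - 1)

    # Bilinear interpolation between the four neighbouring pixels.
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    wx, wy = x - x0, y - y0
    out = (image[y0, x0] * (1 - wx) * (1 - wy)
           + image[y0, x1] * wx * (1 - wy)
           + image[y1, x0] * (1 - wx) * wy
           + image[y1, x1] * wx * wy)
    return out.reshape(out_h, out_w)

# Usage: zoom into the central half of a toy 4x4 "density map".
density = np.arange(16.0).reshape(4, 4)
crop = sample_region(density, affine_theta(0.5, 0.0, 0.0, 0.0), 2, 2)
```

A scale below 1 zooms into a smaller region, a non-zero angle rotates the sampling grid, and the translation moves its centre. Because the sampling is differentiable with respect to the affine parameters, a network can learn where to look by gradient descent.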
Taxonomy
Topics: Video Surveillance and Tracking Methods · Anomaly Detection Techniques and Applications · Human Pose and Action Recognition
