Generating High-Quality Crowd Density Maps using Contextual Pyramid CNNs
Vishwanath A. Sindagi, Vishal M. Patel

TL;DR
This paper introduces CP-CNN, a novel deep learning architecture that combines global and local contextual information to generate high-quality crowd density maps and improve counting accuracy.
Contribution
The paper proposes a multi-module CNN architecture (CP-CNN) that explicitly incorporates global and local context for crowd density estimation; the authors report improved counting accuracy and density-map quality over prior state-of-the-art methods.
Findings
Significant improvement over state-of-the-art methods
High-resolution, high-quality density maps generated
Effective end-to-end training with adversarial and Euclidean losses
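The last finding mentions training with both an adversarial and a Euclidean loss. The following is a minimal sketch of how two such terms could be combined into one objective, in plain Python. The function names, the weight `lambda_adv`, its value, and the `-log D(G(x))` form of the adversarial term are assumptions made for illustration; the summary above does not specify them.

```python
import math

def euclidean_loss(pred, target):
    """Mean squared pixel-wise difference between two equally sized 2D maps."""
    total, n = 0.0, 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            total += (p - t) ** 2
            n += 1
    return total / n

def adversarial_loss(disc_score, eps=1e-12):
    """Generator-side term -log D(G(x)); disc_score is the discriminator's
    probability that the generated density map is real (assumed form)."""
    return -math.log(max(disc_score, eps))

def combined_loss(pred, target, disc_score, lambda_adv=0.01):
    """Euclidean term plus a weighted adversarial term (lambda_adv is hypothetical)."""
    return euclidean_loss(pred, target) + lambda_adv * adversarial_loss(disc_score)

# Toy 2x2 "density maps" and a made-up discriminator score.
pred = [[0.0, 0.5], [1.0, 0.5]]
target = [[0.0, 1.0], [1.0, 0.0]]
loss = combined_loss(pred, target, disc_score=0.5, lambda_adv=0.01)
print(loss)
```

In a sketch like this, the Euclidean term keeps the predicted map close to the ground truth pixel by pixel, while the adversarial term is intended to discourage blurry outputs; the balance between the two is a tuning choice.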
Abstract
We present a novel method called Contextual Pyramid CNN (CP-CNN) for generating high-quality crowd density and count estimation by explicitly incorporating global and local contextual information of crowd images. The proposed CP-CNN consists of four modules: Global Context Estimator (GCE), Local Context Estimator (LCE), Density Map Estimator (DME) and a Fusion-CNN (F-CNN). GCE is a VGG-16 based CNN that encodes global context and it is trained to classify input images into different density classes, whereas LCE is another CNN that encodes local context information and it is trained to perform patch-wise classification of input images into different density classes. DME is a multi-column architecture-based CNN that aims to generate high-dimensional feature maps from the input image which are fused with the contextual information estimated by GCE and LCE using F-CNN. To generate high…
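As an illustration of the data flow the abstract describes, the sketch below shows one way global and local context could be stacked with DME feature maps before a fusion network, and how a count is read off a density map by summing it. It is plain Python on nested lists, not the authors' implementation: the channel counts, the number of density classes, the tiling of global class probabilities into constant maps, and all function names are assumptions.

```python
def broadcast_global(class_probs, height, width):
    """Tile each global class probability into a constant height x width map."""
    return [[[p] * width for _ in range(height)] for p in class_probs]

def concat_channels(*feature_groups):
    """Concatenate groups of 2D maps along the channel axis."""
    fused = []
    for group in feature_groups:
        fused.extend(group)
    return fused

def crowd_count(density_map):
    """A density map integrates to the crowd count, so the count is its sum."""
    return sum(sum(row) for row in density_map)

H, W = 2, 3  # toy spatial size

# Hypothetical inputs: 4 DME feature channels, 3 density classes.
dme_features = [[[0.1] * W for _ in range(H)] for _ in range(4)]
global_ctx = broadcast_global([0.7, 0.2, 0.1], H, W)          # image-level class scores
local_ctx = [[[0.5] * W for _ in range(H)] for _ in range(3)]  # per-location class scores

# Input that a fusion network (F-CNN in the abstract) would consume.
fused = concat_channels(dme_features, global_ctx, local_ctx)
print(len(fused), "channels of size", H, "x", W)

# Reading a count off a (made-up) output density map.
density = [[0.5, 1.0, 0.0], [0.25, 0.25, 1.0]]
print(crowd_count(density))
```

Stacking along the channel axis lets every spatial location of the fusion input see the DME features together with both the image-level and the patch-level density class estimates.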
