CrowdNet: A Deep Convolutional Network for Dense Crowd Counting
Lokesh Boominathan, Srinivas S S Kruthiventi, R. Venkatesh Babu

TL;DR
This paper introduces CrowdNet, which combines a deep and a shallow fully convolutional network to estimate crowd density maps from static images of highly dense crowds. Multi-scale data augmentation is used to cope with large scale variations and with the small size of crowd datasets.
Contribution
CrowdNet is a deep learning framework for dense crowd counting that pairs a combination of deep and shallow fully convolutional networks with multi-scale data augmentation.
Findings
Outperforms state-of-the-art methods on the UCF_CC_50 dataset
Captures both high-level semantic information (face/body detectors) and low-level features (blob detectors) for crowd density estimation
Uses multi-scale data augmentation to handle limited training data
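The multi-scale augmentation idea can be illustrated with a small sketch that plans which patches to crop from an image pyramid. This is a minimal sketch under stated assumptions, not the authors' implementation: the scale range (0.5 to 1.2 in steps of 0.1), the 225-pixel patch size, the 50% overlap, and all function names are illustrative choices that are not given in the text above. The code only computes patch coordinates; resizing and cropping real images is left out.

```python
def pyramid_scales(start=0.5, stop=1.2, step=0.1):
    """Return the list of pyramid scales (assumed values, for illustration)."""
    n = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 2) for i in range(n)]


def patch_origins(width, height, patch=225, overlap=0.5):
    """Top-left corners of patches that fit fully inside a width x height image."""
    stride = max(1, int(patch * (1 - overlap)))
    xs = range(0, width - patch + 1, stride)
    ys = range(0, height - patch + 1, stride)
    return [(x, y) for y in ys for x in xs]


def multiscale_patch_plan(width, height, scales=None, patch=225, overlap=0.5):
    """List (scale, x, y) for every patch cropped from every pyramid level."""
    scales = scales if scales is not None else pyramid_scales()
    plan = []
    for s in scales:
        # Size of the image after rescaling by s.
        w, h = int(round(width * s)), int(round(height * s))
        for (x, y) in patch_origins(w, h, patch, overlap):
            plan.append((s, x, y))
    return plan
```

Because the patch size is fixed while the image is rescaled, the same crowd appears at several apparent head sizes across the training patches, which is the intuition behind using this kind of augmentation to encourage scale-invariant representations.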
Abstract
Our work proposes a novel deep learning framework for estimating crowd density from static images of highly dense crowds. We use a combination of deep and shallow fully convolutional networks to predict the density map for a given crowd image. This combination effectively captures both the high-level semantic information (face/body detectors) and the low-level features (blob detectors) that are necessary for crowd counting under large scale variations. As most crowd datasets have limited training samples (<100 images) and deep-learning-based approaches require large amounts of training data, we perform multi-scale data augmentation. Augmenting the training samples in this manner helps guide the CNN to learn scale-invariant representations. Our method is tested on the challenging UCF_CC_50 dataset and is shown to outperform state-of-the-art methods.
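To make the notion of a density map concrete, the sketch below builds one from head annotations and recovers the count by summing it. It is a hedged illustration of a common convention in density-based crowd counting, not code from the paper: each annotated head is spread over nearby pixels with a Gaussian kernel that is normalized to sum to one, so the map integrates to the number of people. The kernel width `sigma`, the truncation `radius`, and the function names are assumptions made for this example.

```python
import math


def density_map(points, width, height, sigma=2.0, radius=6):
    """Build a height x width density map from (x, y) head annotations.

    Each head contributes a truncated Gaussian that is renormalized over the
    in-bounds pixels, so every head adds exactly 1.0 to the map's total.
    """
    dmap = [[0.0] * width for _ in range(height)]
    for (px, py) in points:
        cells = []
        for y in range(max(0, py - radius), min(height, py + radius + 1)):
            for x in range(max(0, px - radius), min(width, px + radius + 1)):
                w = math.exp(-((x - px) ** 2 + (y - py) ** 2) / (2 * sigma ** 2))
                cells.append((x, y, w))
        total = sum(w for _, _, w in cells)
        for x, y, w in cells:
            dmap[y][x] += w / total
    return dmap


def count_from_density(dmap):
    """The estimated crowd count is the sum of all density values."""
    return sum(sum(row) for row in dmap)


# Usage: three annotated heads on a 20 x 15 image, two of them near the border.
heads = [(5, 5), (0, 0), (19, 10)]
estimated = count_from_density(density_map(heads, width=20, height=15))
```

A network trained to regress such maps yields a count for a new image in the same way, by summing its predicted density map.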
Taxonomy
Topics: Video Surveillance and Tracking Methods · Anomaly Detection Techniques and Applications · Human Pose and Action Recognition
