Spatiotemporal Modeling for Crowd Counting in Videos
Feng Xiong, Xingjian Shi, Dit-Yan Yeung

TL;DR
This paper introduces a bidirectional ConvLSTM model for crowd counting in videos, effectively capturing spatial and temporal dependencies to improve accuracy, and demonstrates successful transfer learning across datasets.
Contribution
The paper extends existing CNN-based crowd counting methods with a bidirectional ConvLSTM that exploits temporal information across frames, and it shows that the resulting model supports effective transfer learning.
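For orientation, the gate equations of the original ConvLSTM formulation are reproduced below. This is the commonly cited form, not necessarily the variant proposed in this paper, whose details this summary does not give. Here $*$ denotes convolution, $\circ$ the elementwise (Hadamard) product, $\sigma$ the sigmoid function, $X_t$ the input frame features, and $H_t$ and $C_t$ the hidden and cell states; the weight and bias symbols are the usual notation and are assumptions with respect to this page.

```latex
\begin{aligned}
i_t &= \sigma(W_{xi} * X_t + W_{hi} * H_{t-1} + W_{ci} \circ C_{t-1} + b_i) \\
f_t &= \sigma(W_{xf} * X_t + W_{hf} * H_{t-1} + W_{cf} \circ C_{t-1} + b_f) \\
C_t &= f_t \circ C_{t-1} + i_t \circ \tanh(W_{xc} * X_t + W_{hc} * H_{t-1} + b_c) \\
o_t &= \sigma(W_{xo} * X_t + W_{ho} * H_{t-1} + W_{co} \circ C_t + b_o) \\
H_t &= o_t \circ \tanh(C_t)
\end{aligned}
```

Because every state is a spatial feature map and every transition is a convolution, the recurrence keeps the 2D layout of the frame while carrying information from one time step to the next.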
Findings
Bidirectional ConvLSTM improves crowd counting accuracy.
Temporal information significantly boosts model performance.
Transfer learning allows quick adaptation to new datasets.
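The bidirectional idea can be illustrated with a toy sketch: run the same recurrence once forward and once backward over the frames, then merge the two states for each frame. Everything here is hypothetical and chosen for illustration: `toy_step` is a scalar moving average standing in for a ConvLSTM cell, and averaging the two directions is an assumption, not necessarily how the paper combines them.

```python
def run_direction(frames, step, initial_state):
    """Apply a recurrent step over frames in the given order, keeping every state."""
    state = initial_state
    states = []
    for frame in frames:
        state = step(frame, state)
        states.append(state)
    return states


def bidirectional_pass(frames, step, initial_state, combine):
    """Merge a forward pass and a backward pass so each frame sees both directions."""
    forward = run_direction(frames, step, initial_state)
    # Run over the reversed sequence, then flip back to align with frame order.
    backward = run_direction(frames[::-1], step, initial_state)[::-1]
    return [combine(f, b) for f, b in zip(forward, backward)]


def toy_step(frame, state):
    """Hypothetical stand-in for a ConvLSTM cell: a scalar moving average."""
    return 0.5 * state + 0.5 * frame


def average(f, b):
    """Assumed merge rule for the two directions (illustrative only)."""
    return (f + b) / 2


frames = [0.0, 4.0, 0.0]  # one scalar "feature" per video frame
forward_only = run_direction(frames, toy_step, 0.0)
outputs = bidirectional_pass(frames, toy_step, 0.0, average)
print(forward_only)  # [0.0, 2.0, 1.0]
print(outputs)       # [0.5, 2.0, 0.5]
```

In the forward-only pass the first frame's state is 0.0 because it cannot see the activity in the second frame; in the bidirectional pass it becomes 0.5 because the backward direction carries that information back.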
Abstract
Region of Interest (ROI) crowd counting can be formulated as a regression problem of learning a mapping from an image or a video frame to a crowd density map. Recently, convolutional neural network (CNN) models have achieved promising results for crowd counting. However, even when dealing with video data, CNN-based methods still consider each video frame independently, ignoring the strong temporal correlation between neighboring frames. To exploit the otherwise very useful temporal information in video sequences, we propose a variant of a recent deep learning model called convolutional LSTM (ConvLSTM) for crowd counting. Unlike the previous CNN-based methods, our method fully captures both spatial and temporal dependencies. Furthermore, we extend the ConvLSTM model to a bidirectional ConvLSTM model which can access long-range information in both directions. Extensive experiments using…
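As a minimal sketch of the density-map formulation described in the abstract, the crowd count for a frame can be read off a predicted density map by summing its values inside the ROI. The function name `roi_count`, the nested-list representation, and the binary mask are assumptions made for illustration; the paper's own pipeline is not shown in this summary.

```python
def roi_count(density_map, roi_mask):
    """Estimate the crowd count by summing density values inside the ROI.

    density_map: 2D list of non-negative floats (people per pixel).
    roi_mask:    2D list of 0/1 flags with the same shape; 1 marks ROI pixels.
    """
    if len(density_map) != len(roi_mask):
        raise ValueError("density map and ROI mask must have the same height")
    total = 0.0
    for density_row, mask_row in zip(density_map, roi_mask):
        if len(density_row) != len(mask_row):
            raise ValueError("density map and ROI mask must have the same width")
        for density, inside in zip(density_row, mask_row):
            if inside:
                total += density
    return total


# Toy 3x3 density map: one person spread over the top-right 2x2 block,
# and one person spread over two pixels in the bottom row.
density = [
    [0.0, 0.25, 0.25],
    [0.0, 0.25, 0.25],
    [0.5, 0.5, 0.0],
]
# The ROI covers only the top-right 2x2 block.
mask = [
    [0, 1, 1],
    [0, 1, 1],
    [0, 0, 0],
]
count = roi_count(density, mask)
print(count)  # 1.0
```

Only the person inside the ROI is counted here; the person in the bottom row contributes density to the map but falls outside the mask.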
Taxonomy
Topics: Video Surveillance and Tracking Methods · Anomaly Detection Techniques and Applications · Human Mobility and Location-Based Analysis
Methods: Convolution · ConvLSTM · Sigmoid Activation · Tanh Activation · Long Short-Term Memory
