FCN-rLSTM: Deep Spatio-Temporal Neural Networks for Vehicle Counting in City Cameras

Shanghang Zhang; Guanhang Wu; João P. Costeira; José M. F. Moura

arXiv:1707.09476·cs.CV·August 2, 2017·24 cites

TL;DR

This paper introduces FCN-rLSTM, a deep neural network that combines fully convolutional networks and LSTMs with residual learning to accurately count vehicles in low-quality city camera videos, outperforming existing methods.

Contribution

The paper presents the FCN-rLSTM architecture, which uses Hyper-Atrous modules to improve vehicle counting in challenging video conditions, and reports lower counting error on the TRANCOS and WebCamT datasets along with roughly 5 times faster training on average.

Findings

01. Reduces mean absolute error (MAE) from 5.31 to 4.21 on the TRANCOS dataset.

02. Reduces MAE from 2.74 to 1.53 on the WebCamT dataset.

03. Speeds up training by a factor of 5 on average.
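
To put the MAE figures above in context, here is a minimal sketch of how MAE is computed for per-frame counts and what the two reported reductions amount to in relative terms. The function names and the toy counts are illustrative assumptions; only the four MAE values come from this page.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float], actual: Sequence[float]) -> float:
    """Average absolute difference between predicted and true per-frame counts."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


def relative_reduction(before: float, after: float) -> float:
    """Percentage by which an error metric dropped, rounded to one decimal."""
    return round(100.0 * (before - after) / before, 1)


# Toy counts (made up for illustration): absolute errors are 1, 0 and 2.
toy_mae = mean_absolute_error([10, 12, 9], [11, 12, 7])  # 1.0

# MAE values reported in the findings above.
trancos_drop = relative_reduction(5.31, 4.21)   # 20.7
webcamt_drop = relative_reduction(2.74, 1.53)   # 44.2
```

Read this way, the reported improvement is about 20.7% on TRANCOS and about 44.2% on WebCamT.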

Abstract

In this paper, we develop deep spatio-temporal neural networks to sequentially count vehicles from low-quality videos captured by city cameras (citycams). Citycam videos have low resolution, low frame rate, high occlusion and large perspective, making most existing methods lose their efficacy. To overcome limitations of existing methods and incorporate the temporal information of traffic video, we design a novel FCN-rLSTM network to jointly estimate vehicle density and vehicle count by connecting fully convolutional neural networks (FCN) with long short-term memory networks (LSTM) in a residual learning fashion. Such a design leverages the strengths of FCN for pixel-level prediction and the strengths of LSTM for learning complex temporal dynamics. The residual learning connection reformulates the vehicle count regression as learning residual functions with reference to the sum of…
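
The residual formulation described in the abstract can be illustrated with a small sketch. The abstract is cut off at "the sum of…", so what exactly is summed is not stated on this page; the sketch assumes it is the per-pixel density map, which is a common way to turn a density estimate into a count. The names `base_count`, `residual_counts`, and `smoothing_residual` are hypothetical, and `smoothing_residual` is a toy stand-in for the temporal model, not an LSTM and not the authors' method.

```python
from typing import Callable, List, Sequence

DensityMap = Sequence[Sequence[float]]


def base_count(density: DensityMap) -> float:
    """Sum a per-pixel density map to get a base vehicle count (assumed reference)."""
    return sum(sum(row) for row in density)


def residual_counts(
    densities: Sequence[DensityMap],
    residual_fn: Callable[[List[float]], List[float]],
) -> List[float]:
    """Final count per frame = base count + residual predicted from the sequence."""
    bases = [base_count(d) for d in densities]
    residuals = residual_fn(bases)
    return [b + r for b, r in zip(bases, residuals)]


def smoothing_residual(bases: List[float]) -> List[float]:
    """Toy stand-in for a temporal model: nudges each base count halfway
    toward the running mean of the base counts seen so far."""
    residuals: List[float] = []
    running_total = 0.0
    for t, b in enumerate(bases, start=1):
        running_total += b
        residuals.append(0.5 * (running_total / t - b))
    return residuals


# Two tiny 2x2 "density maps" (made-up values) summing to 2.0 and 4.0.
frames = [
    [[0.5, 0.5], [1.0, 0.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]
counts = residual_counts(frames, smoothing_residual)  # [2.0, 3.5]
```

The point of the structure is that the temporal component only has to learn a correction to a count that the density map already approximates, rather than regress the full count from scratch.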

Code & Models

Repositories

dpernes/FCN-rLSTM (PyTorch)

Taxonomy

Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Human Pose and Action Recognition

Methods: Max Pooling · Sigmoid Activation · Tanh Activation · Convolution · Fully Convolutional Network · Long Short-Term Memory