Fast Vehicle Detection and Tracking on Fisheye Traffic Monitoring Video using CNN and Bounding Box Propagation
Sandy Ardianto, Hsueh-Ming Hang, Wen-Huang Cheng (National Yang Ming Chiao Tung University)

TL;DR
This paper presents a fast and accurate vehicle detection and tracking algorithm for fisheye traffic videos, leveraging YOLOv5 and bounding box propagation to improve accuracy and speed, especially in challenging nighttime conditions.
Contribution
The authors introduce a bounding box propagation method combined with grayscale frame difference to enhance detection accuracy and processing speed in fisheye traffic videos.
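The summary does not spell out the propagation rule, so the following is only a minimal sketch of the general idea, under an assumed rule: if a vehicle box in the previous frame matches a box in the next frame by IoU, but the current frame has no overlapping detection, the averaged box is filled in. The function names and the `match_iou` threshold are hypothetical and are not taken from the paper.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def propagate_missing(prev_boxes, curr_boxes, next_boxes, match_iou=0.3):
    """Fill in boxes the detector missed in the current frame.

    Assumed rule (illustrative only): a previous-frame box that matches a
    next-frame box with IoU >= match_iou is averaged and added to the
    current frame, unless the current frame already has an overlapping box.
    """
    filled = list(curr_boxes)
    for p in prev_boxes:
        # Best-overlapping candidate in the next frame, if any.
        best = max(next_boxes, key=lambda n: iou(p, n), default=None)
        if best is None or iou(p, best) < match_iou:
            continue
        # Midpoint box between the two neighbouring detections.
        mid = tuple((pc + nc) / 2 for pc, nc in zip(p, best))
        if all(iou(mid, c) < match_iou for c in filled):
            filled.append(mid)
    return filled


if __name__ == "__main__":
    prev_boxes = [(10, 10, 50, 50)]
    next_boxes = [(14, 10, 54, 50)]
    print(propagate_missing(prev_boxes, [], next_boxes))
```

A rule of this kind only recovers detections that are missing in isolated frames; a real tracker would also need to handle vehicles entering or leaving the scene.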
Findings
17.9 percentage point accuracy (AP50) improvement over the YOLOv5 base model on nighttime videos
6.2 percentage point accuracy (AP50) improvement over the base model on daytime videos
Doubled processing speed by using grayscale frame difference for the intermediate frames in a segment
Abstract
We design a fast car detection and tracking algorithm for traffic monitoring video captured by fisheye cameras mounted at crossroads. We use the ICIP 2020 VIP Cup dataset and adopt YOLOv5 as the base object detection model. The nighttime video in this dataset is very challenging, and the detection accuracy (AP50) of the base model is about 54%. We design a reliable car detection and tracking algorithm based on the concept of bounding box propagation among frames, which provides 17.9 percentage points (pp) and 6.2 pp accuracy improvement over the base model for the nighttime and daytime videos, respectively. To speed up processing, the grayscale frame difference is used for the intermediate frames in a segment, which can double the processing speed.
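As a rough illustration of how a grayscale frame difference might stand in for the detector on intermediate frames, the sketch below checks how much each box region has changed since the last fully detected frame. Boxes with little change are reused as-is, and the rest are flagged as moving. This is an assumed decision rule, not the authors' procedure; `pixel_thresh`, `area_thresh`, and all function names are hypothetical, and frames are plain nested lists to keep the example dependency-free.

```python
def to_gray(frame):
    """Convert rows of (r, g, b) pixels to grayscale using BT.601 luma weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in frame]


def changed_fraction(gray_a, gray_b, box, pixel_thresh=25):
    """Fraction of pixels in box (x1, y1, x2, y2) whose absolute
    grayscale difference between the two frames exceeds pixel_thresh."""
    x1, y1, x2, y2 = box
    changed = total = 0
    for y in range(y1, y2):
        for x in range(x1, x2):
            total += 1
            if abs(gray_a[y][x] - gray_b[y][x]) > pixel_thresh:
                changed += 1
    return changed / total if total else 0.0


def split_boxes(gray_key, gray_mid, boxes, pixel_thresh=25, area_thresh=0.2):
    """Split key-frame boxes into (static, moving) for an intermediate frame.

    Assumed rule (illustrative only): a box is 'static' and can be reused
    when fewer than area_thresh of its pixels changed since the key frame.
    """
    static, moving = [], []
    for box in boxes:
        frac = changed_fraction(gray_key, gray_mid, box, pixel_thresh)
        (static if frac < area_thresh else moving).append(box)
    return static, moving


if __name__ == "__main__":
    key = [[0] * 4 for _ in range(4)]
    mid = [[100 if (x < 2 and y < 2) else 0 for x in range(4)] for y in range(4)]
    print(split_boxes(key, mid, [(0, 0, 2, 2), (2, 2, 4, 4)]))
```

The appeal of this kind of check is that it needs only subtraction and thresholding per pixel, which is far cheaper than a CNN forward pass on every frame.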
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Autonomous Vehicle Technology and Safety
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Balanced Selection
