Road User Detection in Videos

Hughes Perreault; Guillaume-Alexandre Bilodeau; Nicolas Saunier; Pierre Gravel

arXiv:1903.12049 · cs.CV · March 29, 2019 · 6 cites

TL;DR

This paper introduces two novel models for online road user detection in videos that leverage consecutive frames, demonstrating improved detection performance over single-frame methods, though optical flow integration shows limited benefits.

Contribution

The paper proposes RetinaNet-Double and RetinaNet-Flow models that utilize consecutive frames and optical flow for enhanced video object detection in road scenes.

Findings

1. Using a preceding frame improves detection performance.

2. Explicit optical flow does not significantly enhance detection.

3. Models trained on three public datasets validate the approach.

Abstract

Successive frames of a video are highly redundant, and the most popular object detection methods do not take advantage of this fact. Using multiple consecutive frames can improve detection of small objects or difficult examples and can improve speed and detection consistency in a video sequence, for instance by interpolating features between frames. In this work, a novel approach is introduced to perform online video object detection using two consecutive frames of video sequences involving road users. Two new models, RetinaNet-Double and RetinaNet-Flow, are proposed, based respectively on the concatenation of a target frame with a preceding frame, and the concatenation of the optical flow with the target frame. The models are trained and evaluated on three public datasets. Experiments show that using a preceding frame improves performance over single frame detectors, but using explicit…
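
The two input constructions named in the abstract can be illustrated with a minimal sketch. It assumes the concatenation happens along the channel axis of the raw inputs, which the abstract does not specify; the function names, the array shapes, and the zero-filled flow field are hypothetical placeholders, not the authors' implementation.

```python
import numpy as np


def concat_with_previous(target: np.ndarray, previous: np.ndarray) -> np.ndarray:
    """Stack two H x W x 3 frames into one H x W x 6 array.

    Assumption: input-level, channel-wise concatenation (hypothetical).
    """
    if target.shape != previous.shape:
        raise ValueError("target and previous frames must have the same shape")
    return np.concatenate([target, previous], axis=-1)


def concat_with_flow(target: np.ndarray, flow: np.ndarray) -> np.ndarray:
    """Stack an H x W x 3 frame with an H x W x 2 flow field into H x W x 5.

    Assumption: the flow holds one (dx, dy) displacement per pixel.
    """
    if target.shape[:2] != flow.shape[:2]:
        raise ValueError("frame and flow must share height and width")
    return np.concatenate(
        [target.astype(np.float32), flow.astype(np.float32)], axis=-1
    )


# Dummy data standing in for two consecutive video frames.
rng = np.random.default_rng(0)
previous = rng.random((4, 6, 3), dtype=np.float32)
target = rng.random((4, 6, 3), dtype=np.float32)

# Placeholder flow field; a real pipeline would estimate it from the two frames.
flow = np.zeros((4, 6, 2), dtype=np.float32)

double_input = concat_with_previous(target, previous)
flow_input = concat_with_flow(target, flow)
print(double_input.shape, flow_input.shape)
```

Under these assumptions the detector's first layer would need to accept 6 or 5 input channels instead of 3. If the concatenation is instead applied to intermediate feature maps, the same `np.concatenate` call along the channel axis would apply to those tensors.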

Code & Models

Repositories

hu64/RN-VID (official)

Taxonomy

Topics: Video Surveillance and Tracking Methods · Advanced Neural Network Applications · Visual Attention and Saliency Detection

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings