Real-time Object Detection for Streaming Perception

Jinrong Yang; Songtao Liu; Zeming Li; Xiaoping Li; Jian Sun

arXiv:2203.12338 · cs.CV · March 30, 2022 · 1 citation

TL;DR

This paper introduces a real-time streaming perception framework with a novel DualFlow Perception module and trend-aware loss, enabling better future prediction and achieving improved accuracy in autonomous driving scenarios.

Contribution

It proposes a new framework with a DualFlow Perception module and trend-aware loss for enhanced streaming perception in autonomous driving.

Findings

01

Improves AP by 4.9% over a strong baseline on the Argoverse-HD dataset.

02

Effectively captures moving trends with DualFlow Perception.

03

Demonstrates the importance of future prediction in real-time perception.

Abstract

Autonomous driving requires the model to perceive the environment and (re)act within a low latency for safety. While past works ignore the inevitable changes in the environment after processing, streaming perception is proposed to jointly evaluate the latency and accuracy into a single metric for video online perception. In this paper, instead of searching trade-offs between accuracy and speed like previous works, we point out that endowing real-time models with the ability to predict the future is the key to dealing with this problem. We build a simple and effective framework for streaming perception. It equips a novel DualFlow Perception module (DFP), which includes dynamic and static flows to capture the moving trend and basic detection feature for streaming prediction. Further, we introduce a Trend-Aware Loss (TAL) combined with a trend factor to generate adaptive weights for…
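The two-flow idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `conv1x1` and `dual_flow_fuse`, the channel-halving pointwise projection, and the choice to combine the flows by addition are all assumptions made for illustration. The dynamic flow here concatenates projected features from the current and previous frames to expose the moving trend, while the static flow passes the current frame's features through unchanged.

```python
def conv1x1(x, w):
    """Pointwise (1x1) convolution.

    x: list of C_in channels, each a flat list of N spatial values.
    w: C_out rows of C_in weights.
    Returns C_out channels of N values.
    """
    n = len(x[0])
    return [[sum(w_oc * x[c][i] for c, w_oc in enumerate(row)) for i in range(n)]
            for row in w]


def dual_flow_fuse(feat_prev, feat_curr, w_reduce):
    """Hypothetical fusion of two adjacent frames' feature maps.

    Assumes w_reduce maps C channels to C // 2, so concatenating the two
    projected maps restores C channels.
    """
    # Dynamic flow (assumed form): shared projection, then channel concat.
    reduced_prev = conv1x1(feat_prev, w_reduce)
    reduced_curr = conv1x1(feat_curr, w_reduce)
    dynamic = reduced_curr + reduced_prev  # list concat along the channel axis
    # Static flow (assumed form): current-frame features, unchanged.
    static = feat_curr
    # Combine the flows element-wise.
    return [[d + s for d, s in zip(dc, sc)] for dc, sc in zip(dynamic, static)]


# Toy features: 4 channels, 2 spatial positions each.
feat_prev = [[1, 1], [2, 2], [3, 3], [4, 4]]
feat_curr = [[1, 0], [0, 1], [1, 1], [2, 2]]
# Toy projection that keeps only the first two channels.
w_reduce = [[1, 0, 0, 0], [0, 1, 0, 0]]

fused = dual_flow_fuse(feat_prev, feat_curr, w_reduce)
print(fused)  # [[2, 0], [0, 2], [2, 2], [4, 4]]
```

With all-zero projection weights the dynamic flow vanishes and the output reduces to the current frame's features, which is the behavior one would expect from a static path that preserves basic detection features.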

Code & Models

Repositories

yancie-yjr/StreamYOLO (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · Video Surveillance and Tracking Methods

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings