Real-Time and Accurate Object Detection in Compressed Video by Long Short-term Feature Aggregation

Xinggang Wang; Zhaojin Huang; Bencheng Liao; Lichao Huang; Yongchao Gong; Chang Huang

arXiv:2103.14529·cs.CV·March 29, 2021

TL;DR

This paper introduces a real-time video object detection method that runs a large network only on sparsely sampled key frames and propagates their features to the remaining frames using the motion cues already present in compressed video, reaching 77.2% mAP on ImageNet VID at 30 FPS on a Titan X GPU.

Contribution

It proposes a novel short-term feature aggregation technique leveraging motion cues in compressed videos to enhance non-key frame features efficiently.

Findings

01 Achieves 77.2% mAP on the ImageNet VID benchmark.

02 Runs at 30 FPS on a Titan X GPU.

03 Outperforms many existing methods in speed and accuracy.

Abstract

Video object detection is a fundamental problem in computer vision and has a wide spectrum of applications. Video object detection based on deep networks is actively studied to push the limits of detection speed and accuracy. To reduce the computation cost, we sparsely sample key frames in a video and treat the remaining frames as non-key frames; a large and deep network is used to extract features for key frames and a tiny network is used for non-key frames. To enhance the features of non-key frames, we propose a novel short-term feature aggregation method that quickly propagates the rich information in key frame features to non-key frame features. The fast feature aggregation is enabled by the freely available motion cues in compressed videos. Further, key frame features are also aggregated based on optical flow. The propagated deep features are then integrated with the directly…
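The key-frame and non-key-frame pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the per-location motion field, and the fixed blending weight are all assumptions made for illustration, and real codecs store motion vectors per block rather than per feature location.

```python
from typing import List, Tuple

Feature = List[List[float]]  # a single-channel H x W feature map (assumed layout)


def is_key_frame(index: int, interval: int) -> bool:
    """Sparse sampling: every `interval`-th frame is treated as a key frame."""
    return index % interval == 0


def warp_by_motion(key_feat: Feature,
                   motion: List[List[Tuple[int, int]]]) -> Feature:
    """Propagate key-frame features to a non-key frame.

    motion[y][x] = (dy, dx) means location (y, x) in the current frame is
    predicted from (y + dy, x + dx) in the key frame. Source coordinates
    that fall outside the map are clamped to the border.
    """
    h, w = len(key_feat), len(key_feat[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dy, dx = motion[y][x]
            sy = min(max(y + dy, 0), h - 1)
            sx = min(max(x + dx, 0), w - 1)
            out[y][x] = key_feat[sy][sx]
    return out


def aggregate(propagated: Feature, light: Feature,
              weight: float = 0.5) -> Feature:
    """Blend propagated key-frame features with the tiny network's features.

    A fixed scalar weight is an assumption of this sketch; the source does
    not state how the two feature sets are combined.
    """
    return [[weight * p + (1.0 - weight) * q for p, q in zip(p_row, q_row)]
            for p_row, q_row in zip(propagated, light)]
```

In use, a frame loop would call the large network when `is_key_frame` is true and otherwise call `warp_by_motion` on the most recent key-frame features followed by `aggregate` with the tiny network's output. The point of the design is that warping costs only a lookup per location, because the motion field comes for free from the compressed bitstream.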

Code & Models

Repositories

hustvl/LSFA (MXNet, official)