Towards High Performance Video Object Detection

Xizhou Zhu; Jifeng Dai; Lu Yuan; Yichen Wei

arXiv:1711.11577·cs.CV·December 1, 2017·6 cites

TL;DR

This paper introduces a unified multi-frame end-to-end learning approach for video object detection, incorporating three new techniques to improve speed and accuracy in practical scenarios.

Contribution

It extends prior methods with three new techniques that improve the speed-accuracy tradeoff of video object detection.

Findings

01 Achieved improved speed-accuracy tradeoff in video detection

02 Demonstrated effectiveness of multi-frame end-to-end learning

03 Pushed forward the performance envelope in practical scenarios

Abstract

There has been significant progress in image object detection in recent years. Nevertheless, video object detection has received little attention, although it is more challenging and more important in practical scenarios. Building on recent work, this paper proposes a unified approach based on the principle of multi-frame end-to-end learning of features and cross-frame motion. Our approach extends prior work with three new techniques and steadily pushes forward the performance envelope (speed-accuracy tradeoff) towards high-performance video object detection.

Taxonomy

Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Visual Attention and Saliency Detection