Single Shot Video Object Detector

Jiajun Deng; Yingwei Pan; Ting Yao; Wengang Zhou; Houqiang Li; Tao Mei

arXiv:2007.03560·cs.CV·July 8, 2020

TL;DR

This paper introduces SSVD, a single shot video object detector that enhances per-frame features through motion-aware aggregation and feature hallucination, reaching 79.2% mAP on ImageNet VID at 85 ms per frame.

Contribution

The paper presents a new architecture, SSVD, integrating feature aggregation and hallucination into a one-stage detector for improved video object detection.

Findings

01. Achieves 79.2% mAP on ImageNet VID at 85 ms per frame
02. Outperforms existing methods in accuracy and speed
03. Effectively handles appearance deterioration in videos

Abstract

Single shot detectors, which are potentially faster and simpler than two-stage detectors, tend to be more applicable to object detection in videos. Nevertheless, extending such detectors from images to video is not trivial, especially when appearance deterioration exists in videos, e.g., motion blur or occlusion. A valid question is how to exploit temporal coherence across frames to boost detection. In this paper, we propose to address the problem by enhancing per-frame features through aggregation of neighboring frames. Specifically, we present the Single Shot Video Object Detector (SSVD), a new architecture that integrates feature aggregation into a one-stage detector for object detection in videos. Technically, SSVD uses a Feature Pyramid Network (FPN) as the backbone to produce multi-scale features. Unlike the existing feature aggregation methods, SSVD,…
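The idea of enhancing a frame's features by aggregating its neighbors can be illustrated with a small sketch. This is not the SSVD implementation: the abstract above is truncated before it describes the actual aggregation, so the code below swaps in a generic similarity-weighted average and leaves out any motion handling. The function names (`cosine`, `aggregate_features`), the `radius` parameter, and the list-of-vectors feature layout are all assumptions made for illustration. With a multi-scale backbone such as FPN, one would presumably apply the same step to each pyramid level separately.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def aggregate_features(frames, t, radius=1):
    """Hypothetical helper: enhance frame t with features from frames t-radius..t+radius.

    frames[k][p] is the feature vector of frame k at spatial position p.
    At each position, every frame in the window gets a softmax weight based on
    its cosine similarity to the reference frame t, and the output is the
    weighted average. Illustrative assumption only, not the paper's method.
    """
    lo = max(0, t - radius)
    hi = min(len(frames) - 1, t + radius)
    window = range(lo, hi + 1)
    enhanced = []
    for p, ref_vec in enumerate(frames[t]):
        sims = [cosine(ref_vec, frames[k][p]) for k in window]
        peak = max(sims)  # subtract the max for a numerically stable softmax
        exps = [math.exp(s - peak) for s in sims]
        total = sum(exps)
        weights = [e / total for e in exps]
        enhanced.append([
            sum(w * frames[k][p][c] for w, k in zip(weights, window))
            for c in range(len(ref_vec))
        ])
    return enhanced


# Three frames, one spatial position, two channels; the last frame disagrees
# with the reference and therefore receives the smallest weight.
frames = [[[1.0, 0.0]], [[1.0, 0.0]], [[0.0, 1.0]]]
enhanced = aggregate_features(frames, t=1, radius=1)
```

Weighting by similarity to the reference frame means a neighbor whose features disagree with the reference, for example because of blur or occlusion, contributes less to the aggregated result.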

Code & Models

Repositories

ddjiajun/SSVD (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques