Object Detection from Video Tubelets with Convolutional Neural Networks

Kai Kang; Wanli Ouyang; Hongsheng Li; Xiaogang Wang

arXiv:1604.04053 · cs.CV · March 7, 2017

TL;DR

This paper presents a framework for object detection in videos using CNNs, integrating still-image detection, tracking, and a novel temporal convolution network to improve detection consistency over time.

Contribution

The paper introduces a complete framework for the ImageNet VID task that combines still-image object detection, general object tracking, and a new temporal convolution network.

Findings

01. Temporal convolution improves detection consistency in videos.

02. The framework effectively combines detection and tracking for video analysis.

03. The proposed methods outperform baseline approaches in video object detection.
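
To make the idea of temporal convolution along a tubelet concrete, here is a minimal sketch. It is not the paper's network: the function name `smooth_tubelet_scores`, the fixed averaging kernel, and the example scores are all assumptions chosen for illustration, whereas the paper's temporal convolution network is learned. The sketch only shows how a 1-D convolution over per-frame detection scores can lift an isolated low score that is surrounded by confident neighbours.

```python
import numpy as np


def smooth_tubelet_scores(scores, kernel):
    """Apply a 1-D temporal convolution to the per-frame scores of one tubelet.

    `kernel` must have odd length so the output aligns frame-by-frame
    with the input. Both arguments are hypothetical illustration inputs.
    """
    scores = np.asarray(scores, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    pad = len(kernel) // 2
    # Repeat the first and last score so every frame has a full window.
    padded = np.pad(scores, pad, mode="edge")
    # np.convolve flips its second argument; reversing the kernel first
    # yields a sliding weighted sum in the original kernel order.
    return np.convolve(padded, kernel[::-1], mode="valid")


# Hypothetical detection scores for one object tracked over five frames;
# frame 2 has a spurious low score (for example, due to motion blur).
scores = [0.9, 0.8, 0.1, 0.85, 0.9]
# A fixed 3-frame averaging kernel stands in for learned weights.
kernel = [1 / 3, 1 / 3, 1 / 3]

smoothed = smooth_tubelet_scores(scores, kernel)
print(np.round(smoothed, 3))
```

A fixed averaging kernel also lowers the scores of the confident frames next to the dip, which is one reason a learned temporal model is attractive in practice.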

Abstract

Deep Convolutional Neural Networks (CNNs) have shown impressive performance in various vision tasks such as image classification, object detection and semantic segmentation. For object detection, particularly in still images, the performance has been significantly increased last year thanks to powerful deep networks (e.g. GoogLeNet) and detection frameworks (e.g. Regions with CNN features (R-CNN)). The recently introduced ImageNet task on object detection from video (VID) brings the object detection task into the video domain, in which objects' locations at each frame are required to be annotated with bounding boxes. In this work, we introduce a complete framework for the VID task based on still-image object detection and general object tracking. Their relations and contributions in the VID task are thoroughly studied and evaluated. In addition, a temporal convolution network is proposed to…

Code & Models

Repositories

myfavouritekk/vdetlib · Official

Videos

Object Detection From Video Tubelets With Convolutional Neural Networks · YouTube

Taxonomy

Methods: Convolution