Tube-CNN: Modeling temporal evolution of appearance for object detection in video
Tuan-Hung Vu, Anton Osokin, Ivan Laptev

TL;DR
This paper introduces Tube-CNN, a novel approach for object detection in videos that models the temporal evolution of object appearance using space-time tubes and specialized CNN architectures, significantly improving detection accuracy.
Contribution
The paper presents two CNN architectures for generating and classifying space-time tubes, enhancing object detection in videos by capturing temporal appearance changes.
Findings
Improves state-of-the-art on HollywoodHeads and ImageNet VID datasets.
Tube models excel in challenging dynamic scenes.
Proposes a tube proposal network (TPN) for high recall of object tubes.
Abstract
Object detection in video is crucial for many applications. Compared to images, video provides additional cues which can help to disambiguate the detection problem. Our goal in this paper is to learn discriminative models for the temporal evolution of object appearance and to use such models for object detection. To model temporal evolution, we introduce space-time tubes corresponding to temporal sequences of bounding boxes. We propose two CNN architectures for generating and classifying tubes, respectively. Our tube proposal network (TPN) first generates a large number of spatio-temporal tube proposals maximizing object recall. The Tube-CNN then implements a tube-level object detector in the video. Our method improves state of the art on two large-scale datasets for object detection in video: HollywoodHeads and ImageNet VID. Tube models show particular advantages in difficult dynamic…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Image and Video Retrieval Techniques · Video Surveillance and Tracking Methods · Advanced Neural Network Applications
