Feature Flow: In-network Feature Flow Estimation for Video Object Detection

Ruibing Jin; Guosheng Lin; Changyun Wen; Jianliang Wang; Fayao Liu

arXiv:2009.09660·cs.CV·November 11, 2021

Open Access

TL;DR

This paper introduces IFF-Net, a novel network with an in-network feature flow estimation module that directly predicts feature displacement for video object detection, eliminating the need for pre-trained optical flow models and achieving state-of-the-art results.

Contribution

The paper proposes a new in-network feature flow estimation module within IFF-Net that directly predicts feature displacement without relying on a pre-trained optical flow model, improving detection accuracy and speed.

Findings

1. Outperforms existing methods on ImageNet VID
2. Achieves state-of-the-art detection accuracy
3. Maintains fast inference speed

Abstract

Optical flow, which expresses pixel displacement, is widely used in many computer vision tasks to provide pixel-level motion information. However, with the remarkable progress of convolutional neural networks, recent state-of-the-art approaches are proposed to solve problems directly at the feature level. Since the displacement of a feature vector is not consistent with the pixel displacement, a common approach is to forward optical flow to a neural network and fine-tune this network on the task dataset. With this method, they expect the fine-tuned network to produce tensors encoding feature-level motion information. In this paper, we rethink this de facto paradigm and analyze its drawbacks in the video object detection task. To mitigate these issues, we propose a novel network (IFF-Net) with an In-network Feature Flow estimation module (IFF module) for video object…
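
To make the notion of feature-level displacement concrete, the sketch below warps a feature map with a per-location displacement field using bilinear sampling. It is a generic illustration only, not the IFF module described in the paper: the function name `warp_features`, the `(C, H, W)` feature layout, and the `(dx, dy)` flow layout are all assumptions chosen for the example, and the truncated abstract does not say how IFF-Net applies its predicted flow.

```python
import numpy as np

def warp_features(feat, flow):
    """Bilinearly sample a feature map at displaced locations.

    feat: array of shape (C, H, W), a hypothetical feature map.
    flow: array of shape (2, H, W); flow[0] is the x displacement and
          flow[1] the y displacement, in feature-grid cells (assumed layout).
    Returns an array of shape (C, H, W).
    """
    _, h, w = feat.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

    # Sampling positions, clamped so they stay inside the feature grid.
    x = np.clip(xs + flow[0], 0, w - 1)
    y = np.clip(ys + flow[1], 0, h - 1)

    # Integer corners surrounding each sampling position.
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Fractional offsets act as bilinear interpolation weights.
    wx = x - x0
    wy = y - y0

    top = feat[:, y0, x0] * (1 - wx) + feat[:, y0, x1] * wx
    bottom = feat[:, y1, x0] * (1 - wx) + feat[:, y1, x1] * wx
    return top * (1 - wy) + bottom * wy


if __name__ == "__main__":
    feat = np.arange(12, dtype=float).reshape(1, 3, 4)
    flow = np.zeros((2, 3, 4))
    flow[0] = 0.5  # shift sampling half a cell to the right
    print(warp_features(feat, flow)[0])
```

Because one cell of a downsampled feature map covers many pixels, a pixel-level flow field cannot be used directly as `flow` here, which is the mismatch between pixel displacement and feature displacement that the abstract points to.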

Taxonomy

Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Visual Attention and Saliency Detection