Learning Mobile CNN Feature Extraction Toward Fast Computation of Visual Object Tracking

Tsubasa Murate; Takashi Watanabe; Masaki Yamada

arXiv:2104.01381·cs.CV·April 6, 2021

TL;DR

This paper introduces a lightweight, high-speed CNN-based object tracking method optimized for low-resource environments, utilizing MobileNetV3 and feature selection to achieve high accuracy without online learning.

Contribution

It proposes a novel architecture using MobileNetV3 and feature map selection for efficient, high-precision object tracking suitable for low-resource devices.

Findings

01 Achieves high tracking accuracy on Visual Tracker Benchmark

02 Operates efficiently with no online learning required

03 Suitable for low computational resource environments

Abstract

In this paper, we construct a lightweight, high-precision, and high-speed object tracker using a trained CNN. Conventional methods based on trained CNNs use the VGG16 network, which requires powerful computational resources and is therefore difficult to apply in environments with limited computational resources. To solve this problem, we use MobileNetV3, a CNN designed for mobile devices. Building on Feature Map Selection Tracking, we propose a new architecture that extracts the MobileNet features that are effective for object tracking. The architecture requires only offline learning, with no online learning. In addition, by using features of objects other than the tracking target, the features of the tracking target are extracted more efficiently. We measure the tracking accuracy on the Visual Tracker Benchmark and confirm that the proposed method can perform high-precision and high-speed calculation…
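To make the computational argument concrete, the following is a rough sketch of why a MobileNet-style layer is cheaper than a VGG-style one. It counts multiply-accumulate operations (MACs) for a single standard convolution versus a depthwise separable convolution (a depthwise convolution followed by a 1x1 pointwise convolution). The layer shape used here (56x56 feature map, 128 input and output channels, 3x3 kernel) is an illustrative assumption and is not taken from the paper; the function names are likewise hypothetical.

```python
def standard_conv_macs(h, w, c_in, c_out, k):
    """MACs for a k x k standard convolution producing an h x w output map."""
    return h * w * c_in * c_out * k * k


def depthwise_separable_macs(h, w, c_in, c_out, k):
    """MACs for a k x k depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = h * w * c_in * k * k   # one k x k filter per input channel
    pointwise = h * w * c_in * c_out   # 1x1 convolution that mixes channels
    return depthwise + pointwise


# Illustrative layer shape (an assumption, not a value from the paper).
h, w, c_in, c_out, k = 56, 56, 128, 128, 3

std = standard_conv_macs(h, w, c_in, c_out, k)
sep = depthwise_separable_macs(h, w, c_in, c_out, k)
ratio = sep / std  # algebraically equal to 1/c_out + 1/k**2

print(f"standard conv:            {std} MACs")
print(f"depthwise separable conv: {sep} MACs")
print(f"cost ratio:               {ratio:.4f}")
```

For a 3x3 kernel the ratio approaches 1/9 as the number of output channels grows, which is the main source of the savings. This only compares individual layers; it says nothing about the end-to-end speed of the tracker, which the paper measures separately.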

Taxonomy

Topics: Video Surveillance and Tracking Methods · Fire Detection and Safety Systems · Gait Recognition and Analysis

Methods: Depthwise Convolution · Pointwise Convolution · Batch Normalization · Depthwise Separable Convolution · Inverted Residual Block · ReLU6 · 1x1 Convolution · Sigmoid Activation · Convolution · Hard Swish
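
Two of the listed methods, ReLU6 and Hard Swish, are simple enough to state directly. The sketch below uses the standard definitions from the MobileNetV3 literature, where Hard Swish is x * ReLU6(x + 3) / 6, a piecewise-linear approximation of x * sigmoid(x) that avoids computing an exponential. It operates on plain Python floats for clarity; how these activations are arranged inside the tracker itself is not described on this page.

```python
def relu6(x: float) -> float:
    """ReLU clipped at 6: min(max(x, 0), 6)."""
    return min(max(x, 0.0), 6.0)


def hard_sigmoid(x: float) -> float:
    """Piecewise-linear approximation of the sigmoid: ReLU6(x + 3) / 6."""
    return relu6(x + 3.0) / 6.0


def hard_swish(x: float) -> float:
    """Hard Swish: x * ReLU6(x + 3) / 6."""
    return x * hard_sigmoid(x)


for value in (-4.0, -3.0, 0.0, 1.0, 3.0, 8.0):
    print(value, relu6(value), hard_swish(value))
```

For inputs at or below -3 the Hard Swish output is zero, and for inputs at or above 3 it equals the input, so only the range between needs any multiplication by a non-trivial factor.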