RSINet: Rotation-Scale Invariant Network for Online Visual Tracking

Yang Fang; Geun-Sik Jo; Chang-Hee Lee

arXiv:2011.09153·cs.CV·November 19, 2020

TL;DR

RSINet is a real-time visual tracking network that explicitly learns rotation and scale variations, adaptively updates its model, and achieves state-of-the-art accuracy on multiple benchmarks.

Contribution

The paper introduces RSINet, a novel tracker with explicit rotation-scale estimation and adaptive model updating, improving accuracy and robustness over existing Siamese-based trackers.

Findings

01 Achieves state-of-the-art performance on the OTB-100, VOT2018, and LaSOT benchmarks.

02 Runs in real time at approximately 45 FPS.

03 Effectively estimates rotation and scale transformations during tracking.

Abstract

Most Siamese network-based trackers perform tracking without model update and cannot adaptively learn target-specific variation. Moreover, Siamese-based trackers infer the new state of tracked objects by generating axis-aligned bounding boxes, which contain extra background noise and cannot accurately estimate the rotation and scale transformation of moving objects, thus potentially reducing tracking performance. In this paper, we propose a novel Rotation-Scale Invariant Network (RSINet) to address these problems. Our RSINet tracker consists of a target-distractor discrimination branch and a rotation-scale estimation branch; the rotation and scale knowledge can be explicitly learned by a multi-task learning method in an end-to-end manner. In addition, the tracking model is adaptively optimized and updated under spatio-temporal energy control, which ensures model…
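
As a rough illustration of the abstract's point that axis-aligned bounding boxes contain extra background, the sketch below computes how much of an axis-aligned box is background when it encloses a rotated rectangular target. This is plain geometry, not code from the paper; the function names and the 100 x 40 target are assumptions chosen for illustration.

```python
import math


def aabb_of_rotated_box(w, h, angle_deg):
    """Width and height of the axis-aligned box enclosing a w x h box
    rotated by angle_deg about its center (hypothetical helper)."""
    t = math.radians(angle_deg)
    c, s = abs(math.cos(t)), abs(math.sin(t))
    return w * c + h * s, w * s + h * c


def background_fraction(w, h, angle_deg):
    """Fraction of the enclosing axis-aligned box NOT covered by the target."""
    bw, bh = aabb_of_rotated_box(w, h, angle_deg)
    return 1.0 - (w * h) / (bw * bh)


if __name__ == "__main__":
    # Illustrative target size; not taken from the paper.
    for angle in (0, 15, 30, 45):
        frac = background_fraction(100, 40, angle)
        print(f"{angle:>2} deg: {frac:.1%} of the axis-aligned box is background")
```

For an unrotated target the fraction is zero, and it grows as the target rotates, which is one way to see why a tracker that estimates rotation explicitly could produce tighter boxes.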

Taxonomy

Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Video Analysis and Summarization