VITAL: VIsual Tracking via Adversarial Learning

Yibing Song; Chao Ma; Xiaohe Wu; Lijun Gong; Linchao Bao; Wangmeng Zuo; Chunhua Shen; Rynson Lau; Ming-Hsuan Yang

arXiv:1804.04273 · cs.CV · April 13, 2018 · 58 cites

TL;DR

VITAL introduces an adversarial learning-based visual tracking method that enhances positive sample diversity and addresses class imbalance, resulting in improved tracking performance on benchmark datasets.

Contribution

The paper proposes a novel adversarial learning framework for visual tracking that generates diverse positive samples and employs a cost-sensitive loss to handle class imbalance.

Findings

01. Outperforms state-of-the-art trackers on benchmark datasets.

02. Effectively captures appearance variations with generated masks.

03. Reduces impact of easy negatives during training.
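
To make the third finding concrete, here is a minimal sketch of how a cost-sensitive loss can shrink the contribution of easy negatives. This page does not give the loss formula, so the focal-style modulating factor, the function name `cost_sensitive_bce`, and the `gamma` parameter below are illustrative assumptions, not the paper's definition.

```python
import math

def cost_sensitive_bce(p, y, gamma=2.0):
    """Binary cross-entropy scaled by a modulating factor (assumed form).

    p: predicted probability that the sample is the target (0 < p < 1).
    y: ground-truth label, 1 for target and 0 for background.
    gamma: hypothetical exponent; gamma=0 gives plain cross-entropy.
    """
    # Probability the classifier assigns to the correct class.
    p_true = p if y == 1 else 1.0 - p
    # Confidently correct samples get a factor near 0, so they barely count.
    return -((1.0 - p_true) ** gamma) * math.log(p_true)

# An easy negative (background scored 0.05) versus a hard one (scored 0.60).
easy_negative = cost_sensitive_bce(p=0.05, y=0)
hard_negative = cost_sensitive_bce(p=0.60, y=0)
plain_easy = cost_sensitive_bce(p=0.05, y=0, gamma=0.0)
```

With this weighting, the many easy background samples contribute far less to the total loss than under plain cross-entropy, while hard negatives keep most of their weight.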

Abstract

The tracking-by-detection framework consists of two stages, i.e., drawing samples around the target object in the first stage and classifying each sample as the target object or as background in the second stage. The performance of existing trackers using deep classification networks is limited by two aspects. First, the positive samples in each frame are highly spatially overlapped, and they fail to capture rich appearance variations. Second, there exists extreme class imbalance between positive and negative samples. This paper presents the VITAL algorithm to address these two problems via adversarial learning. To augment positive samples, we use a generative network to randomly generate masks, which are applied to adaptively drop out input features to capture a variety of appearance changes. With the use of adversarial learning, our network identifies the mask that maintains the most…
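
The mask-based feature dropout described in the abstract can be sketched as follows. This is an illustration only: a random block mask stands in for the output of the generative network, and the names `random_block_mask` and `apply_mask`, the block shape, and the feature dimensions are all assumptions. How the adversarial step selects among masks is not shown, since the abstract is truncated here.

```python
import numpy as np

def random_block_mask(height, width, block, rng):
    """Return a (height, width) mask of ones with one zeroed block-by-block patch.

    Hypothetical stand-in for a mask produced by the generative network.
    """
    mask = np.ones((height, width), dtype=np.float32)
    top = rng.integers(0, height - block + 1)
    left = rng.integers(0, width - block + 1)
    mask[top:top + block, left:left + block] = 0.0
    return mask

def apply_mask(features, mask):
    """Zero the masked spatial locations in every channel of a (C, H, W) array."""
    return features * mask[np.newaxis, :, :]

rng = np.random.default_rng(0)
# Assumed toy feature map: 4 channels over a 3x3 spatial grid.
features = rng.standard_normal((4, 3, 3)).astype(np.float32)
mask = random_block_mask(3, 3, block=1, rng=rng)
masked = apply_mask(features, mask)
```

Each different mask yields a different perturbed version of the same positive sample's features, which is one way to diversify positives that otherwise overlap heavily in space.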

Taxonomy

Topics: Video Surveillance and Tracking Methods · Fire Detection and Safety Systems · Face recognition and analysis

Methods: Dropout