Video Tracking Using Learned Hierarchical Features
Li Wang, Ting Liu, Gang Wang, Kap Luk Chan, Qingxiong Yang

TL;DR
This paper introduces a hierarchical feature learning approach for visual object tracking. It combines features learned offline with online domain adaptation to handle complex motions and appearance changes.
Contribution
It presents a novel two-layer CNN with temporal slowness constraints and an online adaptation module for improved tracking performance.
Findings
The learned features significantly improve tracking performance on videos with complex motions.
They are robust to appearance changes of the target object.
They integrate effectively with existing tracking methods.
Abstract
In this paper, we propose an approach to learn hierarchical features for visual object tracking. First, using auxiliary video sequences, we learn offline features that are robust to diverse motion patterns. The hierarchical features are learned via a two-layer convolutional neural network. Embedding the temporal slowness constraint in the stacked architecture makes the learned features robust to complicated motion transformations, which is important for visual object tracking. Then, given a target video sequence, we propose a domain adaptation module that adapts the pre-learned features online to the specific target object. The adaptation is conducted in both layers of the deep feature learning module so as to include appearance information of the specific target object. As a result, the learned hierarchical features can be robust to both complicated motion transformations and appearance…
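The abstract does not give the exact form of the temporal slowness constraint. As a rough illustration only, the sketch below assumes one common formulation: penalize the L1 distance between the features of consecutive frames, so that features which change slowly over time score lower. The function name `temporal_slowness_penalty` and the toy arrays are hypothetical and are not taken from the paper.

```python
import numpy as np


def temporal_slowness_penalty(features):
    """Mean L1 distance between features of consecutive frames.

    features: array of shape (T, D), one D-dimensional feature vector per frame.
    This is an assumed formulation for illustration, not the paper's objective.
    """
    features = np.asarray(features, dtype=float)
    if features.shape[0] < 2:
        # A single frame has no temporal neighbor, so there is nothing to penalize.
        return 0.0
    # Differences between each frame's features and the next frame's features.
    diffs = np.diff(features, axis=0)
    # L1 norm per frame pair, averaged over all consecutive pairs.
    return float(np.abs(diffs).sum(axis=1).mean())


# Toy data: features that never change versus features that oscillate.
slow = np.tile(np.array([1.0, 2.0, 3.0]), (4, 1))
fast = np.array([[0.0, 0.0], [1.0, -1.0], [0.0, 0.0]])

slow_penalty = temporal_slowness_penalty(slow)
fast_penalty = temporal_slowness_penalty(fast)
```

In a training setup, a term like this would typically be added to a reconstruction or classification loss with a weighting factor. The constant sequence incurs no penalty, while the oscillating one is penalized.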
