Pixel-wise object tracking
Yilin Song, Chenge Li, Yao Wang

TL;DR
This paper introduces a real-time pixel-wise object tracking framework that combines global attention and local segmentation models with LSTM structures, enabling efficient tracking of anonymous objects in noisy backgrounds without online refinement.
Contribution
The novel framework integrates global and local models with LSTMs and an iterative training strategy, eliminating the need for online updates and improving efficiency.
Findings
Effective tracking on a challenging VOT dataset
Real-time performance achieved
No online refinement needed for specific objects
Abstract
In this paper, we propose a novel pixel-wise visual object tracking framework that can track any anonymous object in a noisy background. The framework consists of two submodels: a global attention model and a local segmentation model. The global model generates a region of interest (ROI) in which the object may lie in the new frame, based on the past object segmentation maps, while the local model segments the new image within the ROI. Each model uses an LSTM structure to model the temporal dynamics of the motion and appearance, respectively. To circumvent the dependency of the training data between the two models, we use an iterative update strategy. Once the models are trained, there is no need to refine them to track specific objects, making our method efficient compared to online learning approaches. We demonstrate our real-time pixel-wise object tracking framework on a challenging VOT…
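The two-stage loop described in the abstract (a global model proposes an ROI from past segmentation maps, then a local model segments the new frame inside that ROI) can be illustrated with a minimal sketch. This is not the authors' implementation: both learned LSTM models are replaced here by hand-written stand-ins (a padded bounding box and an intensity threshold), and the names `predict_roi`, `segment_in_roi`, `track`, `margin`, and `threshold` are hypothetical, chosen only to show how data flows between the two stages.

```python
import numpy as np


def predict_roi(past_masks, margin=4):
    """Stand-in for the global attention model (assumption, not an LSTM).

    Returns (y0, y1, x0, x1): the bounding box of the most recent mask,
    padded by `margin` pixels and clipped to the frame.
    """
    mask = past_masks[-1]
    h, w = mask.shape
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        # Object lost in the last mask: fall back to searching the full frame.
        return 0, h, 0, w
    y0 = max(int(ys.min()) - margin, 0)
    y1 = min(int(ys.max()) + 1 + margin, h)
    x0 = max(int(xs.min()) - margin, 0)
    x1 = min(int(xs.max()) + 1 + margin, w)
    return y0, y1, x0, x1


def segment_in_roi(frame, roi, threshold=0.5):
    """Stand-in for the local segmentation model (assumption, not an LSTM).

    Labels a pixel as object if it lies inside the ROI and its intensity
    exceeds `threshold`; everything outside the ROI is background.
    """
    y0, y1, x0, x1 = roi
    mask = np.zeros(frame.shape, dtype=bool)
    mask[y0:y1, x0:x1] = frame[y0:y1, x0:x1] > threshold
    return mask


def track(frames, initial_mask):
    """Run the global-then-local loop over a sequence of 2-D frames."""
    masks = [initial_mask.astype(bool)]
    for frame in frames:
        roi = predict_roi(masks)                  # global stage: where to look
        masks.append(segment_in_roi(frame, roi))  # local stage: pixel labels
    return masks[1:]  # one mask per input frame
```

Restricting segmentation to the ROI is what lets the sketch ignore bright clutter elsewhere in the frame; in the paper that role is played by the learned global attention model rather than a fixed padding rule.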
Taxonomy
Topics: Video Surveillance and Tracking Methods · Face recognition and analysis · Visual Attention and Saliency Detection
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
