Modeling and Propagating CNNs in a Tree Structure for Visual Tracking
Hyeonseob Nam, Mooyeol Baek, Bohyung Han

TL;DR
This paper introduces an online visual tracking method that uses a tree-structured ensemble of CNNs to handle multi-modality in target appearances, improving robustness and efficiency in tracking tasks.
Contribution
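Proposes a tree-structured ensemble of CNNs for online tracking. Distinct branches of the tree capture multi-modal target appearances, updates along tree paths keep the models reliable, and sharing all convolutional-layer parameters across the CNNs saves memory and avoids redundant network evaluations.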
Findings
Outperforms state-of-the-art methods on benchmark datasets
Effectively manages multi-modality in target appearances
Gains the benefit of multiple models at little extra cost, because all CNNs share their convolutional-layer parameters
Abstract
We present an online visual tracking algorithm by managing multiple target appearance models in a tree structure. The proposed algorithm employs Convolutional Neural Networks (CNNs) to represent target appearances, where multiple CNNs collaborate to estimate target states and determine the desirable paths for online model updates in the tree. By maintaining multiple CNNs in diverse branches of tree structure, it is convenient to deal with multi-modality in target appearances and preserve model reliability through smooth updates along tree paths. Since multiple CNNs share all parameters in convolutional layers, it takes advantage of multiple models with little extra cost by saving memory space and avoiding redundant network evaluations. The final target state is estimated by sampling target candidates around the state in the previous frame and identifying the best sample in terms of a…
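The estimation step described in the abstract (sample candidates around the previous state, score each with several CNNs that share convolutional layers, keep the best sample) can be illustrated with a minimal sketch. It is not the authors' implementation. The state layout (x, y, scale), the Gaussian sampling, and the names sample_candidates, estimate_state, shared_features, heads, and combine are all assumptions made for illustration. The abstract is cut off before it says how the per-model scores are combined, so the sketch takes that rule as a caller-supplied function rather than guessing it.

```python
import random
from typing import Callable, List, Sequence, Tuple

# Hypothetical state layout (x, y, scale); the abstract does not define one.
State = Tuple[float, float, float]
Feature = Sequence[float]


def sample_candidates(prev: State, n: int, sigma_xy: float,
                      sigma_s: float, rng: random.Random) -> List[State]:
    """Draw n candidates by Gaussian perturbation of the previous state.

    Gaussian sampling is an assumption; the abstract only says candidates
    are sampled around the state in the previous frame.
    """
    x, y, s = prev
    return [
        (rng.gauss(x, sigma_xy),
         rng.gauss(y, sigma_xy),
         s * (1.0 + rng.gauss(0.0, sigma_s)))
        for _ in range(n)
    ]


def estimate_state(candidates: Sequence[State],
                   shared_features: Callable[[State], Feature],
                   heads: Sequence[Callable[[Feature], float]],
                   combine: Callable[[List[float]], float]
                   ) -> Tuple[State, float]:
    """Return the candidate with the highest combined score.

    shared_features stands in for the shared convolutional layers and is
    evaluated once per candidate. Each entry of heads stands in for the
    model-specific part of one CNN and reuses that single feature.
    combine is a placeholder for the score-combination rule, which the
    truncated abstract does not state.
    """
    if not candidates:
        raise ValueError("at least one candidate is required")
    best_state, best_score = candidates[0], float("-inf")
    for cand in candidates:
        feat = shared_features(cand)            # one shared evaluation
        scores = [head(feat) for head in heads]  # cheap per-model scoring
        total = combine(scores)
        if total > best_score:
            best_state, best_score = cand, total
    return best_state, best_score
```

With K heads and N candidates, this structure evaluates the shared part N times instead of N × K times, which is the saving the abstract attributes to sharing convolutional parameters. A caller would pass, for example, combine=lambda s: sum(s) / len(s) for a plain mean; that choice is only a stand-in.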
Taxonomy
Topics: Video Surveillance and Tracking Methods · Image Enhancement Techniques · Human Pose and Action Recognition
