Model-free Tracking with Deep Appearance and Motion Features Integration
Xiaolong Jiang, Peizhao Li, Xiantong Zhen, Xianbin Cao

TL;DR
This paper introduces AMNet, a real-time, model-free object tracking framework that integrates appearance and motion features using deep CNNs, achieving high accuracy without prior object information.
Contribution
The paper presents a novel end-to-end two-stream CNN architecture that combines appearance and motion cues to track anonymous objects, advancing model-free tracking methods.
Findings
Achieves state-of-the-art performance on OTB and VOT benchmarks.
Runs at real-time speed.
Effectively tracks anonymous objects without prior object models.
Abstract
Being able to track an anonymous object, a model-free tracker is comprehensively applicable regardless of the target type. However, designing such a generalized framework is challenged by the lack of object-oriented prior information. As one solution, a real-time model-free object tracking approach is designed in this work relying on Convolutional Neural Networks (CNNs). To overcome the object-centric information scarcity, both appearance and motion features are deeply integrated by the proposed AMNet, which is an end-to-end offline trained two-stream network. Between the two parallel streams, the ANet investigates appearance features with a multi-scale Siamese atrous CNN, enabling the tracking-by-matching strategy. The MNet achieves deep motion detection to localize anonymous moving objects by processing generic motion features. The final tracking result at each frame is generated by…
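The tracking-by-matching idea mentioned for the ANet can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes feature maps have already been extracted by some embedding network, and it leaves out the atrous convolutions, the multi-scale search, the MNet motion stream, and the final fusion step. The function names `match_response` and `locate` are hypothetical, chosen only for this illustration. The sketch slides a template feature map over a larger search feature map and takes the position with the highest similarity.

```python
import numpy as np

def match_response(template, search):
    """Cross-correlate a template (C, h, w) with a search region (C, H, W).

    Returns a (H - h + 1, W - w + 1) map whose entries are the inner product
    between the template and the search window at each offset.
    """
    _, h, w = template.shape
    _, H, W = search.shape
    response = np.empty((H - h + 1, W - w + 1))
    for y in range(response.shape[0]):
        for x in range(response.shape[1]):
            window = search[:, y:y + h, x:x + w]
            response[y, x] = np.sum(template * window)
    return response

def locate(template, search):
    """Return the (row, col) offset of the best-matching window."""
    response = match_response(template, search)
    y, x = np.unravel_index(np.argmax(response), response.shape)
    return int(y), int(x)

# Toy example: plant the template in an otherwise empty search region.
template = np.ones((2, 3, 3))
search = np.zeros((2, 10, 12))
search[:, 3:6, 5:8] = template
print(locate(template, search))  # (3, 5)
```

In a Siamese tracker the template and search features come from the same network with shared weights, so the response map compares the two in a common embedding space. The plain inner product used here is only one possible similarity measure.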
Taxonomy
Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Face recognition and analysis
