Multiple Convolutional Features in Siamese Networks for Object Tracking
Zhenxi Li, Guillaume-Alexandre Bilodeau, Wassim Bouachir

TL;DR
The paper introduces MFST, a novel Siamese network-based object tracker that fuses hierarchical features from multiple CNN layers and models to improve tracking accuracy and robustness.
Contribution
The paper proposes fusing feature maps from several convolutional layers in a Siamese tracker, rather than relying on the last layer alone, to obtain a richer target representation.
Findings
MFST outperforms standard Siamese trackers on tracking benchmarks
It uses hierarchical feature maps for a richer object representation
It handles variations in target appearance effectively
Abstract
Siamese trackers have demonstrated high performance in object tracking due to their balance between accuracy and speed. Unlike classification-based CNNs, deep similarity networks are specifically designed to address the image similarity problem, and thus are inherently more appropriate for the tracking task. However, Siamese trackers mainly use the last convolutional layers for similarity analysis and target search, which restricts their performance. In this paper, we argue that using a single convolutional layer as the feature representation is not an optimal choice in a deep similarity framework. We present the Multiple Features-Siamese Tracker (MFST), a novel tracking algorithm exploiting several hierarchical feature maps for robust tracking. Since convolutional layers provide several abstraction levels in characterizing an object, fusing hierarchical features makes it possible to obtain a richer and more…
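The idea of matching on several layers instead of one can be illustrated with a small sketch. It is not the MFST implementation: the abstract is truncated before it describes how features are fused, so the fusion rule here (min-max normalizing each layer's response map and taking a weighted average) is an assumption, as are the function names `cross_correlate`, `normalize`, `fuse_responses`, and `argmax2d`. Feature maps are plain nested lists shaped [channels][height][width], and all layers are assumed to yield response maps of the same size.

```python
def cross_correlate(search, exemplar):
    """Slide the exemplar features over the search features (valid mode),
    summing products over all channels, as in a Siamese similarity map."""
    channels = len(exemplar)
    eh, ew = len(exemplar[0]), len(exemplar[0][0])
    sh, sw = len(search[0]), len(search[0][0])
    response = []
    for y in range(sh - eh + 1):
        row = []
        for x in range(sw - ew + 1):
            score = 0.0
            for c in range(channels):
                for i in range(eh):
                    for j in range(ew):
                        score += search[c][y + i][x + j] * exemplar[c][i][j]
            row.append(score)
        response.append(row)
    return response


def normalize(response):
    """Min-max scale a response map to [0, 1] so layers are comparable."""
    values = [v for row in response for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        return [[0.0 for _ in row] for row in response]
    return [[(v - lo) / (hi - lo) for v in row] for row in response]


def fuse_responses(responses, weights=None):
    """Weighted average of normalized per-layer response maps.
    ASSUMPTION: this fusion rule is illustrative, not taken from the paper."""
    if weights is None:
        weights = [1.0 / len(responses)] * len(responses)
    height, width = len(responses[0]), len(responses[0][0])
    fused = [[0.0] * width for _ in range(height)]
    for response, weight in zip(responses, weights):
        scaled = normalize(response)
        for y in range(height):
            for x in range(width):
                fused[y][x] += weight * scaled[y][x]
    return fused


def argmax2d(response):
    """Return (row, col) of the highest score, i.e. the predicted offset."""
    best = (0, 0)
    for y, row in enumerate(response):
        for x, value in enumerate(row):
            if value > response[best[0]][best[1]]:
                best = (y, x)
    return best
```

To use it, compute one response map per layer with `cross_correlate`, pass the list of maps to `fuse_responses`, and read the target offset from `argmax2d`. A real tracker would take the features from a trained network and would have to resize the per-layer maps to a common resolution before fusing them.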
Taxonomy
Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Face recognition and analysis
