DSRRTracker: Dynamic Search Region Refinement for Attention-based Siamese Multi-Object Tracking
JiaXu Wan, Hong Zhang, Jin Zhang, Yuan Ding, Yifan Yang, Yan Li, and Xuliang Li

TL;DR
This paper introduces DSRRTracker, an end-to-end multi-object tracking method that dynamically refines search regions and employs attention mechanisms, achieving state-of-the-art results efficiently.
Contribution
It proposes a novel dynamic search region refinement module inspired by Gaussian filtering and a lightweight attention-based tracking head for improved association.
Findings
Achieves state-of-the-art performance on MOT17 and MOT20 datasets.
Operates with reasonable computational speed.
Demonstrates effectiveness through extensive experiments and ablation studies.
Abstract
Many multi-object tracking (MOT) methods follow the framework of "tracking by detection", which associates the target objects-of-interest based on the detection results. However, due to the separate models for detection and association, the tracking results are not optimal. Moreover, the speed is limited by some cumbersome association methods to achieve high tracking performance. In this work, we propose an end-to-end MOT method, with a Gaussian filter-inspired dynamic search region refinement module to dynamically filter and refine the search region by considering both the template information from the past frames and the detection results from the current frame with little computational burden, and a lightweight attention-based tracking head to achieve the effective fine-grained instance association. Extensive experiments and ablation study on MOT17 and MOT20 datasets demonstrate that…
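The abstract does not give the details of the refinement module, so the following is only a rough illustration of the general idea it names: weighting current-frame detection scores by a Gaussian prior centred on where the target was in past frames. It is not the authors' implementation. The function names, the grid representation, and the `sigma` and `threshold` parameters are all assumptions made for this sketch.

```python
import math


def gaussian_mask(height, width, center, sigma):
    """Return a height x width grid of Gaussian weights peaked at `center` (row, col)."""
    cy, cx = center
    return [
        [
            math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]


def refine_search_region(detection_scores, template_center, sigma=1.5, threshold=0.3):
    """Hypothetical refinement step (an assumption, not the paper's module).

    Multiplies per-cell detection scores from the current frame by a Gaussian
    prior around the target's previous position, then keeps the cells whose
    weighted score is at least `threshold`.
    """
    height = len(detection_scores)
    width = len(detection_scores[0])
    prior = gaussian_mask(height, width, template_center, sigma)
    refined = [
        [detection_scores[y][x] * prior[y][x] for x in range(width)]
        for y in range(height)
    ]
    keep = [
        (y, x)
        for y in range(height)
        for x in range(width)
        if refined[y][x] >= threshold
    ]
    return refined, keep


# Two equally confident detections; only the one near the previous
# target position (row 2, col 2) should survive the refinement.
scores = [[0.0] * 5 for _ in range(5)]
scores[2][2] = 0.9
scores[0][4] = 0.9
refined, keep = refine_search_region(scores, (2, 2), sigma=1.0, threshold=0.3)
```

In this toy setup the distant detection is suppressed because the prior decays with squared distance from the previous position, which is one plausible way a search region could be narrowed cheaply before association.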
Taxonomy
Topics: Video Surveillance and Tracking Methods · Advanced Chemical Sensor Technologies · Fire Detection and Safety Systems
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
