Spatial-Temporal Relation Networks for Multi-Object Tracking
Jiarui Xu, Yue Cao, Zheng Zhang, Han Hu

TL;DR
This paper introduces Spatial-Temporal Relation Networks (STRN), a unified, end-to-end framework that encodes appearance, location, and topology cues in a single similarity measure for multi-object tracking (MOT) and reports state-of-the-art accuracy on the MOT15-17 benchmarks.
Contribution
The paper proposes a novel unified framework for similarity measurement in MOT that encodes appearance, location, and topology cues simultaneously and performs reasoning across spatial and temporal domains.
Findings
STRN achieves state-of-the-art accuracy on the MOT15-17 benchmarks.
The unified framework encodes appearance, location, and topology cues in a single network.
End-to-end training improves tracking performance.
Abstract
Recent progress in multiple object tracking (MOT) has shown that a robust similarity score is key to the success of trackers. A good similarity score is expected to reflect multiple cues, e.g. appearance, location, and topology, over a long period of time. However, these cues are heterogeneous, making them hard to be combined in a unified network. As a result, existing methods usually encode them in separate networks or require a complex training approach. In this paper, we present a unified framework for similarity measurement which could simultaneously encode various cues and perform reasoning across both spatial and temporal domains. We also study the feature representation of a tracklet-object pair in depth, showing a proper design of the pair features can well empower the trackers. The resulting approach is named spatial-temporal relation networks (STRN). It runs in a feed-forward…
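To make the idea of a multi-cue similarity score concrete, here is a minimal sketch of the hand-crafted kind of combination that the abstract contrasts with a learned, unified network. It is not STRN and not the authors' code: the function names, the dictionary layout, the weights `w_app` and `w_loc`, the threshold `min_score`, and the greedy matching step are all assumptions chosen for illustration, and the topology cue is left out entirely.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def similarity(tracklet, detection, w_app=0.5, w_loc=0.5):
    """Hypothetical fixed-weight mix of an appearance cue and a location cue."""
    appearance = cosine_similarity(tracklet["feat"], detection["feat"])
    location = iou(tracklet["box"], detection["box"])
    return w_app * appearance + w_loc * location


def greedy_match(tracklets, detections, min_score=0.3):
    """Pair tracklets with detections, highest similarity first."""
    scored = sorted(
        (
            (similarity(t, d), ti, di)
            for ti, t in enumerate(tracklets)
            for di, d in enumerate(detections)
        ),
        reverse=True,
    )
    used_tracklets, used_detections, matches = set(), set(), []
    for score, ti, di in scored:
        if score < min_score:
            break  # remaining pairs are too dissimilar to associate
        if ti in used_tracklets or di in used_detections:
            continue
        matches.append((ti, di))
        used_tracklets.add(ti)
        used_detections.add(di)
    return sorted(matches)


# Toy data: two existing tracklets and two detections in the next frame.
tracklets = [
    {"feat": [1.0, 0.0], "box": (0, 0, 10, 10)},
    {"feat": [0.0, 1.0], "box": (20, 20, 30, 30)},
]
detections = [
    {"feat": [0.1, 0.9], "box": (21, 20, 31, 30)},
    {"feat": [0.9, 0.1], "box": (1, 0, 11, 10)},
]
matches = greedy_match(tracklets, detections)
print(matches)
```

In this toy setup each tracklet is paired with the detection that both looks similar and overlaps its last box. The fixed weights are the weak point: they have to be tuned by hand for each cue, which is the kind of difficulty with heterogeneous cues that the abstract gives as motivation for learning the similarity in one network.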
Taxonomy
Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Human-Animal Interaction Studies
