S.T.A.R.-Track: Latent Motion Models for End-to-End 3D Object Tracking with Adaptive Spatio-Temporal Appearance Representations
Simon Doll, Niklas Hanselmann, Lukas Schneider, Richard Schulz, Markus Enzweiler, Hendrik P.A. Lensch

TL;DR
This paper presents S.T.A.R.-Track, a transformer-based 3D object tracking framework that uses a novel latent motion model and learnable track embedding to improve accuracy and reduce identity switches, achieving state-of-the-art results.
Contribution
Introduction of a latent motion model and learnable track embedding for end-to-end 3D tracking, enhancing robustness to appearance changes and integrating seamlessly with query-based detectors.
Findings
State-of-the-art performance on the nuScenes benchmark.
Significant reduction in identity switches.
Effective modeling of appearance and geometric motion.
Abstract
Following the tracking-by-attention paradigm, this paper introduces an object-centric, transformer-based framework for tracking in 3D. Traditional model-based tracking approaches incorporate the geometric effect of object- and ego motion between frames with a geometric motion model. Inspired by this, we propose S.T.A.R.-Track, which uses a novel latent motion model (LMM) to additionally adjust object queries to account for changes in viewing direction and lighting conditions directly in the latent space, while still modeling the geometric motion explicitly. Combined with a novel learnable track embedding that aids in modeling the existence probability of tracks, this results in a generic tracking framework that can be integrated with any query-based detector. Extensive experiments on the nuScenes benchmark demonstrate the benefits of our approach, showing state-of-the-art performance…
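To make the split between explicit geometric motion and a latent adjustment concrete, here is a minimal sketch in plain Python. It is not the paper's implementation: the function names, the 2D bird's-eye-view simplification, the constant-velocity assumption, and the residual linear map standing in for the latent motion model are all illustrative assumptions.

```python
import math


def geometric_update(center, velocity, dt, ego_translation, ego_yaw):
    """Predict an object's center in the current ego frame (2D sketch).

    center, velocity: (x, y) expressed in the previous ego frame.
    ego_translation:  origin of the current ego frame, in the previous frame.
    ego_yaw:          rotation of the current frame relative to the previous
                      one, in radians (counter-clockwise positive).
    """
    # Object motion: constant-velocity prediction (an assumption of this sketch).
    px = center[0] + velocity[0] * dt
    py = center[1] + velocity[1] * dt
    # Ego motion: p_cur = R(ego_yaw)^T * (p_prev - t).
    dx, dy = px - ego_translation[0], py - ego_translation[1]
    c, s = math.cos(ego_yaw), math.sin(ego_yaw)
    return (c * dx + s * dy, -s * dx + c * dy)


def latent_update(query, pose_change, weight, bias):
    """Hypothetical stand-in for a latent motion model.

    Adds a linear function of the pose change to the query features:
    query + weight @ pose_change + bias. In a trained model the weights
    would be learned; here they are passed in by hand.
    """
    return [
        q + sum(w * p for w, p in zip(row, pose_change)) + b
        for q, row, b in zip(query, weight, bias)
    ]


# Usage: an object 10 m ahead moving at 1 m/s while the ego vehicle
# advances 2 m without turning over 0.5 s.
new_center = geometric_update((10.0, 0.0), (1.0, 0.0), 0.5, (2.0, 0.0), 0.0)

# Usage: adjust a 2-dimensional query with a 3-dimensional pose change.
new_query = latent_update(
    [1.0, 2.0],
    [0.5, 0.0, 0.1],
    [[1.0, 0.0, 0.0], [0.0, 0.0, 10.0]],
    [0.0, 0.5],
)
```

The design point this illustrates is that the box position is moved by a known, closed-form transform, while the appearance features receive a separate update conditioned on the same pose change.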
Taxonomy
Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Face recognition and analysis
