Fusion-Poly: A Polyhedral Framework Based on Spatial-Temporal Fusion for 3D Multi-Object Tracking
Xian Wu, Yitao Wu, Xiaoyu Li, Zijia Li, Lijun Zhao, Lining Sun

TL;DR
Fusion-Poly is a spatial-temporal fusion framework for 3D multi-object tracking that uses asynchronous LiDAR and camera data, improving tracking accuracy by drawing on observations that synchronized-only pipelines leave unused.
Contribution
It presents a new framework that integrates asynchronous sensor data for continuous and robust 3D multi-object tracking, surpassing previous synchronized-only methods.
Findings
Achieves 76.5% AMOTA on nuScenes, setting a new state of the art.
Effectively utilizes asynchronous observations for higher-frequency updates.
Validates each component's contribution through extensive ablation studies.
Abstract
LiDAR-camera 3D multi-object tracking (MOT) combines rich visual semantics with accurate depth cues to improve trajectory consistency and tracking reliability. In practice, however, LiDAR and cameras operate at different sampling rates. To maintain temporal alignment, existing data pipelines usually synchronize heterogeneous sensor streams and annotate them at a reduced shared frequency, forcing most prior methods to perform spatial fusion only at synchronized timestamps through projection-based or learnable cross-sensor association. As a result, abundant asynchronous observations remain underexploited, despite their potential to support more frequent association and more robust trajectory estimation over short temporal intervals. To address this limitation, we propose Fusion-Poly, a spatial-temporal fusion framework for 3D MOT that integrates asynchronous LiDAR and camera data.…
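To make the sampling-rate mismatch described above concrete, here is a minimal Python sketch; it is not the authors' implementation. It merges two sensor timestamp streams into one time-ordered event sequence and compares the number of update opportunities with a synchronized-only schedule. The rates (20 Hz LiDAR, 12 Hz camera, 2 Hz synchronized keyframes) and all function names are illustrative assumptions, not values taken from the paper.

```python
import heapq
from typing import List, Tuple

# Illustrative rates only (assumptions, not taken from the paper).
LIDAR_HZ = 20.0
CAMERA_HZ = 12.0
SYNC_HZ = 2.0


def sensor_timestamps(rate_hz: float, duration_s: float) -> List[float]:
    """Return evenly spaced timestamps for a sensor sampled at rate_hz."""
    n = int(round(rate_hz * duration_s))
    return [i / rate_hz for i in range(n)]


def merge_streams(lidar: List[float], camera: List[float]) -> List[Tuple[float, str]]:
    """Merge two sorted timestamp lists into one time-ordered event list."""
    tagged_lidar = [(t, "lidar") for t in lidar]
    tagged_camera = [(t, "camera") for t in camera]
    # heapq.merge assumes each input is already sorted, which holds here.
    return list(heapq.merge(tagged_lidar, tagged_camera))


duration_s = 1.0
events = merge_streams(
    sensor_timestamps(LIDAR_HZ, duration_s),
    sensor_timestamps(CAMERA_HZ, duration_s),
)
sync_updates = sensor_timestamps(SYNC_HZ, duration_s)

print(f"asynchronous observations: {len(events)}")
print(f"synchronized keyframes:    {len(sync_updates)}")
```

Under these assumed rates, one second of data yields 32 asynchronous observations against 2 synchronized keyframes, which illustrates the kind of gap the abstract describes as underexploited.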
Taxonomy
Topics: Video Surveillance and Tracking Methods · Robotics and Sensor-Based Localization · Advanced Vision and Imaging
