Multi-Camera Multiple 3D Object Tracking on the Move for Autonomous Vehicles
Pha Nguyen, Kha Gia Quach, Chi Nhan Duong, Ngan Le, Xuan-Bac Nguyen, Khoa Luu

TL;DR
This paper introduces a novel global association graph model with link prediction for multi-camera 3D object tracking in autonomous vehicles, improving consistency and accuracy across multiple views and enhancing detection performance.
Contribution
It proposes a new approach combining cross-attention motion modeling and appearance re-identification to address view inconsistency in 3D object tracking.
Findings
Achieves state-of-the-art performance on the nuScenes dataset
Improves detection accuracy of standard 3D detectors
Effectively links tracklets across multiple camera views
Abstract
The development of autonomous vehicles provides an opportunity to have a complete set of camera sensors capturing the environment around the car. Thus, it is important for object detection and tracking to address new challenges, such as achieving consistent results across camera views. To address these challenges, this work presents a new Global Association Graph Model with Link Prediction approach that predicts the locations of existing tracklets and links detections with tracklets via cross-attention motion modeling and appearance re-identification. This approach aims to solve issues caused by inconsistent 3D object detection. Moreover, our model can be used to improve the detection accuracy of a standard 3D object detector in the nuScenes detection challenge. The experimental results on the nuScenes dataset demonstrate that the proposed method produces SOTA performance on the…
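To make the idea of linking detections with tracklets through motion and appearance cues more concrete, here is a minimal sketch. It is not the paper's Global Association Graph Model: the learned link prediction and cross-attention motion model are replaced by a hand-written score and a greedy matcher. The dictionary keys (`predicted_center`, `center`, `embedding`), the function names, and the parameters `motion_weight`, `distance_scale`, and `min_score` are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def link_score(tracklet, detection, motion_weight=0.5, distance_scale=2.0):
    """Hypothetical link score mixing a motion cue and an appearance cue.

    Assumption: the tracklet already carries a predicted 3D center for the
    current frame, and both sides carry a re-identification embedding.
    """
    # Motion cue: decays with the distance between the predicted tracklet
    # position and the detection center (same global coordinate frame).
    dist = math.dist(tracklet["predicted_center"], detection["center"])
    motion_affinity = math.exp(-dist / distance_scale)
    # Appearance cue: cosine similarity of embeddings, clipped at zero.
    appearance_affinity = max(
        0.0, cosine_similarity(tracklet["embedding"], detection["embedding"])
    )
    return motion_weight * motion_affinity + (1.0 - motion_weight) * appearance_affinity


def associate(tracklets, detections, min_score=0.5):
    """Greedy one-to-one matching, a stand-in for learned link prediction.

    Returns a dict mapping tracklet index -> detection index.
    """
    candidates = [
        (link_score(t, d), ti, di)
        for ti, t in enumerate(tracklets)
        for di, d in enumerate(detections)
    ]
    candidates.sort(reverse=True)  # strongest candidate links first
    links, used_detections = {}, set()
    for score, ti, di in candidates:
        if score < min_score:
            break  # every remaining candidate is weaker still
        if ti in links or di in used_detections:
            continue  # keep the assignment one-to-one
        links[ti] = di
        used_detections.add(di)
    return links
```

Because the detection centers are assumed to be expressed in one shared vehicle or world frame, detections of the same object from different cameras compete for the same tracklet, which is one simple way to keep identities consistent across views. Detections left unmatched would typically start new tracklets.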
Taxonomy
Topics: Advanced Neural Network Applications · Video Surveillance and Tracking Methods · Visual Attention and Saliency Detection
