3D Ground Truth Reconstruction from Multi-Camera Annotations Using UKF
Linh Van Ma, Unse Fatima, Tepy Sokun Chriv, Haroon Imran, Moongu Jeon

TL;DR
This paper presents a UKF-based method that reconstructs accurate 3D object ground truth from multi-camera 2D annotations, improving 3D localization and shape estimation for autonomous systems.
Contribution
The paper introduces a UKF-based fusion approach that converts 2D annotations into full 3D object shapes, handles occlusions, and offers a scalable, automatic solution.
Findings
High accuracy in 3D localization on the CMC, Wildtrack, and Panoptic datasets
Full 3D shape reconstruction of objects
Effective occlusion handling in multi-camera setups
Abstract
Accurate 3D ground truth estimation is critical for applications such as autonomous navigation, surveillance, and robotics. This paper introduces a novel method that uses an Unscented Kalman Filter (UKF) to fuse 2D bounding box or pose keypoint ground truth annotations from multiple calibrated cameras into accurate 3D ground truth. Starting from human-annotated 2D ground truth, the proposed method, a multi-camera single-object tracking algorithm, transforms 2D image coordinates into robust 3D world coordinates through homography-based projection and UKF-based fusion. The algorithm processes multi-view data to estimate object positions and shapes while handling challenges such as occlusion. We evaluate the method on the CMC, Wildtrack, and Panoptic datasets, demonstrating high accuracy in 3D localization compared to the available 3D ground truth. Unlike existing…
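The fusion idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a constant-velocity state [x, y, vx, vy] on the ground plane, one known 3x3 ground-to-image homography per camera, and a single annotated pixel per view (for example the bottom centre of a bounding box). It estimates ground position only, not object shape. All function names, noise values (q, r), and the sigma-point scaling (kappa) are hypothetical choices made for illustration.

```python
import numpy as np

def ground_to_pixel(H, xy):
    """Project a ground-plane point (x, y) to pixel (u, v) with homography H."""
    p = H @ np.array([xy[0], xy[1], 1.0])
    return p[:2] / p[2]

def pixel_to_ground(H, uv):
    """Back-project pixel (u, v) to the ground plane by inverting H."""
    p = np.linalg.solve(H, np.array([uv[0], uv[1], 1.0]))
    return p[:2] / p[2]

def sigma_points(mean, cov, kappa=1.0):
    """Basic 2n+1 sigma points and weights (assumed scaling, kappa > 0)."""
    n = mean.size
    L = np.linalg.cholesky((n + kappa) * cov)  # L @ L.T = (n + kappa) * cov
    pts = np.vstack([mean, mean + L.T, mean - L.T])  # rows of L.T are columns of L
    w = np.full(2 * n + 1, 0.5 / (n + kappa))
    w[0] = kappa / (n + kappa)
    return pts, w

def predict(mean, cov, dt, q):
    """Propagate sigma points through the constant-velocity motion model."""
    F = np.eye(4)
    F[0, 2] = F[1, 3] = dt
    pts, w = sigma_points(mean, cov)
    pts = pts @ F.T
    m = w @ pts
    d = pts - m
    return m, d.T @ (w[:, None] * d) + q * np.eye(4)

def update(mean, cov, z, H, r):
    """Fuse one camera's pixel annotation z via the nonlinear projection."""
    pts, w = sigma_points(mean, cov)
    Z = np.array([ground_to_pixel(H, p[:2]) for p in pts])
    z_hat = w @ Z
    dz, dx = Z - z_hat, pts - mean
    S = dz.T @ (w[:, None] * dz) + r * np.eye(2)  # innovation covariance
    C = dx.T @ (w[:, None] * dz)                  # state-measurement cross covariance
    K = C @ np.linalg.inv(S)
    P = cov - K @ S @ K.T
    return mean + K @ (z - z_hat), 0.5 * (P + P.T)  # symmetrise for Cholesky

def track(frames, homographies, dt=0.1, q=1e-3, r=1.0):
    """frames: list of {camera_id: (u, v) or None}; None marks an occluded view."""
    # Initialise position from the first available annotation in the first frame.
    cam, uv = next((c, z) for c, z in frames[0].items() if z is not None)
    mean = np.array([*pixel_to_ground(homographies[cam], uv), 0.0, 0.0])
    cov = np.eye(4)
    estimates = []
    for obs in frames:
        mean, cov = predict(mean, cov, dt, q)
        for cam, uv in obs.items():
            if uv is None:  # occluded or unannotated view: skip this camera
                continue
            mean, cov = update(mean, cov, np.asarray(uv, float), homographies[cam], r)
        estimates.append(mean.copy())
    return np.array(estimates)
```

In this sketch, occlusion is handled by skipping the update for any camera without an annotation in a frame, so the estimate is carried by the motion model and the remaining views. The cameras are fused sequentially, one update per view, which is one common design choice; stacking all views into a single measurement vector would be another.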
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques
