Point2Pose: Occlusion-Recovering 6D Pose Tracking and 3D Reconstruction for Multiple Unknown Objects Via 2D Point Trackers

Tzu-Yuan Lin; Ho Jae Lee; Kevin Doherty; Yonghyeon Lee; Sangbae Kim

arXiv:2604.10415·cs.CV·April 14, 2026

TL;DR

Point2Pose is a model-free approach for 6D pose tracking and 3D reconstruction of multiple rigid objects from monocular RGB-D video. It recovers from occlusion and tracks unseen objects without CAD models or category priors.

Contribution

It introduces a novel multi-object tracking method that recovers from occlusion and reconstructs objects online without requiring object CAD models or category priors.

Findings

01

Achieves performance comparable to state-of-the-art methods on a severe-occlusion benchmark.

02

Supports tracking of multiple unseen objects without prior models.

03

Recovers object pose instantly after complete occlusion, using long-range correspondences from a 2D point tracker.

Abstract

We present Point2Pose, a model-free method for causal 6D pose tracking of multiple rigid objects from monocular RGB-D video. Initialized only from sparse image points on the objects to be tracked, our approach tracks multiple unseen objects without requiring object CAD models or category priors. Point2Pose leverages a 2D point tracker to obtain long-range correspondences, enabling instant recovery after complete occlusion. Simultaneously, the system incrementally reconstructs an online Truncated Signed Distance Function (TSDF) representation of the tracked targets. Alongside the method, we introduce a new multi-object tracking dataset comprising both simulation and real-world sequences, with motion-capture ground truth for evaluation. Experiments show that Point2Pose achieves performance comparable to the state-of-the-art methods on a severe-occlusion benchmark, while additionally…
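
To make the idea of turning 2D point tracks into a 6D pose concrete, here is a minimal sketch. It is not the authors' implementation. It assumes a pinhole camera with intrinsics matrix `K`, a valid depth value for every tracked pixel, and a plain least-squares rigid alignment (the Kabsch algorithm) between the back-projected points of two frames. The function names `backproject` and `rigid_transform` are hypothetical and chosen for illustration only.

```python
import numpy as np


def backproject(uv, depth, K):
    """Lift pixels (N, 2) with depths (N,) to camera-frame 3D points (N, 3).

    Assumes a pinhole model with intrinsics K (3x3); this is an
    illustrative assumption, not a detail taken from the paper.
    """
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    x = (uv[:, 0] - cx) * depth / fx
    y = (uv[:, 1] - cy) * depth / fy
    return np.stack([x, y, depth], axis=1)


def rigid_transform(P, Q):
    """Least-squares rotation R and translation t such that Q ~ R @ P + t.

    P and Q are (N, 3) arrays of corresponding points (N >= 3, not all
    collinear). This is the standard Kabsch algorithm.
    """
    p0, q0 = P.mean(axis=0), Q.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (P - p0).T @ (Q - q0)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = q0 - R @ p0
    return R, t
```

In this sketch, the tracked pixels of a reference frame and of the current frame would each be passed through `backproject`, and `rigid_transform` would then give the relative object pose between the two frames. Because the correspondences come from the tracker rather than from frame-to-frame propagation, the same alignment can be computed as soon as the points reappear after an occlusion. A practical system would also need outlier rejection, which is omitted here.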
