CenterSnap: Single-Shot Multi-Object 3D Shape Reconstruction and Categorical 6D Pose and Size Estimation
Muhammad Zubair Irshad, Thomas Kollar, Michael Laskey, Kevin Stone, Zsolt Kira

TL;DR
CenterSnap introduces a real-time, single-shot method for multi-object 3D shape reconstruction and 6D pose and size estimation from a single RGB-D image. It outperforms existing multi-stage approaches, especially in cluttered multi-object scenes with occlusion.
Contribution
The paper proposes a novel one-stage, bounding-box free approach that jointly predicts 3D shape, pose, and size for multiple objects in a single forward pass.
Findings
Achieves real-time processing at 40 FPS.
Outperforms existing methods with a 12.6% absolute improvement in mAP for 6D pose on novel real-world object instances.
Effectively handles occlusions and complex multi-object scenes.
Abstract
This paper studies the complex task of simultaneous multi-object 3D reconstruction, 6D pose and size estimation from a single-view RGB-D observation. In contrast to instance-level pose estimation, we focus on a more challenging problem where CAD models are not available at inference time. Existing approaches mainly follow a complex multi-stage pipeline which first localizes and detects each object instance in the image and then regresses to either their 3D meshes or 6D poses. These approaches suffer from high-computational cost and low performance in complex multi-object scenarios, where occlusions can be present. Hence, we present a simple one-stage approach to predict both the 3D shape and estimate the 6D pose and size jointly in a bounding-box free manner. In particular, our method treats object instances as spatial centers where each center denotes the complete shape of an object…
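The "objects as spatial centers" idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a network has already produced a 2D center heatmap and a per-pixel output map, and the names `find_centers`, `gather_per_center`, and the `threshold` parameter are hypothetical. It shows only how object instances could be read off as heatmap peaks without bounding boxes, and how one vector per object could then be gathered at each peak.

```python
import numpy as np


def find_centers(heatmap, threshold=0.5):
    """Return (row, col) pixels that are 3x3 local maxima at or above threshold.

    Assumes `heatmap` is a 2D float array. Ties inside a flat plateau
    are all reported as peaks.
    """
    h, w = heatmap.shape
    # Pad with -inf so border pixels are compared only against real neighbors.
    padded = np.pad(heatmap, 1, mode="constant", constant_values=-np.inf)
    # Stack the nine shifted views and take the per-pixel neighborhood maximum.
    neighborhood = np.stack([
        padded[dr:dr + h, dc:dc + w]
        for dr in range(3)
        for dc in range(3)
    ])
    local_max = neighborhood.max(axis=0)
    peaks = (heatmap == local_max) & (heatmap >= threshold)
    return [(int(r), int(c)) for r, c in np.argwhere(peaks)]


def gather_per_center(feature_map, centers):
    """Read the C-dimensional vector stored at each center of an (H, W, C) map."""
    if not centers:
        return np.empty((0, feature_map.shape[-1]), dtype=feature_map.dtype)
    return np.stack([feature_map[r, c] for r, c in centers])


if __name__ == "__main__":
    # Toy heatmap with two objects and one weak response next to the first.
    heatmap = np.zeros((5, 5), dtype=np.float64)
    heatmap[1, 1] = 0.9
    heatmap[3, 3] = 0.8
    heatmap[1, 2] = 0.4
    # Toy per-pixel output map with two channels (hypothetical contents).
    features = np.arange(5 * 5 * 2).reshape(5, 5, 2)

    centers = find_centers(heatmap)
    print(centers)
    print(gather_per_center(features, centers))
```

In this sketch every detected peak yields exactly one output vector, so the number of objects does not have to be known in advance and no per-object cropping step is needed.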
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Vision and Imaging · 3D Shape Modeling and Analysis
