Category-Level Metric Scale Object Shape and Pose Estimation
Taeyeop Lee, Byeong-Uk Lee, Myungchul Kim, In So Kweon

TL;DR
This paper introduces a framework that jointly estimates metric scale object shape and pose from a single RGB image. It targets a gap in existing 3D shape estimation methods, which ignore metric scale and therefore cannot recover an object's accurate location and orientation.
Contribution
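It proposes a two-branch approach: a Metric Scale Object Shape (MSOS) branch that estimates the metric scale shape in camera coordinates, and a Normalized Object Coordinate Space (NOCS) branch that predicts a NOCS map and aligns it, through a similarity transformation, with the depth map rendered from the predicted metric scale mesh.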
Findings
Effective in estimating shape and pose on synthetic datasets
Accurate metric scale and position in real-world scenarios
Outperforms existing category-level pose estimation methods
Abstract
Advances in deep learning recognition have led to accurate object detection with 2D images. However, these 2D perception methods are insufficient for complete 3D world information. Concurrently, advanced 3D shape estimation approaches focus on the shape itself, without considering metric scale. These methods cannot determine the accurate location and orientation of objects. To tackle this problem, we propose a framework that jointly estimates a metric scale shape and pose from a single RGB image. Our framework has two branches: the Metric Scale Object Shape branch (MSOS) and the Normalized Object Coordinate Space branch (NOCS). The MSOS branch estimates the metric scale shape observed in the camera coordinates. The NOCS branch predicts the normalized object coordinate space (NOCS) map and performs similarity transformation with the rendered depth map from a predicted metric scale mesh…
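The similarity transformation mentioned in the abstract can be illustrated with a short sketch. The abstract does not say which solver the authors use, so the code below assumes the standard Umeyama least-squares alignment, a common choice for registering NOCS coordinates to camera-space points. The function name `umeyama_similarity` and the input arrays are hypothetical: `src` stands for NOCS coordinates of object pixels, and `dst` for the matching 3D points back-projected from the rendered depth map.

```python
import numpy as np


def umeyama_similarity(src, dst):
    """Least-squares scale s, rotation R, translation t with dst ~= s * R @ src + t.

    src, dst: (N, 3) arrays of corresponding points.
    Assumption: this is the Umeyama (1991) closed-form solution, used here
    for illustration only; the paper's actual solver is not stated in the abstract.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    n = src.shape[0]

    # Center both point sets.
    mu_src = src.mean(axis=0)
    mu_dst = dst.mean(axis=0)
    src_c = src - mu_src
    dst_c = dst - mu_dst

    # Cross-covariance between the target and source points.
    cov = dst_c.T @ src_c / n
    U, D, Vt = np.linalg.svd(cov)

    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0

    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / n
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * (R @ mu_src)
    return s, R, t


if __name__ == "__main__":
    # Synthetic check: points in a unit cube, moved by a known similarity transform.
    rng = np.random.default_rng(0)
    nocs_points = rng.uniform(-0.5, 0.5, size=(100, 3))
    angle = 0.3
    R_true = np.array([
        [np.cos(angle), -np.sin(angle), 0.0],
        [np.sin(angle), np.cos(angle), 0.0],
        [0.0, 0.0, 1.0],
    ])
    camera_points = 0.25 * nocs_points @ R_true.T + np.array([0.1, -0.2, 1.5])
    s, R, t = umeyama_similarity(nocs_points, camera_points)
    print("scale:", s)
    print("translation:", t)
```

In this reading, the recovered scale gives the object's metric size relative to the normalized space, and the rotation and translation give its orientation and position in camera coordinates. With noisy correspondences, a robust wrapper such as RANSAC around this solver is a typical addition, though whether the paper uses one is not stated here.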
