RCGNet: RGB-based Category-Level 6D Object Pose Estimation with Geometric Guidance

Sheng Yu; Di-Hua Zhai; Yuanqing Xia

arXiv:2508.13623 · cs.CV · August 20, 2025

TL;DR

This paper introduces RCGNet, a transformer-based neural network that estimates 6D object pose from RGB images alone, using geometric guidance and RANSAC-PnP for improved accuracy and efficiency in real-world scenarios.

Contribution

The paper presents a novel RGB-only category-level pose estimation method with a geometric feature-guided algorithm and transformer architecture, eliminating the need for depth data.

Findings

01 Achieves superior accuracy over previous RGB-based methods

02 Demonstrates high efficiency on benchmark datasets

03 Effectively handles variable object scales

Abstract

While most current RGB-D-based category-level object pose estimation methods achieve strong performance, they face significant challenges in scenes lacking depth information. In this paper, we propose a novel category-level object pose estimation approach that relies solely on RGB images. This method enables accurate pose estimation in real-world scenarios without the need for depth data. Specifically, we design a transformer-based neural network for category-level object pose estimation, where the transformer is employed to predict and fuse the geometric features of the target object. To ensure that these predicted geometric features faithfully capture the object's geometry, we introduce a geometric feature-guided algorithm, which enhances the network's ability to effectively represent the object's geometric information. Finally, we utilize the RANSAC-PnP algorithm to compute the…
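The final RANSAC-PnP step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the 2D-3D correspondences below are synthetic (how RCGNet produces them is not stated in this excerpt), the inner solver is a plain linear DLT fit swapped in for brevity, and all function names, the camera matrix, the threshold, and the iteration count are hypothetical choices. In practice a library routine such as OpenCV's `cv2.solvePnPRansac` would typically be used instead.

```python
import numpy as np

def project(points_3d, R, t, K):
    """Pinhole projection of Nx3 object points to Nx2 pixel coordinates."""
    cam = points_3d @ R.T + t            # object frame -> camera frame
    uvw = cam @ K.T                      # camera frame -> homogeneous pixels
    return uvw[:, :2] / uvw[:, 2:3]

def pnp_dlt(points_3d, points_2d, K):
    """Linear (DLT) pose fit from >= 6 non-coplanar correspondences."""
    ones = np.ones((len(points_2d), 1))
    # Normalize pixels with K^-1 so the unknown 3x4 matrix is [R|t] up to scale.
    xy = (np.linalg.inv(K) @ np.hstack([points_2d, ones]).T).T[:, :2]
    X = np.hstack([points_3d, ones])     # homogeneous 3D points, Nx4
    rows = []
    for (x, y), Xh in zip(xy, X):
        rows.append(np.concatenate([Xh, np.zeros(4), -x * Xh]))
        rows.append(np.concatenate([np.zeros(4), Xh, -y * Xh]))
    _, _, vt = np.linalg.svd(np.asarray(rows))
    P = vt[-1].reshape(3, 4)             # null-space vector = scale * [R|t]
    U, S, Vt = np.linalg.svd(P[:, :3])
    R = U @ Vt                           # nearest orthogonal matrix
    t = P[:, 3] / S.mean()               # undo the unknown scale
    if np.linalg.det(R) < 0:             # resolve the overall sign ambiguity
        R, t = -R, -t
    return R, t

def ransac_pnp(points_3d, points_2d, K, iters=200, thresh_px=2.0, seed=0):
    """RANSAC loop: fit on random 6-point samples, keep the largest consensus set."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(points_3d), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(points_3d), size=6, replace=False)
        R, t = pnp_dlt(points_3d[idx], points_2d[idx], K)
        with np.errstate(divide="ignore", invalid="ignore"):
            err = np.linalg.norm(project(points_3d, R, t, K) - points_2d, axis=1)
        inliers = err < thresh_px        # reprojection error test, in pixels
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() < 6:
        raise ValueError("RANSAC found no usable consensus set")
    R, t = pnp_dlt(points_3d[best], points_2d[best], K)  # refit on all inliers
    return R, t, best

# Synthetic check: 50 correspondences, the first 10 corrupted into outliers.
rng = np.random.default_rng(1)
K = np.array([[600.0, 0.0, 320.0], [0.0, 600.0, 240.0], [0.0, 0.0, 1.0]])
a = np.deg2rad(30.0)
R_true = np.array([[np.cos(a), 0.0, np.sin(a)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(a), 0.0, np.cos(a)]])
t_true = np.array([0.05, -0.02, 1.0])
pts3d = rng.uniform(-0.1, 0.1, size=(50, 3))
pts2d = project(pts3d, R_true, t_true, K)
pts2d[:10] += rng.uniform(20.0, 60.0, size=(10, 2))

R_est, t_est, inliers = ransac_pnp(pts3d, pts2d, K)
print("inliers kept:", int(inliers.sum()))
print("rotation error:", np.abs(R_est - R_true).max())
print("translation error:", np.abs(t_est - t_true).max())
```

The sketch shows why the robust wrapper matters for a learned pipeline: a few badly predicted correspondences would skew a direct least-squares fit, whereas the consensus step discards them before the final refit.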
