VI-Net: Boosting Category-level 6D Object Pose Estimation via Learning Decoupled Rotations on the Spherical Representations

Jiehong Lin; Zewei Wei; Yabin Zhang; Kui Jia

arXiv:2308.09916 · cs.CV · August 22, 2023 · 1 citation

TL;DR

VI-Net introduces a decoupled rotation estimation approach on spherical representations, significantly improving high-precision 6D object pose estimation from RGB-D data, especially for unknown objects without CAD models.

Contribution

The paper proposes VI-Net, a novel network that decouples rotation into viewpoint and in-plane components using spherical signals and a new spherical convolution, enhancing pose estimation accuracy.
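As a rough illustration of what a "spherical signal" can look like, the sketch below bins a centered point cloud by elevation and azimuth and keeps the largest radial distance in each bin. This is an assumption made for illustration: the summary above does not say how VI-Net builds its signals, and the function name `points_to_spherical_map` and the bin counts are hypothetical.

```python
import numpy as np

def points_to_spherical_map(points, n_elev=64, n_azim=64):
    """Hypothetical helper: map an (N, 3) point cloud to an (n_elev, n_azim) grid.

    Each cell stores the largest distance from the origin among the points
    falling into that elevation/azimuth bin; empty cells stay at 0.
    """
    pts = np.asarray(points, dtype=float)
    r = np.linalg.norm(pts, axis=1)
    keep = r > 1e-8                      # drop points at the origin
    pts, r = pts[keep], r[keep]

    # Elevation measured from the +z (zenith) axis, in [0, pi].
    elev = np.arccos(np.clip(pts[:, 2] / r, -1.0, 1.0))
    # Azimuth around the z axis, wrapped to [0, 2*pi).
    azim = np.mod(np.arctan2(pts[:, 1], pts[:, 0]), 2.0 * np.pi)

    # Convert angles to integer bin indices, clamping the upper edge.
    i = np.minimum((elev / np.pi * n_elev).astype(int), n_elev - 1)
    j = np.minimum((azim / (2.0 * np.pi) * n_azim).astype(int), n_azim - 1)

    sph = np.zeros((n_elev, n_azim))
    np.maximum.at(sph, (i, j), r)        # per-bin maximum, handles repeated bins
    return sph
```

A grid like this can be processed by 2D-style convolutions, which is presumably why a dedicated spherical convolution is needed to handle the wrap-around in azimuth and the distortion near the poles.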

Findings

01. Outperforms existing methods on benchmark datasets.

02. Achieves high-precision pose estimation for unknown objects.

03. Effectively estimates rotations by decoupling and spherical feature learning.

Abstract

High-precision rotation estimation from an RGB-D object observation is a major challenge in 6D object pose estimation, due to the difficulty of learning in the non-linear space of SO(3). In this paper, we propose a novel rotation estimation network, termed VI-Net, to make the task easier by decoupling the rotation into a combination of a viewpoint rotation and an in-plane rotation. More specifically, VI-Net bases the feature learning on the sphere, with two individual branches that estimate the two factorized rotations: a V-Branch learns the viewpoint rotation via binary classification on the spherical signals, while an I-Branch estimates the in-plane rotation by transforming the signals to view from the zenith direction. To process the spherical signals, a Spherical Feature Pyramid Network is constructed based on a novel design of SPAtial…
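The decoupling described in the abstract can be sketched with a ZYZ Euler factorization, R = R_vp · R_ip, where R_vp carries the zenith (+z) axis to the viewpoint direction and R_ip is a rotation about that zenith axis. The angle convention, the multiplication order, and the helper names below are assumptions for illustration; the abstract does not specify them.

```python
import numpy as np

def rot_z(a):
    """Rotation by angle a (radians) about the z axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_y(a):
    """Rotation by angle a (radians) about the y axis."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def decouple_rotation(R):
    """Hypothetical helper: split R into a viewpoint and an in-plane rotation.

    Returns (R_vp, R_ip, (phi, theta, psi)) with R = R_vp @ R_ip, where
    R_vp = rot_z(phi) @ rot_y(theta) and R_ip = rot_z(psi).
    """
    R = np.asarray(R, dtype=float)
    # The viewpoint is where R sends the zenith axis: the third column of R.
    v = R[:, 2]
    theta = np.arccos(np.clip(v[2], -1.0, 1.0))   # elevation from +z
    phi = np.arctan2(v[1], v[0])                  # azimuth around z
    R_vp = rot_z(phi) @ rot_y(theta)
    # What remains fixes the z axis, so it is a pure rotation about z.
    R_ip = R_vp.T @ R
    psi = np.arctan2(R_ip[1, 0], R_ip[0, 0])
    return R_vp, R_ip, (phi, theta, psi)

# Usage: build a rotation from known angles and recover the two factors.
R = rot_z(0.3) @ rot_y(0.7) @ rot_z(-1.1)
R_vp, R_ip, angles = decouple_rotation(R)
```

Under this convention the viewpoint is a single point on the sphere (two angles) and the in-plane part is a single angle, which is consistent with the abstract's description of one branch classifying on the sphere and another estimating the remaining rotation after viewing from the zenith direction.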

Code & Models

Repositories

jiehonglin/vi-net (PyTorch, official)

Taxonomy

Topics: Robotics and Sensor-Based Localization · Augmented Reality Applications · Robot Manipulation and Learning

Methods: Convolution