TransPose: A Transformer-based 6D Object Pose Estimation Network with Depth Refinement

Mahmoud Abdulsalam; Nabil Aouf

arXiv:2307.05561 · cs.CV · July 13, 2023 · 1 citation

TL;DR

TransPose is a Transformer-based 6D object pose estimation network that uses RGB images and a depth refinement module to achieve superior accuracy in robotics and agricultural applications.

Contribution

The paper introduces TransPose, a novel Transformer-based architecture with a lightweight depth estimation and refinement module for improved 6D pose estimation from RGB images.

Findings

1. Outperforms state-of-the-art methods in 6D pose estimation
2. Effective in fruit-picking robotic applications
3. Achieves higher accuracy with RGB-only input

Abstract

As demand for robotic manipulation applications increases, accurate vision-based 6D pose estimation becomes essential for autonomous operations. Convolutional Neural Network (CNN)-based approaches to pose estimation have been introduced previously. However, the quest for better performance persists, especially for accurate robotic manipulation, and it extends to the agri-robotics domain. In this paper, we propose TransPose, an improved Transformer-based 6D pose estimation network with a depth refinement module. The architecture takes only an RGB image as input, with no supplementary modalities such as depth or thermal images. It includes an innovative, lighter depth estimation network that estimates depth from an RGB image using a feature pyramid with an up-sampling method. A transformer-based detection network with additional prediction heads is proposed…
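For readers new to the term, a 6D pose is a 3D rotation plus a 3D translation that maps points from the object's own frame into the camera frame. The sketch below illustrates only that general definition together with a standard pinhole projection; it is not the TransPose network or its depth refinement module. The function names, the pose, and the camera intrinsics (`fx`, `fy`, `cx`, `cy`) are all hypothetical values chosen for illustration.

```python
import math

def rotation_z(theta):
    """3x3 rotation matrix for a rotation of theta radians about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0],
            [s, c, 0.0],
            [0.0, 0.0, 1.0]]

def apply_pose(R, t, point):
    """Map a point from the object frame to the camera frame: R @ p + t."""
    return [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]

def project(point_cam, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame point to pixel coordinates."""
    x, y, z = point_cam
    return (fx * x / z + cx, fy * y / z + cy)

# Hypothetical pose: object rotated 90 degrees about z, 2 m in front of the camera.
R = rotation_z(math.pi / 2)
t = [0.0, 0.0, 2.0]  # the z component is the object's depth along the optical axis

# Hypothetical model point, 10 cm along the object's x-axis.
p_obj = [0.1, 0.0, 0.0]
p_cam = apply_pose(R, t, p_obj)

# Hypothetical intrinsics for a 640x480 camera.
u, v = project(p_cam, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(p_cam, (u, v))
```

Because the projection divides by the camera-frame depth `z`, an error in the estimated depth shifts where every model point lands in the image, which is one general reason an RGB-only method may benefit from a dedicated depth estimate.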

Taxonomy

Topics: Robot Manipulation and Learning · Advanced Vision and Imaging · Industrial Vision Systems and Defect Detection