CRT-6D: Fast 6D Object Pose Estimation with Cascaded Refinement Transformers

Pedro Castro; Tae-Kyun Kim

arXiv:2210.11718 · cs.CV · October 24, 2022 · 1 citation

TL;DR

CRT-6D is a fast, transformer-based 6D object pose estimation method that samples sparse keypoint features and refines the pose iteratively. It reports state-of-the-art accuracy on LM-O and YCB-V with inference about 2x faster than the closest real-time methods.

Contribution

The paper proposes CRT-6D, a novel cascaded transformer approach using sparse surface keypoints for efficient and accurate 6D pose estimation, outperforming existing real-time methods.

Findings

01 Inference is 2x faster than the closest real-time state-of-the-art methods.

02 Supports up to 21 objects with a single model.

03 Achieves state-of-the-art accuracy on the LM-O and YCB-V datasets.

Abstract

Learning-based 6D object pose estimation methods rely on computing large intermediate pose representations and/or iteratively refining an initial estimation with a slow render-compare pipeline. This paper introduces a novel method we call Cascaded Pose Refinement Transformers, or CRT-6D. We replace the commonly used dense intermediate representation with a sparse set of features sampled from the feature pyramid, which we call OSKFs (Object Surface Keypoint Features), where each element corresponds to an object keypoint. We employ lightweight deformable transformers and chain them together to iteratively refine proposed poses over the sampled OSKFs. We achieve inference runtimes 2x faster than the closest real-time state-of-the-art methods while supporting up to 21 objects on a single model. We demonstrate the effectiveness of CRT-6D by performing extensive experiments on the LM-O and YCBV…
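The sampling-and-refinement loop described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a single feature map rather than a full pyramid, a pinhole camera whose intrinsics `K` are already expressed in feature-map pixels, and a pose update composed as `dR @ R` and `t + dt`. The names `project_keypoints`, `sample_bilinear`, and `cascade_refine` are hypothetical, and each `stage` is a placeholder callable standing in for the paper's lightweight deformable transformer.

```python
import numpy as np

def project_keypoints(kpts_obj, R, t, K):
    """Project (N, 3) object-frame keypoints to (N, 2) pixel coordinates."""
    cam = kpts_obj @ R.T + t          # rigid transform into the camera frame
    uvw = cam @ K.T                   # pinhole projection (homogeneous)
    return uvw[:, :2] / uvw[:, 2:3]   # perspective divide

def sample_bilinear(feat, uv):
    """Sample a (C, H, W) feature map at (N, 2) locations, giving (N, C)."""
    C, H, W = feat.shape              # requires H >= 2 and W >= 2
    u = np.clip(uv[:, 0], 0, W - 1)   # clamp out-of-image points to the border
    v = np.clip(uv[:, 1], 0, H - 1)
    u0 = np.minimum(np.floor(u).astype(int), W - 2)
    v0 = np.minimum(np.floor(v).astype(int), H - 2)
    du, dv = u - u0, v - v0           # fractional offsets inside the cell
    top = feat[:, v0, u0] * (1 - du) + feat[:, v0, u0 + 1] * du
    bot = feat[:, v0 + 1, u0] * (1 - du) + feat[:, v0 + 1, u0 + 1] * du
    return (top * (1 - dv) + bot * dv).T

def cascade_refine(feat, kpts_obj, R, t, K, stages):
    """Chain refinement stages, re-sampling keypoint features at each one."""
    for stage in stages:
        uv = project_keypoints(kpts_obj, R, t, K)
        oskf = sample_bilinear(feat, uv)   # one feature vector per keypoint
        dR, dt = stage(oskf, R, t)         # placeholder for a learned refiner
        R, t = dR @ R, t + dt              # assumed update rule (see lead-in)
    return R, t
```

Because every stage reads only N feature vectors instead of a dense map, the per-stage cost scales with the number of keypoints rather than the image resolution. This is the property the abstract contrasts with dense intermediate representations and render-compare refinement.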

Code & Models

Repositories

pedrocastro/crt-6d
Official

Videos

CRT6D : Fast 6D Object Pose Estimation with Cascaded Refinement Transformers · YouTube

Taxonomy

Topics: Robot Manipulation and Learning · Human Pose and Action Recognition · Multimodal Machine Learning Applications