DPODv2: Dense Correspondence-Based 6 DoF Pose Estimation

Ivan Shugurov; Sergey Zakharov; Slobodan Ilic

arXiv:2207.02805·cs.CV·July 7, 2022

TL;DR

DPODv2 introduces a multi-modal, dense correspondence-based 6 DoF pose estimation method with a novel differentiable rendering refinement, achieving state-of-the-art results across various data types.

Contribution

It presents a unified deep learning framework for RGB and depth data, incorporating a novel differentiable rendering-based pose refinement technique.

Findings

01 RGB excels in correspondence estimation

02 Depth improves pose accuracy with good 3D correspondences

03 Combining RGB and depth yields the best overall performance

Abstract

We propose a three-stage 6 DoF object detection method called DPODv2 (Dense Pose Object Detector) that relies on dense correspondences. We combine a 2D object detector with a dense correspondence estimation network and a multi-view pose refinement method to estimate a full 6 DoF pose. Unlike other deep learning methods, which are typically restricted to monocular RGB images, we propose a unified deep learning network that allows different imaging modalities to be used (RGB or depth). Moreover, we propose a novel pose refinement method based on differentiable rendering. The main concept is to compare predicted and rendered correspondences in multiple views to obtain a pose that is consistent with the predicted correspondences in all views. Our proposed method is evaluated rigorously on different data modalities and types of training data in a controlled setup. The main conclusion is…
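The multi-view consistency idea in the abstract can be illustrated with a small sketch. This is not the paper's differentiable renderer: it is a simplified stand-in in which a few sparse model points replace dense rendered correspondence maps, and the loss is only evaluated, not optimized. All names here (`multiview_loss`, `to_pixel`, the pinhole intrinsics, the camera layout) are assumptions made for illustration.

```python
import math

def matvec(M, v):
    """Multiply a 3x3 matrix (nested lists) by a 3-vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def rot_y(angle):
    """Rotation matrix about the y axis (angle in radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def project(point_cam, focal=500.0, cx=320.0, cy=240.0):
    """Pinhole projection; the intrinsics are illustrative values."""
    x, y, z = point_cam
    return (focal * x / z + cx, focal * y / z + cy)

def to_pixel(model_point, obj_pose, cam_pose):
    """Map a model point to a pixel: model -> world -> camera -> image."""
    R_obj, t_obj = obj_pose
    R_cam, t_cam = cam_pose
    p_world = [a + b for a, b in zip(matvec(R_obj, model_point), t_obj)]
    p_cam = [a + b for a, b in zip(matvec(R_cam, p_world), t_cam)]
    return project(p_cam)

def multiview_loss(obj_pose, cam_poses, correspondences):
    """Mean squared pixel distance between predicted correspondences and
    the projections implied by obj_pose, accumulated over all views."""
    total, count = 0.0, 0
    for cam_pose, view_corrs in zip(cam_poses, correspondences):
        for pixel, model_point in view_corrs:
            u, v = to_pixel(model_point, obj_pose, cam_pose)
            total += (u - pixel[0]) ** 2 + (v - pixel[1]) ** 2
            count += 1
    return total / count

# Hypothetical setup: four model points (metres) and two calibrated views.
model_points = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0],
                [0.0, 0.0, 0.1], [-0.1, -0.1, 0.05]]
cam_poses = [(rot_y(0.0), [0.0, 0.0, 1.0]),
             (rot_y(0.5), [0.0, 0.0, 1.0])]
true_pose = (rot_y(0.30), [0.05, -0.02, 0.0])
perturbed_pose = (rot_y(0.35), [0.05, -0.02, 0.0])

# Simulate noise-free "predicted" correspondences from the true pose.
correspondences = [
    [(to_pixel(p, true_pose, cam), p) for p in model_points]
    for cam in cam_poses
]

loss_true = multiview_loss(true_pose, cam_poses, correspondences)
loss_perturbed = multiview_loss(perturbed_pose, cam_poses, correspondences)
print("loss at true pose:", loss_true)
print("loss at perturbed pose:", loss_perturbed)
```

In this toy setup the loss vanishes at the pose that generated the correspondences and grows when the pose is perturbed. A refinement step would minimize such a loss over the pose; the paper does this with differentiable rendering over dense correspondences rather than with sparse points.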
