Diff-DOPE: Differentiable Deep Object Pose Estimation

Jonathan Tremblay; Bowen Wen; Valts Blukis; Balakumar Sundaralingam; Stephen Tyree; Stan Birchfield

arXiv:2310.00463·cs.CV·October 3, 2023·5 cites

Open Access

TL;DR

Diff-DOPE is a novel, training-free 6-DoF pose refinement method that uses differentiable rendering and gradient descent to achieve state-of-the-art accuracy in object pose estimation.

Contribution

The paper introduces a differentiable rendering-based pose refinement approach that avoids training, improving accuracy and robustness over existing neural network-based methods.

Findings

01. Achieves state-of-the-art results on pose estimation datasets.

02. Effective with multiple modalities like RGB, depth, and masks.

03. Avoids training by using differentiable rendering and gradient descent.

Abstract

We introduce Diff-DOPE, a 6-DoF pose refiner that takes as input an image, a 3D textured model of an object, and an initial pose of the object. The method uses differentiable rendering to update the object pose to minimize the visual error between the image and the projection of the model. We show that this simple, yet effective, idea is able to achieve state-of-the-art results on pose estimation datasets. Our approach is a departure from recent methods in which the pose refiner is a deep neural network trained on a large synthetic dataset to map inputs to refinement steps. Rather, our use of differentiable rendering allows us to avoid training altogether. Our approach performs multiple gradient descent optimizations in parallel with different random learning rates to avoid local minima from symmetric objects, similar appearances, or wrong step size. Various modalities can be used,…
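
The optimization strategy in the abstract (several gradient descent runs with different random learning rates, keeping the best result) can be illustrated with a toy sketch. This is not the authors' implementation and involves no renderer: it assumes a 2D rigid pose (theta, tx, ty), known point correspondences in place of a rendered image, an analytic gradient in place of differentiable rendering, and a sequential loop in place of parallel execution. All function names, the log-uniform sampling range, and the step count are hypothetical choices made for illustration.

```python
import math
import random


def transform(points, pose):
    """Apply a 2D rigid transform pose = (theta, tx, ty) to (x, y) points."""
    theta, tx, ty = pose
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def loss_and_grad(model, observed, pose):
    """Mean squared alignment error and its gradient w.r.t. (theta, tx, ty)."""
    theta, tx, ty = pose
    c, s = math.cos(theta), math.sin(theta)
    n = len(model)
    loss = g_theta = g_tx = g_ty = 0.0
    for (x, y), (u, v) in zip(model, observed):
        rx = c * x - s * y + tx - u  # residual in x
        ry = s * x + c * y + ty - v  # residual in y
        loss += rx * rx + ry * ry
        # d(R p)/d(theta) = (-s*x - c*y, c*x - s*y)
        g_theta += 2.0 * (rx * (-s * x - c * y) + ry * (c * x - s * y))
        g_tx += 2.0 * rx
        g_ty += 2.0 * ry
    return loss / n, (g_theta / n, g_tx / n, g_ty / n)


def sample_learning_rates(count, low, high, rng):
    """Draw learning rates log-uniformly from [low, high] (assumed scheme)."""
    return [math.exp(rng.uniform(math.log(low), math.log(high)))
            for _ in range(count)]


def refine_pose(model, observed, init_pose, learning_rates, steps=300):
    """Run one gradient descent per learning rate; keep the lowest-loss run."""
    best = None
    for lr in learning_rates:
        pose = tuple(init_pose)
        for _ in range(steps):
            _, grad = loss_and_grad(model, observed, pose)
            pose = tuple(p - lr * g for p, g in zip(pose, grad))
        final_loss, _ = loss_and_grad(model, observed, pose)
        if best is None or final_loss < best["loss"]:
            best = {"pose": pose, "loss": final_loss, "lr": lr}
    return best


if __name__ == "__main__":
    # Toy asymmetric shape and a made-up ground-truth pose.
    model = [(0.0, 0.0), (1.0, 0.0), (1.0, 0.5), (0.0, 1.0)]
    true_pose = (0.4, 0.5, -0.2)
    observed = transform(model, true_pose)
    rates = sample_learning_rates(8, 0.005, 0.3, random.Random(0))
    print(refine_pose(model, observed, (0.7, 0.6, -0.3), rates))
```

In this toy setting, runs with very small learning rates make little progress within the fixed step budget, while larger ones converge. Selecting the lowest final loss across runs removes the need to tune a single step size by hand, which is the role the abstract assigns to the randomized learning rates.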

Taxonomy

Topics: Advanced Neural Network Applications · Human Pose and Action Recognition · Advanced Vision and Imaging