COPE: End-to-end trainable Constant Runtime Object Pose Estimation

Stefan Thalhammer; Timothy Patten; Markus Vincze

arXiv:2208.08807 · cs.CV · August 23, 2022

TL;DR

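This paper introduces COPE, an end-to-end trainable method for real-time multi-object 6D pose estimation. It is faster and scales better with the number of object instances than multi-stage approaches (over 24 fps on images with more than 90 objects), while reporting higher accuracy than state-of-the-art methods.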

Contribution

COPE is the first end-to-end trainable framework that directly regresses multiple object poses simultaneously, eliminating the need for separate detection and correspondence stages.

Findings

01. Achieves >24 fps on images with over 90 objects.

02. Outperforms state-of-the-art methods in accuracy.

03. Runs approximately 35 times faster than traditional approaches.

Abstract

State-of-the-art object pose estimation handles multiple instances in a test image by using multi-model formulations: detection as a first stage and then separately trained networks per object for 2D-3D geometric correspondence prediction as a second stage. Poses are subsequently estimated using the Perspective-n-Points algorithm at runtime. Unfortunately, multi-model formulations are slow and do not scale well with the number of object instances involved. Recent approaches show that direct 6D object pose estimation is feasible when derived from the aforementioned geometric correspondences. We present an approach that learns an intermediate geometric representation of multiple objects to directly regress 6D poses of all instances in a test image. The inherent end-to-end trainability overcomes the requirement of separately processing individual object instances. By calculating the mutual…
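The 2D-3D geometric correspondences mentioned in the abstract relate points on an object model to their pixel locations in the image; the Perspective-n-Points algorithm inverts this relation to recover a pose. The following is a minimal sketch of the forward direction only (a pinhole projection), not COPE's implementation. The function name `project_points`, the intrinsics `K`, the pose `R`, `t`, and the model points are all hypothetical values chosen for illustration.

```python
import numpy as np


def project_points(points_3d, rotation, translation, intrinsics):
    """Project Nx3 object-frame points to Nx2 pixel coordinates (pinhole model)."""
    # Object frame -> camera frame: apply the 6D pose (rotation + translation).
    cam = points_3d @ rotation.T + translation
    # Camera frame -> homogeneous pixel coordinates via the intrinsic matrix.
    uvw = cam @ intrinsics.T
    # Perspective divide by depth.
    return uvw[:, :2] / uvw[:, 2:3]


# Hypothetical camera intrinsics (focal length 600 px, principal point 320x240).
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])

# Hypothetical pose: no rotation, object origin 1 m in front of the camera.
R = np.eye(3)
t = np.array([0.0, 0.0, 1.0])

# Hypothetical model points (metres) in the object frame.
model_points = np.array([[0.0, 0.0, 0.0],
                         [0.1, 0.0, 0.0],
                         [0.0, 0.1, 0.0]])

pixels = project_points(model_points, R, t, K)
print(pixels)
```

Each row of `model_points` paired with the matching row of `pixels` is one 2D-3D correspondence. A multi-stage pipeline as described in the abstract would predict such pairs per object and then solve for the pose with PnP, whereas a direct approach regresses the pose itself.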

Videos

COPE: End-to-end Trainable Constant Runtime Object Pose Estimation (YouTube)

Taxonomy

Topics: Robot Manipulation and Learning · Human Pose and Action Recognition · Advanced Neural Network Applications

Methods: Test