EfficientPose 6D: Scalable and Efficient 6D Object Pose Estimation

Zixuan Fang; Thomas Pöllabauer; Tristan Wirth; Sarah Berkei; Volker Knauthe; Arjan Kuijper

arXiv:2502.14061 · cs.CV · February 21, 2025

TL;DR

This paper introduces EfficientPose 6D, a scalable and fast 6D object pose estimation method that balances accuracy and efficiency for real-time industrial applications, using the novel AMIS algorithm for adaptive model selection.

Contribution

It presents the AMIS algorithm for adaptive model selection, enabling scalable and efficient pose estimation tailored to specific accuracy-speed trade-offs.
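The summary above does not describe how AMIS works internally, so the following is not the authors' algorithm. It is only a minimal sketch of the general idea of choosing one model from a family under an application-specific speed constraint. The `Candidate` type, the `select_model` function, the variant names, and all numbers are hypothetical placeholders, not values from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str          # hypothetical label for a model variant
    latency_ms: float  # inference time per image, in milliseconds
    accuracy: float    # benchmark score in [0, 1]


def select_model(
    candidates: Sequence[Candidate], max_latency_ms: float
) -> Optional[Candidate]:
    """Return the most accurate candidate within the latency budget, or None."""
    feasible = [c for c in candidates if c.latency_ms <= max_latency_ms]
    if not feasible:
        return None
    # Prefer higher accuracy; break ties in favour of lower latency.
    return max(feasible, key=lambda c: (c.accuracy, -c.latency_ms))


# Placeholder numbers for illustration only; these are NOT results from the paper.
CANDIDATES = [
    Candidate("small", 10.0, 0.70),
    Candidate("medium", 25.0, 0.80),
    Candidate("large", 60.0, 0.85),
]

if __name__ == "__main__":
    choice = select_model(CANDIDATES, max_latency_ms=30.0)
    print(choice.name if choice else "no model fits the budget")
```

A hard latency budget is only one way to express the trade-off; a weighted score over time and accuracy would be an equally plausible assumption.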

Findings

01. Achieves high accuracy on benchmark datasets

02. Demonstrates real-time inference capabilities

03. Outperforms existing methods in efficiency-accuracy balance

Abstract

In industrial applications requiring real-time feedback, such as quality control and robotic manipulation, the demand for high-speed and accurate pose estimation remains critical. Despite advances improving speed and accuracy in pose estimation, finding a balance between computational efficiency and accuracy poses significant challenges in dynamic environments. Most current algorithms lack scalability in estimation time, especially for diverse datasets, and the state-of-the-art (SOTA) methods are often too slow. This study focuses on developing a fast and scalable set of pose estimators based on GDRNPP to meet or exceed current benchmarks in accuracy and robustness, particularly addressing the efficiency-accuracy trade-off essential in real-time scenarios. We propose the AMIS algorithm to tailor the utilized model according to an application-specific trade-off between inference time and…

Taxonomy

Topics: Robot Manipulation and Learning · Advanced Vision and Imaging · Hand Gesture Recognition Systems

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Sparse Evolutionary Training