EfficientPose 6D: Scalable and Efficient 6D Object Pose Estimation
Zixuan Fang, Thomas Pöllabauer, Tristan Wirth, Sarah Berkei, Volker Knauthe, Arjan Kuijper

TL;DR
This paper introduces EfficientPose 6D, a scalable and fast 6D object pose estimation method that balances accuracy and efficiency for real-time industrial applications, using the novel AMIS algorithm for adaptive model selection.
Contribution
It presents the AMIS algorithm for adaptive model selection, enabling scalable and efficient pose estimation tailored to specific accuracy-speed trade-offs.
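The summary does not describe how AMIS works internally, so the following is only a generic sketch of what selecting a model under an accuracy-speed trade-off can look like. It is not the AMIS algorithm. The `Candidate` type, the `select_model` function, the `weight` parameter, and all numbers in `CANDIDATES` are hypothetical placeholders chosen for illustration, not values from the paper.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str
    accuracy: float    # higher is better (e.g. a recall-style score in [0, 1])
    latency_ms: float  # lower is better


def _normalize(value: float, lo: float, hi: float) -> float:
    """Map value into [0, 1]; return 0.0 when all candidates share one value."""
    return 0.0 if hi == lo else (value - lo) / (hi - lo)


def select_model(
    candidates: Sequence[Candidate],
    weight: float,
    max_latency_ms: Optional[float] = None,
) -> Candidate:
    """Pick the candidate with the best weighted accuracy/speed score.

    weight=1.0 cares only about accuracy, weight=0.0 only about speed.
    An optional hard latency budget filters candidates first.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    feasible = [
        c for c in candidates
        if max_latency_ms is None or c.latency_ms <= max_latency_ms
    ]
    if not feasible:
        raise ValueError("no candidate satisfies the latency budget")

    acc_lo = min(c.accuracy for c in feasible)
    acc_hi = max(c.accuracy for c in feasible)
    lat_lo = min(c.latency_ms for c in feasible)
    lat_hi = max(c.latency_ms for c in feasible)

    def score(c: Candidate) -> float:
        acc = _normalize(c.accuracy, acc_lo, acc_hi)
        speed = 1.0 - _normalize(c.latency_ms, lat_lo, lat_hi)
        return weight * acc + (1.0 - weight) * speed

    return max(feasible, key=score)


# Hypothetical candidates; the numbers are made up for illustration only.
CANDIDATES = [
    Candidate("small", accuracy=0.60, latency_ms=10.0),
    Candidate("medium", accuracy=0.70, latency_ms=30.0),
    Candidate("large", accuracy=0.75, latency_ms=100.0),
]

if __name__ == "__main__":
    print(select_model(CANDIDATES, weight=1.0).name)
    print(select_model(CANDIDATES, weight=0.0).name)
    print(select_model(CANDIDATES, weight=1.0, max_latency_ms=50.0).name)
```

A single scalar weight plus an optional hard budget is one simple way to encode an application-specific preference; the paper's actual selection criterion may differ.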
Findings
Achieves high accuracy on benchmark datasets
Demonstrates real-time inference capabilities
Outperforms existing methods in efficiency-accuracy balance
Abstract
In industrial applications requiring real-time feedback, such as quality control and robotic manipulation, the demand for high-speed and accurate pose estimation remains critical. Despite advances improving speed and accuracy in pose estimation, finding a balance between computational efficiency and accuracy poses significant challenges in dynamic environments. Most current algorithms lack scalability in estimation time, especially for diverse datasets, and the state-of-the-art (SOTA) methods are often too slow. This study focuses on developing a fast and scalable set of pose estimators based on GDRNPP to meet or exceed current benchmarks in accuracy and robustness, particularly addressing the efficiency-accuracy trade-off essential in real-time scenarios. We propose the AMIS algorithm to tailor the utilized model according to an application-specific trade-off between inference time and…
Taxonomy
Topics: Robot Manipulation and Learning · Advanced Vision and Imaging · Hand Gesture Recognition Systems
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Sparse Evolutionary Training
