MPF6D: Masked Pyramid Fusion 6D Pose Estimation
Nuno Pereira, Luís A. Alexandre

TL;DR
MPF6D introduces a real-time 6D object pose estimation method using RGB-D data, leveraging a pyramid neural network architecture for improved accuracy over existing approaches.
Contribution
The paper proposes a novel neural network architecture with multiple heads and pyramid feature fusion for enhanced 6D pose estimation from RGB-D data.
Findings
Achieves higher accuracy than state-of-the-art methods
Operates in real-time with low inference time
Validated on two common datasets
Abstract
Object pose estimation has multiple important applications, such as robotic grasping and augmented reality. We present a new method to estimate the 6D pose of objects that improves upon the accuracy of current proposals and can still be used in real-time. Our method uses RGB-D data as input to segment objects and estimate their pose. It uses a neural network with multiple heads to identify the objects in the scene, generate the appropriate masks and estimate the values of the translation vectors and the quaternion that represents the objects' rotation. These heads leverage a pyramid architecture used during feature extraction and feature fusion. We conduct an empirical evaluation using the two most common datasets in the area, and compare against state-of-the-art approaches, illustrating the capabilities of MPF6D. Our method can be used in real-time with its low inference time and high…
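The abstract describes the network output as a translation vector plus a quaternion for each object. As a minimal sketch of how such an output could be assembled into a single 6D pose, the snippet below converts a quaternion to a rotation matrix and builds a 4x4 homogeneous transform. The function names, the (w, x, y, z) component order, and the example values are assumptions made for illustration; the summary does not state the convention MPF6D uses.

```python
import math


def quaternion_to_rotation_matrix(q):
    """Convert a quaternion to a 3x3 rotation matrix (nested lists).

    Assumes the (w, x, y, z) component order; this is an illustrative
    convention, not one confirmed by the paper summary.
    """
    w, x, y, z = q
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("quaternion must have non-zero norm")
    # Normalize so that a raw network output still yields a valid rotation.
    w, x, y, z = w / norm, x / norm, y / norm, z / norm
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def pose_matrix(q, t):
    """Build a 4x4 homogeneous transform from a quaternion and a translation."""
    rot = quaternion_to_rotation_matrix(q)
    return [
        rot[0] + [t[0]],
        rot[1] + [t[1]],
        rot[2] + [t[2]],
        [0.0, 0.0, 0.0, 1.0],
    ]


def transform_point(pose, p):
    """Apply the pose to a 3D point given in the object's model frame."""
    x, y, z = p
    return tuple(
        pose[i][0] * x + pose[i][1] * y + pose[i][2] * z + pose[i][3]
        for i in range(3)
    )


# Hypothetical example: a 90-degree rotation about the z axis plus a shift.
example_q = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
example_t = (1.0, 2.0, 3.0)
example_pose = pose_matrix(example_q, example_t)
moved = transform_point(example_pose, (1.0, 0.0, 0.0))
```

Normalizing the quaternion before conversion is a common design choice here, since a regression head is not guaranteed to emit a unit-norm vector.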
Taxonomy
Topics: Robot Manipulation and Learning · Robotics and Sensor-Based Localization · Advanced Vision and Imaging
