6D Pose Estimation via Keypoint Heatmap Regression with RGB-D Residual Neural Networks

Ismail Aljosevic; Amir Masoud Almasi; Ana Parovic; Ashkan Shafiei

arXiv:2605.08059 · cs.CV · May 11, 2026

TL;DR

This paper introduces a modular 6D pose estimation framework that combines keypoint heatmap regression with RGB-D data. On LINEMOD it reaches a mean ADD-based accuracy of 84.50% with RGB alone and 92.41% with RGB-D fusion, and it compares keypoint selection strategies and fusion methods.

Contribution

It presents a novel RGB-D fusion architecture with cross-stage interaction and evaluates keypoint selection strategies for improved pose accuracy.

Findings

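01. The best RGB-only model achieved a mean ADD-based accuracy of 84.50% on LINEMOD.

02. The RGB-D fusion model reached a mean ADD-based accuracy of 92.41% on LINEMOD.

03. Incorporating depth data improved pose estimation accuracy by 7.91 percentage points over the best RGB-only model.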
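The accuracy figures in these findings are ADD-based, according to the abstract. As a rough illustration of what such a metric measures, the sketch below computes the average distance between model points transformed by the ground-truth pose and by the predicted pose. The function names and the 10%-of-diameter threshold are assumptions: the threshold is a common convention for LINEMOD evaluation, and this page does not state which value the authors used.

```python
import numpy as np

def add_distance(model_points, R_gt, t_gt, R_pred, t_pred):
    """Mean distance between model points under two rigid poses.

    model_points: (N, 3) array of 3D points on the object model.
    R_*: (3, 3) rotation matrices, t_*: (3,) translation vectors.
    """
    pts_gt = model_points @ R_gt.T + t_gt        # points under ground-truth pose
    pts_pred = model_points @ R_pred.T + t_pred  # points under predicted pose
    return float(np.linalg.norm(pts_gt - pts_pred, axis=1).mean())

def is_pose_correct(model_points, R_gt, t_gt, R_pred, t_pred,
                    diameter, threshold=0.1):
    """Assumed convention: correct if ADD < threshold * object diameter."""
    add = add_distance(model_points, R_gt, t_gt, R_pred, t_pred)
    return add < threshold * diameter
```

An accuracy such as the ones reported would then be the fraction of test poses for which `is_pose_correct` returns `True`, averaged over objects.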

Abstract

In this paper, we propose a modular framework for 6D pose estimation based on keypoint heatmap regression. Our approach combines YOLOv10m for object detection with a ResNet18-based network that predicts 2D heatmaps from RGB images. Keypoints extracted from these heatmaps are used to estimate the 6D object pose via the PnP RANSAC algorithm. We compare different keypoint selection strategies to assess their impact on pose accuracy. Additionally, we extend the baseline by incorporating depth data using a cross-fusion architecture, which enables interaction between RGB and depth features at multiple stages. We further explore general training improvements, such as experimenting with activation functions and learning rate scheduling strategies to improve model performance. Our best RGB-only model achieved a mean ADD-based accuracy of 84.50%, while the RGB-D fusion model reached 92.41% on the…
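The step from heatmaps to keypoints described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes one heatmap per keypoint, takes the peak location with a plain argmax, and maps it from the detection crop back to image pixels. The function name, the `bbox` layout, and the pixel-center offset are all hypothetical choices.

```python
import numpy as np

def heatmaps_to_keypoints(heatmaps, bbox):
    """Extract one 2D keypoint per heatmap by taking its peak.

    heatmaps: (K, H, W) array, one channel per keypoint (assumed layout).
    bbox: (x_min, y_min, x_max, y_max) of the detected object crop,
          in full-image pixel coordinates (assumed layout).
    Returns (K, 2) keypoints as (x, y) in image pixels and (K,) peak scores.
    """
    k, h, w = heatmaps.shape
    flat = heatmaps.reshape(k, -1)
    peak_idx = flat.argmax(axis=1)                 # index of the peak per channel
    rows, cols = np.unravel_index(peak_idx, (h, w))
    x_min, y_min, x_max, y_max = bbox
    # Map heatmap cell centers back to image coordinates.
    xs = x_min + (cols + 0.5) * (x_max - x_min) / w
    ys = y_min + (rows + 0.5) * (y_max - y_min) / h
    scores = flat.max(axis=1)                      # peak value as a confidence
    return np.stack([xs, ys], axis=1), scores
```

The resulting 2D keypoints, paired with their known 3D positions on the object model, would then be passed to a PnP RANSAC solver (for example OpenCV's `cv2.solvePnPRansac`) to recover the rotation and translation.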

Code & Models

Repositories

ameermasood/HeatNet (GitHub)
