APR-Transformer: Initial Pose Estimation for Localization in Complex Environments through Absolute Pose Regression

Srinivas Ravuri (1), Yuan Xu (1), Martin Ludwig Zehetner (2), Ketan Motlag (1), and Sahin Albayrak (1) ((1) Technische Universität Berlin, Berlin, Germany; (2) Forschungszentrum Informatik, Berlin, Germany)

arXiv:2505.09356 · cs.RO · May 15, 2025

TL;DR

APR-Transformer is a novel deep learning model that accurately estimates absolute 3D pose from images or LiDAR, improving localization in GNSS-denied environments and achieving state-of-the-art results on multiple datasets.

Contribution

We introduce APR-Transformer, a new architecture for absolute pose regression that enhances accuracy and robustness in complex environments using deep neural networks.

Findings

01. Achieves state-of-the-art performance on the Radar Oxford Robot-Car and DeepLoc datasets.

02. Validates effectiveness in GNSS-denied environments with real-time deployment.

03. Demonstrates practical feasibility on autonomous vehicles.

Abstract

Precise initialization plays a critical role in the performance of localization algorithms, especially in the context of robotics, autonomous driving, and computer vision. Poor localization accuracy is often a consequence of inaccurate initial poses, a problem that is particularly noticeable in GNSS-denied environments, because initialization usually relies on GPS signals. Recent advances in leveraging deep neural networks for pose regression have led to significant improvements in both accuracy and robustness, especially in estimating complex spatial relationships and orientations. In this paper, we introduce APR-Transformer, a model architecture inspired by state-of-the-art methods, which predicts absolute pose (3D position and 3D orientation) using either image or LiDAR data. We demonstrate that our proposed method achieves state-of-the-art performance on established benchmark datasets such…
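
To make "absolute pose (3D position and 3D orientation)" concrete, the following minimal sketch shows how a predicted pose is commonly compared against ground truth in pose-regression work. It is not taken from the paper: representing orientation as a quaternion in (w, x, y, z) order, and the function names `position_error` and `orientation_error_deg`, are assumptions made for illustration only.

```python
import math


def normalize(q):
    """Scale a quaternion (w, x, y, z) to unit length."""
    n = math.sqrt(sum(c * c for c in q))
    return tuple(c / n for c in q)


def position_error(t_pred, t_true):
    """Euclidean distance between predicted and true 3D positions."""
    return math.dist(t_pred, t_true)


def orientation_error_deg(q_pred, q_true):
    """Angle in degrees of the rotation separating two orientations.

    Assumes quaternions in (w, x, y, z) order (an illustrative convention,
    not one confirmed by the source). The absolute value of the dot product
    treats q and -q as the same rotation.
    """
    q_pred, q_true = normalize(q_pred), normalize(q_true)
    dot = abs(sum(a * b for a, b in zip(q_pred, q_true)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return math.degrees(2.0 * math.acos(dot))


if __name__ == "__main__":
    # Hypothetical prediction: 5 units off in position, 90 degrees off in yaw.
    identity = (1.0, 0.0, 0.0, 0.0)
    yaw_90 = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
    print(position_error((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))
    print(orientation_error_deg(identity, yaw_90))
```

Reporting position and orientation errors separately is a typical choice for this kind of benchmark, since the two quantities have different units; whether the paper uses exactly these metrics is not stated on this page.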

Code & Models

Repositories

gt-arc/apr-transformer
PyTorch · Official

Taxonomy

Topics: Robot Manipulation and Learning · Hand Gesture Recognition Systems · Image and Object Detection Techniques

Methods: Greedy Policy Search