Learning-Based Hierarchical Scene Graph Matching for Robot Localization Leveraging Prior Maps

Nimrod Millenium Ndulue; Jose Andres Millan-Romera; Matteo Giorgi; Holger Voos; Jose Luis Sanchez-Lopez

arXiv:2604.27821 · cs.RO · May 1, 2026

TL;DR

This paper introduces a learned, hierarchical scene graph matching method that improves robot localization accuracy and speed by leveraging semantic structures from prior maps and sensor data.

Contribution

The paper presents an end-to-end differentiable pipeline that exploits hierarchical semantic relationships for scalable, zero-shot scene graph matching in indoor robot localization.

Findings

01. Outperforms the combinatorial baseline in F1 score on real LiDAR data.

02. Runs an order of magnitude faster than previous methods.

03. Demonstrates zero-shot generalization to BIM-assisted localization.
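
The first finding is stated in terms of F1 score. As a minimal sketch of how such a score can be computed for node-to-node correspondences, the snippet below treats a matching as a set of (observed node, prior node) pairs. The function name `matching_f1`, the node identifiers, and the example data are hypothetical and chosen only for illustration; the paper's actual evaluation protocol is not given in this listing.

```python
def matching_f1(predicted, ground_truth):
    """Return (precision, recall, F1) over sets of (observed, prior) node pairs.

    Illustrative helper only: this is an assumed evaluation, not the
    paper's documented protocol.
    """
    predicted, ground_truth = set(predicted), set(ground_truth)
    true_pos = len(predicted & ground_truth)  # pairs that are exactly right
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(ground_truth) if ground_truth else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical correspondences between an observed graph and a prior map.
predicted = [("room_a", "R1"), ("room_b", "R3"), ("wall_1", "W1")]
ground_truth = [
    ("room_a", "R1"),
    ("room_b", "R2"),
    ("wall_1", "W1"),
    ("wall_2", "W2"),
]

precision, recall, f1 = matching_f1(predicted, ground_truth)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this toy case two of the three predicted pairs are correct and two of the four true pairs are recovered, so precision is 2/3, recall is 1/2, and F1 is 4/7.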

Abstract

Accurate localization is a fundamental requirement for autonomous robots operating in indoor environments. Scene graphs encode the spatial structure of an environment as a hierarchy of semantic entities and their relationships, and can be constructed both online from robot sensor data and offline from architectural priors such as Building Information Models (BIM). Matching these two complementary representations enables drift correction in SLAM by grounding robot observations against a known structural prior. However, establishing reliable node-to-node correspondences between them remains an open challenge: existing combinatorial methods are prohibitively expensive at scale, and prior learned approaches address only flat graph matching, ignoring the multi-level semantic structure present in both representations. Here we present a learned, end-to-end differentiable pipeline that augments…
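
To make the idea of multi-level structure concrete, here is a small sketch of a two-level scene graph and a check that a candidate set of correspondences respects the hierarchy. Everything in it is an assumption for illustration: the `Node` class, the level names "room" and "wall", and the consistency rule are not taken from the paper, whose method description is truncated above.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Node:
    node_id: str
    level: str             # illustrative labels such as "room" or "wall"
    parent: Optional[str]  # id of the node one level up, if any


def build_graph(nodes: Iterable[Node]) -> Dict[str, Node]:
    """Index nodes by id so parents can be looked up directly."""
    return {n.node_id: n for n in nodes}


def is_hierarchy_consistent(matches, observed, prior):
    """Check a mapping {observed id -> prior id} against both hierarchies.

    Assumed rule: matched nodes must share a level, and if a matched
    node's parent is also matched, it must map to the prior node's parent.
    """
    for obs_id, prior_id in matches.items():
        obs, pri = observed[obs_id], prior[prior_id]
        if obs.level != pri.level:
            return False
        if obs.parent in matches and matches[obs.parent] != pri.parent:
            return False
    return True


# Hypothetical online graph (from sensors) and prior graph (e.g. from BIM).
observed = build_graph([
    Node("o_room1", "room", None),
    Node("o_w1", "wall", "o_room1"),
])
prior = build_graph([
    Node("p_roomA", "room", None),
    Node("p_roomB", "room", None),
    Node("p_w1", "wall", "p_roomA"),
    Node("p_w9", "wall", "p_roomB"),
])

good = {"o_room1": "p_roomA", "o_w1": "p_w1"}
bad = {"o_room1": "p_roomA", "o_w1": "p_w9"}  # wall lands in the wrong room

print(is_hierarchy_consistent(good, observed, prior))
print(is_hierarchy_consistent(bad, observed, prior))
```

A flat matcher would treat the two wall candidates as equally admissible; a check of this kind is one simple way the room-level match can rule out the second.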
