Modernising Reinforcement Learning-Based Navigation for Embodied Semantic Scene Graph Generation

Roman Kueble; Marco Hueller; Mrunmai Phatak; Rainer Lienhart; Joerg Haehner

arXiv:2603.25415 · cs.AI · March 27, 2026

Open Access

TL;DR

This paper improves embodied semantic scene graph generation by modernising navigation decision-making: it replaces the policy-optimisation method, explores discrete motion sets, and compares policy architectures. The new optimiser raises scene graph completeness by 21%, and depth-based collision supervision improves safety.

Contribution

It introduces a modular navigation component with updated decision-making, including a factorized multi-head policy and new exploration strategies, enhancing scene graph quality and safety.

Findings

01. Replacing the policy-optimisation algorithm improves SSG completeness by 21%.

02. Depth-based collision supervision enhances safety without affecting completeness.

03. Finer-grained, factorized actions yield the best completeness-efficiency balance.
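
The idea of a factorized multi-head policy can be illustrated with a minimal sketch. This is not the authors' implementation: the two heads (a turn-angle head and a step-length head), their sizes, and the logit values are hypothetical, chosen only to show how one categorical distribution per action factor replaces a single distribution over all joint actions. Under that assumption, the joint log-probability is the sum of the per-head log-probabilities, and the number of policy outputs grows with the sum of the head sizes instead of their product.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_factorized(head_logits, rng):
    """Sample one sub-action per head.

    Returns the action tuple and its joint log-probability, which is
    the sum of the per-head log-probabilities because the heads are
    treated as independent given the observation.
    """
    action = []
    log_prob = 0.0
    for logits in head_logits:
        probs = softmax(logits)
        idx = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        action.append(idx)
        log_prob += math.log(probs[idx])
    return tuple(action), log_prob


# Hypothetical heads (assumption, not from the paper):
# head 0 picks one of 3 turn angles, head 1 one of 4 step lengths.
head_logits = [[0.2, 1.0, -0.5], [0.0, 0.3, 0.1, -1.0]]

rng = random.Random(0)
action, log_prob = sample_factorized(head_logits, rng)

# A flat policy would need one output per joint action (3 * 4),
# the factorized policy needs one output per sub-action (3 + 4).
n_joint = math.prod(len(h) for h in head_logits)
n_outputs = sum(len(h) for h in head_logits)
```

In this toy setting the factorized policy covers 12 joint actions with 7 logits; the gap widens as the motion set becomes finer-grained.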

Abstract

Semantic world models enable embodied agents to reason about objects, relations, and spatial context beyond purely geometric representations. In Organic Computing, such models are a key enabler for objective-driven self-adaptation under uncertainty and resource constraints. The core challenge is to acquire observations maximising model quality and downstream usefulness within a limited action budget. Semantic scene graphs (SSGs) provide a structured and compact representation for this purpose. However, constructing them within a finite action horizon requires exploration strategies that trade off information gain against navigation cost and decide when additional actions yield diminishing returns. This work presents a modular navigation component for Embodied Semantic Scene Graph Generation and modernises its decision-making by replacing the policy-optimisation method and revisiting…

Taxonomy

Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Human Motion and Animation