Modernising Reinforcement Learning-Based Navigation for Embodied Semantic Scene Graph Generation
Roman Kueble, Marco Hueller, Mrunmai Phatak, Rainer Lienhart, Joerg Haehner

TL;DR
This paper improves embodied semantic scene graph generation by modernising navigation decision-making, exploring discrete motion sets, and comparing policy architectures, with reported gains in scene graph completeness (21%) and in safety.
Contribution
The paper introduces a modular navigation component with updated decision-making, including a factorized multi-head policy and new exploration strategies, which improves scene graph quality and safety.
Findings
Replacing the policy-optimisation method improves SSG completeness by 21%.
Depth-based collision supervision enhances safety without affecting completeness.
Finer-grained, factorized actions yield the best completeness-efficiency balance.
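As a rough illustration of what a factorized action space means in general (not the authors' implementation), the sketch below samples one sub-action per head and sums the per-head log-probabilities to obtain the joint log-probability. The two heads, their sizes (5 and 3 bins), and the names `sample_factorized` and `joint_log_prob` are assumptions made for this example only; the summary does not specify the paper's actual heads or motion set.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def joint_log_prob(head_logits, action):
    """Log-probability of a joint action under independent heads.

    With independent heads, the joint probability is the product of the
    per-head probabilities, so the log-probabilities add.
    """
    return sum(
        math.log(softmax(logits)[idx])
        for logits, idx in zip(head_logits, action)
    )


def sample_factorized(head_logits, rng):
    """Sample one sub-action index per head; return (action, joint log-prob)."""
    action = []
    for logits in head_logits:
        probs = softmax(logits)
        idx = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        action.append(idx)
    action = tuple(action)
    return action, joint_log_prob(head_logits, action)


if __name__ == "__main__":
    # Hypothetical heads: 5 turn bins and 3 step-length bins.
    heads = [[0.1, 0.5, -0.3, 0.0, 1.2], [0.7, -0.2, 0.4]]
    action, log_p = sample_factorized(heads, random.Random(0))
    print("sampled sub-actions:", action, "joint log-prob:", log_p)
    # The policy emits 5 + 3 = 8 logits instead of 5 * 3 = 15 for a flat head.
    print("factorized outputs:", sum(len(h) for h in heads))
```

The design point is that the number of policy outputs grows with the sum of the head sizes rather than their product, which is one common reason to factorize finer-grained action sets.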
Abstract
Semantic world models enable embodied agents to reason about objects, relations, and spatial context beyond purely geometric representations. In Organic Computing, such models are a key enabler for objective-driven self-adaptation under uncertainty and resource constraints. The core challenge is to acquire observations maximising model quality and downstream usefulness within a limited action budget. Semantic scene graphs (SSGs) provide a structured and compact representation for this purpose. However, constructing them within a finite action horizon requires exploration strategies that trade off information gain against navigation cost and decide when additional actions yield diminishing returns. This work presents a modular navigation component for Embodied Semantic Scene Graph Generation and modernises its decision-making by replacing the policy-optimisation method and revisiting…
Taxonomy
Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Human Motion and Animation
