Relational Semantic Reasoning on 3D Scene Graphs for Open World Interactive Object Search
Imen Mahdi, Matteo Cassinelli, Fabien Despinoy, Tim Welschehold, Abhinav Valada

TL;DR
This paper introduces SCOUT, a scene graph-based method for open-world object search that combines relational heuristics with learned utility scores, achieving efficient and effective semantic reasoning in real-world environments.
Contribution
The paper presents SCOUT, a novel scene graph-based exploration method with a procedural distillation framework, and introduces SymSearch, a symbolic benchmark for semantic reasoning evaluation.
Findings
SCOUT outperforms embedding similarity methods in accuracy.
SCOUT matches LLM performance with higher efficiency.
SCOUT transfers successfully to real-world environments.
Abstract
Open-world interactive object search in household environments requires understanding semantic relationships between objects and their surrounding context to guide exploration efficiently. Prior methods either rely on vision-language embedding similarity, which does not reliably capture task-relevant relational semantics, or on large language models (LLMs), which are too slow and costly for real-time deployment. We introduce SCOUT: Scene Graph-Based Exploration with Learned Utility for Open-World Interactive Object Search, a novel method that searches directly over 3D scene graphs by assigning utility scores to rooms, frontiers, and objects using relational exploration heuristics such as room-object containment and object-object co-occurrence. To make this practical without sacrificing open-vocabulary generalization, we propose an offline procedural distillation framework that extracts…
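To make the idea of utility scoring over a scene graph concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not SCOUT's implementation: the prior tables, the `Room` class, the multiplicative combination of the two heuristics, and the 0.5 frontier discount are all hypothetical, and the numbers are hand-set rather than learned or distilled.

```python
from dataclasses import dataclass, field

# Hypothetical, hand-set priors for illustration only (not taken from the paper).
# Room-object containment: how likely a target is to be found in a room type.
ROOM_OBJECT_CONTAINMENT = {
    ("kitchen", "mug"): 0.8,
    ("living_room", "mug"): 0.3,
}
# Object-object co-occurrence: how likely a target is to be near or inside an object.
OBJECT_COOCCURRENCE = {
    ("coffee_machine", "mug"): 0.9,
    ("cabinet", "mug"): 0.7,
    ("sofa", "mug"): 0.2,
}


@dataclass
class Room:
    """A toy scene-graph room node with its child objects."""
    name: str
    objects: list = field(default_factory=list)
    has_frontier: bool = False  # True if unexplored space borders this room


def room_utility(room: Room, target: str) -> float:
    return ROOM_OBJECT_CONTAINMENT.get((room.name, target), 0.0)


def object_utility(obj: str, room: Room, target: str) -> float:
    # Assumption: combine both heuristics by simple multiplication.
    return OBJECT_COOCCURRENCE.get((obj, target), 0.0) * room_utility(room, target)


def rank_candidates(rooms: list, target: str) -> list:
    """Return (utility, kind, node_id) tuples sorted by descending utility."""
    candidates = []
    for room in rooms:
        r = room_utility(room, target)
        candidates.append((r, "room", room.name))
        if room.has_frontier:
            # Assumption: discount frontiers because their content is unobserved.
            candidates.append((0.5 * r, "frontier", room.name))
        for obj in room.objects:
            node_id = f"{room.name}/{obj}"
            candidates.append((object_utility(obj, room, target), "object", node_id))
    return sorted(candidates, key=lambda c: c[0], reverse=True)


rooms = [
    Room("kitchen", ["coffee_machine", "cabinet"], has_frontier=True),
    Room("living_room", ["sofa"]),
]
ranked = rank_candidates(rooms, "mug")
for utility, kind, node_id in ranked:
    print(f"{utility:.2f}  {kind:<8}  {node_id}")
```

In this toy setup the search would visit the kitchen first and then inspect the coffee machine before the cabinet. A learned utility model would replace the two lookup tables while keeping the same ranking loop.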
Taxonomy
Topics: Multimodal Machine Learning Applications · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques
