SG-CoT: An Ambiguity-Aware Robotic Planning Framework using Scene Graph Representations
Akshat Rana, Peeyush Agarwal, K.P.S. Rana, Amarjit Malhotra

TL;DR
SG-CoT is a framework for robotic planning in which a language model iteratively queries a scene graph representation of the environment to detect and resolve ambiguities, leading to more reliable and generalizable robot behavior.
Contribution
It introduces a two-stage scene graph-based framework enabling LLMs to detect, clarify, and resolve ambiguities in robotic planning tasks.
Findings
10%+ improvement in question accuracy
4%+ success rate increase in single-agent environments
15%+ success rate increase in multi-agent environments
Abstract
Ambiguity poses a major challenge to large language models (LLMs) used as robotic planners. In this letter, we present Scene Graph-Chain-of-Thought (SG-CoT), a two-stage framework where LLMs iteratively query a scene graph representation of the environment to detect and clarify ambiguities. First, a structured scene graph representation of the environment is constructed from input observations, capturing objects, their attributes, and relationships with other objects. Second, the LLM is equipped with retrieval functions to query portions of the scene graph that are relevant to the provided instruction. This grounds the reasoning process of the LLM in the observation, increasing the reliability of robotic planners under ambiguous situations. SG-CoT also allows the LLM to identify the source of ambiguity and pose a relevant disambiguation question to the user or another robot. Extensive…
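The two stages described in the abstract can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation: the `SceneGraph` class, the retrieval functions `find_objects`, `get_attributes`, and `get_relations`, and the `clarification_question` helper are all hypothetical names chosen for illustration. In particular, a simple rule (more than one object matches the requested category) stands in for the LLM's reasoning about ambiguity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SceneGraph:
    """Toy scene graph (hypothetical structure, not the paper's)."""
    # object id -> attributes, e.g. {"category": "cup", "color": "red"}
    objects: dict[str, dict[str, str]] = field(default_factory=dict)
    # (subject id, relation, object id) triples
    relations: list[tuple[str, str, str]] = field(default_factory=list)

    # Retrieval functions of the kind a planner could call to inspect
    # only the parts of the graph relevant to an instruction.
    def find_objects(self, category: str) -> list[str]:
        """Return ids of all objects of the given category, sorted."""
        return sorted(
            oid for oid, attrs in self.objects.items()
            if attrs.get("category") == category
        )

    def get_attributes(self, object_id: str) -> dict[str, str]:
        """Return a copy of the attributes of one object."""
        return dict(self.objects[object_id])

    def get_relations(self, object_id: str) -> list[tuple[str, str, str]]:
        """Return every relation triple that mentions the object."""
        return [r for r in self.relations if object_id in (r[0], r[2])]


def clarification_question(graph: SceneGraph, category: str) -> Optional[str]:
    """Rule-based stand-in for the LLM: ask a question if the referent
    is ambiguous, otherwise return None."""
    candidates = graph.find_objects(category)
    if len(candidates) <= 1:
        return None  # zero or one match: nothing to disambiguate
    # Assumption: color is the distinguishing attribute in this toy scene.
    options = [
        f"the {graph.get_attributes(oid).get('color', 'unknown')} {category}"
        for oid in candidates
    ]
    return f"Which {category} do you mean: " + " or ".join(options) + "?"


# Stage 1 (assumed already done): a scene graph built from observations.
graph = SceneGraph(
    objects={
        "cup_1": {"category": "cup", "color": "red"},
        "cup_2": {"category": "cup", "color": "blue"},
        "table_1": {"category": "table"},
    },
    relations=[("cup_1", "on", "table_1"), ("cup_2", "on", "table_1")],
)

# Stage 2: query the graph for the instruction "pick up the cup".
question = clarification_question(graph, "cup")
print(question)
```

Because two cups match the instruction, the sketch returns a question that names the source of the ambiguity; for an unambiguous category such as the table it returns `None` and planning could proceed directly.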
Taxonomy
Topics: Multimodal Machine Learning Applications · AI-based Problem Solving and Planning · Artificial Intelligence in Games
