GRID: Scene-Graph-based Instruction-driven Robotic Task Planning
Zhe Ni, Xiaoxin Deng, Cong Tai, Xinyue Zhu, Qinghongbing Xie, Weihang Huang, Xiang Wu, Long Zeng

TL;DR
This paper introduces GRID, a scene-graph-based approach for robotic task planning that improves understanding and generalization over image-based methods, enabling efficient and accurate instruction execution in diverse environments.
Contribution
GRID leverages scene graphs and LLMs to perceive environment semantics and plan subtasks, reducing reliance on extensive multimodal data and large models.
Findings
Outperforms GPT-4 by 25.4% in subtask accuracy
Achieves 43.6% higher task accuracy than GPT-4
Operates in real time at 0.11 seconds per inference
Abstract
Recent works have shown that Large Language Models (LLMs) can facilitate the grounding of instructions for robotic task planning. Despite this progress, most existing works have primarily focused on utilizing raw images to aid LLMs in understanding environmental information. However, this approach not only limits the scope of observation but also typically necessitates extensive multimodal data collection and large-scale models. In this paper, we propose a novel approach called Graph-based Robotic Instruction Decomposer (GRID), which leverages scene graphs instead of images to perceive global scene information and iteratively plan subtasks for a given instruction. Our method encodes object attributes and relationships in graphs through an LLM and Graph Attention Networks, integrating instruction features to predict subtasks consisting of pre-defined robot actions and target objects in…
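The abstract describes a scene graph of objects with attributes and relationships, attention over that graph, and subtasks made of a pre-defined action plus a target object. The following is a minimal sketch of those pieces, not the authors' implementation. The names `SceneGraph`, `Subtask`, `ACTIONS`, and `attend` are hypothetical, the action set and feature vectors are made up for illustration, and plain dot-product attention stands in for the learned Graph Attention Network layers.

```python
import math
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    # node id -> attribute dict; edges are (source, relation, target) triples
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def neighbors(self, node_id):
        """Ids of nodes connected to node_id, in either edge direction."""
        out = [t for s, _, t in self.edges if s == node_id]
        out += [s for s, _, t in self.edges if t == node_id]
        return out


@dataclass(frozen=True)
class Subtask:
    action: str  # one of a pre-defined action set
    target: str  # id of a node in the scene graph


# Hypothetical action set, assumed for this example only.
ACTIONS = ("move_to", "pick", "place_to")


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(features, graph, node_id):
    """One attention-weighted aggregation over a node and its neighbors.

    Scores are plain dot products, an assumption standing in for the
    learned attention parameters of a real GAT layer.
    """
    ids = [node_id] + graph.neighbors(node_id)
    h = features[node_id]
    scores = [sum(a * b for a, b in zip(h, features[j])) for j in ids]
    weights = softmax(scores)
    return [
        sum(w * features[j][k] for w, j in zip(weights, ids))
        for k in range(len(h))
    ]


# Toy scene: an apple on a table, plus an unconnected robot node.
graph = SceneGraph(
    nodes={"apple": {"color": "red"}, "table": {}, "robot": {}},
    edges=[("apple", "on", "table")],
)
# Toy 2-d node features; in the paper these would come from an LLM encoder.
features = {"apple": [1.0, 0.0], "table": [0.0, 1.0], "robot": [0.5, 0.5]}

apple_embedding = attend(features, graph, "apple")
subtask = Subtask(action="pick", target="apple")
```

In this sketch a node's updated embedding is a convex combination of its own features and those of its neighbors, so an isolated node keeps its features unchanged. An iterative planner in the spirit of the abstract would predict one `Subtask` per step from such embeddings combined with instruction features, then repeat on the updated scene graph.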
Taxonomy
Topics: Robotic Path Planning Algorithms · Reinforcement Learning in Robotics · Teaching and Learning Programming
