Socratic Planner: Self-QA-Based Zero-Shot Planning for Embodied Instruction Following
Suyeon Shin, Sujin Jeon, Junghyun Kim, Gi-Cheon Kang, Byoung-Tak Zhang

TL;DR
The Socratic Planner introduces a zero-shot, self-questioning approach using large language models to generate and adapt plans for embodied instruction following, outperforming existing methods on complex tasks and demonstrating real-world deployment.
Contribution
It presents a novel self-QA-based zero-shot planning method that generates and adjusts plans without training, enhancing embodied instruction following in complex environments.
Findings
Outperforms state-of-the-art models on the ALFRED benchmark
Excels in long-horizon, complex inference tasks
Successfully deployed on a physical robot for real-world tasks
Abstract
Embodied Instruction Following (EIF) is the task of executing natural language instructions by navigating and interacting with objects in interactive environments. A key challenge in EIF is compositional task planning, typically addressed through supervised learning or few-shot in-context learning with labeled data. To this end, we introduce the Socratic Planner, a self-QA-based zero-shot planning method that infers an appropriate plan without any further training. The Socratic Planner first facilitates self-questioning and answering by the Large Language Model (LLM), which in turn helps generate a sequence of subgoals. While executing the subgoals, an embodied agent may encounter unexpected situations, such as unforeseen obstacles. The Socratic Planner then adjusts plans based on dense visual feedback through a visually-grounded re-planning mechanism. Experiments demonstrate the…
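The loop the abstract describes (self-questioning, subgoal generation, then re-planning from visual feedback) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompts, the number of question rounds, the restart-from-new-plan policy, and every name here (`LLM`, `self_qa`, `plan`, `run`, `execute`, `observe`) are assumptions, since the summary gives none of those details. The LLM is modeled as any callable that maps a prompt string to a text reply.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: any function that takes a prompt and returns text.
LLM = Callable[[str], str]


def _format_qa(qa: List[Tuple[str, str]]) -> str:
    """Render accumulated question/answer pairs as prompt context."""
    return "\n".join(f"Q: {q}\nA: {a}" for q, a in qa)


def self_qa(instruction: str, llm: LLM, n_rounds: int = 3) -> List[Tuple[str, str]]:
    """Have the model ask itself questions about the task and answer them."""
    qa: List[Tuple[str, str]] = []
    for _ in range(n_rounds):
        context = _format_qa(qa)
        question = llm(f"Task: {instruction}\n{context}\nAsk one useful question:")
        answer = llm(f"Task: {instruction}\n{context}\nQ: {question}\nA:")
        qa.append((question, answer))
    return qa


def plan(instruction: str, qa: List[Tuple[str, str]], llm: LLM,
         feedback: str = "") -> List[str]:
    """Turn the task, the self-QA context, and optional feedback into subgoals."""
    prompt = (f"Task: {instruction}\n{_format_qa(qa)}\n{feedback}\n"
              "List the remaining subgoals, one per line:")
    return [line.strip() for line in llm(prompt).splitlines() if line.strip()]


def run(instruction: str, llm: LLM,
        execute: Callable[[str], bool],   # assumed: True if the subgoal succeeded
        observe: Callable[[], str],       # assumed: text description of the scene
        max_replans: int = 2) -> bool:
    """Execute subgoals in order; on failure, re-plan using visual feedback."""
    qa = self_qa(instruction, llm)
    subgoals = plan(instruction, qa, llm)
    replans, i = 0, 0
    while i < len(subgoals):
        if execute(subgoals[i]):
            i += 1
            continue
        if replans == max_replans:
            return False
        replans += 1
        feedback = f"Failed subgoal: {subgoals[i]}\nScene: {observe()}"
        # The new plan covers the remaining work, so execution restarts at its head.
        subgoals, i = plan(instruction, qa, llm, feedback), 0
    return True
```

In a real system `execute` would wrap the agent's low-level controller and `observe` would wrap a captioning or detection model. Keeping both as plain callables keeps the planning loop independent of any particular simulator or robot.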
Taxonomy
Topics: Education and Critical Thinking Development · Teaching and Learning Programming
