Planning as In-Painting: A Diffusion-Based Embodied Task Planning Framework for Environments under Uncertainty
Cheng-Fu Yang, Haoyang Xu, Te-Lin Wu, Xiaofeng Gao, Kai-Wei Chang, Feng Gao

TL;DR
This paper introduces a diffusion-based framework for embodied AI task planning that generates reliable plans from partial observations and adapts dynamically during execution, improving success in complex environments.
Contribution
It presents a novel 'planning as in-painting' method using diffusion models conditioned on language and perception, with an on-the-fly planning algorithm for better adaptability.
Findings
Achieves promising results in vision-language navigation and object manipulation.
Effectively models state trajectory and goal estimation under partial observability.
Demonstrates robustness in photorealistic virtual environments.
Abstract
Task planning for embodied AI has been one of the most challenging problems, and the community has not reached a consensus on its formulation. In this paper, we aim to tackle this problem with a unified framework consisting of an end-to-end trainable method and a planning algorithm. In particular, we propose a task-agnostic method named 'planning as in-painting'. In this method, we use a Denoising Diffusion Model (DDM) for plan generation, conditioned on both language instructions and perceptual inputs under partially observable environments. Partial observation often leads the model to hallucinate plans. Therefore, our diffusion-based method jointly models both state trajectory and goal estimation to improve the reliability of the generated plan, given the limited available information at each step. To better leverage newly discovered information along the plan execution…
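The abstract does not spell out the sampling procedure, so the following is only a generic illustration of in-painting-style conditioning in a reverse-diffusion loop, not the authors' implementation. It assumes a plan is an array of states, that the observed entries (here, only the current state) are marked by a mask, and that they are clamped back to their observed values at every denoising step while the remaining entries are filled in. The names `toy_noise_predictor` and `inpaint_plan`, the noise schedule, and the smoothing-based predictor are all hypothetical stand-ins; a real system would use a trained network conditioned on language and perception.

```python
import numpy as np

def toy_noise_predictor(x, t):
    """Hypothetical stand-in for a trained noise-prediction network.

    It treats the deviation from a locally smoothed trajectory as noise.
    A real model would also take language and perceptual conditioning.
    """
    smoothed = x.copy()
    smoothed[1:-1] = (x[:-2] + x[1:-1] + x[2:]) / 3.0
    return x - smoothed

def inpaint_plan(known_values, known_mask, steps=50, seed=0):
    """Sample a state trajectory while holding the observed entries fixed.

    known_values: (horizon, state_dim) array, meaningful where known_mask is True.
    known_mask:   boolean array of the same shape marking observed entries.
    """
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, steps)   # assumed linear noise schedule
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)

    # Start from pure Gaussian noise over the whole trajectory.
    x = rng.standard_normal(known_values.shape)
    for t in reversed(range(steps)):
        # In-painting step: overwrite observed entries with their known values.
        x = np.where(known_mask, known_values, x)
        eps = toy_noise_predictor(x, t)
        # Standard DDPM posterior mean for the reverse step.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        noise = rng.standard_normal(x.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    # Clamp once more so the returned plan agrees exactly with the observations.
    return np.where(known_mask, known_values, x)

# Usage: an 8-step plan over 2-D states where only the current state is observed.
known_values = np.zeros((8, 2))
known_mask = np.zeros((8, 2), dtype=bool)
known_values[0] = [1.0, -1.0]
known_mask[0] = True
plan = inpaint_plan(known_values, known_mask)
```

This sketch clamps clean observed values directly, which is the simplest form of the idea; in-painting samplers often re-noise the known region to the current noise level instead. Re-running `inpaint_plan` with an updated mask as new observations arrive is one way to picture replanning during execution, though whether the paper's on-the-fly algorithm works this way is not stated in the text above.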
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Reinforcement Learning in Robotics
Methods: Diffusion
