Unleashing Embodied Task Planning Ability in LLMs via Reinforcement Learning
Zhaoye Fei, Li Ji, Siyin Wang, Junhao Shi, Jingjing Gong, Xipeng Qiu

TL;DR
This paper introduces Embodied Planner-R1, a reinforcement learning framework that enables large language models to improve embodied task planning through autonomous exploration, achieving high success rates and strong generalization in text-based environments.
Contribution
The paper presents a novel reinforcement learning approach for LLMs that enhances embodied task planning without human annotations, using group rollout, sparse rewards, and interactive policy optimization.
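As a rough illustration of how group rollout and sparse rewards can combine, the sketch below scores several rollouts of the same task with a binary completion reward and normalizes each reward against its group. The normalization shown is a generic group-relative scheme chosen here as an assumption; this summary does not specify the paper's exact advantage or policy-update formula, and the function names `sparse_reward` and `group_relative_advantages` are hypothetical.

```python
from statistics import mean, pstdev


def sparse_reward(task_completed: bool) -> float:
    """Hypothetical completion-driven reward: 1.0 only if the task was finished."""
    return 1.0 if task_completed else 0.0


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each rollout's reward against its own group (assumed scheme).

    Rollouts that beat the group average get a positive advantage and the
    rest get a negative one. If every rollout has the same reward, all
    advantages are zero, so that group contributes no learning signal.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population standard deviation of the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four rollouts of the same task: two succeeded, two failed.
outcomes = [True, False, False, True]
rewards = [sparse_reward(done) for done in outcomes]
advantages = group_relative_advantages(rewards)
```

In this sketch no per-step reward shaping or human annotation is involved: the only signal is whether each rollout completed the task, compared against its siblings in the group.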
Findings
Achieves 97.78% success on ALFWorld
Achieves 79.92% success on ScienceWorld
Only a 3.66% performance drop in unseen environments
Abstract
Large Language Models (LLMs) have demonstrated remarkable capabilities across various tasks, yet they face significant challenges in embodied task planning scenarios that require continuous environmental understanding and action generation. Existing approaches generate open-loop action scripts based on static knowledge, making it difficult to learn causal relationships between actions and environmental feedback, particularly in partially observable environments. We introduce Embodied Planner-R1, a novel outcome-driven reinforcement learning framework that enables LLMs to develop interactive capabilities through autonomous exploration with minimal supervision. Our framework incorporates three key innovations: (1) Without human annotations, we employ pure reinforcement learning with group rollout, incorporating in-environment interaction through parallel exploration; (2) completion-driven…
Taxonomy
Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Artificial Intelligence in Games
