Luban: Building Open-Ended Creative Agents via Autonomous Embodied Verification

Yuxuan Guo; Shaohui Peng; Jiaming Guo; Di Huang; Xishan Zhang; Rui Zhang; Yifan Hao; Ling Li; Zikang Tian; Mingju Gao; Yutai Li; Yiming Gan; Shuai Liang; Zihao Zhang; Zidong Du; Qi Guo; Xing Hu; Yunji Chen

arXiv:2405.15414·cs.AI·May 27, 2024

TL;DR

Luban equips open-ended creative agents with autonomous embodied verification, modeled on human design practice, so that they can carry out creative building tasks in Minecraft; demonstrations suggest the approach could extend to robotic creation in the physical world.

Contribution

The paper presents a two-level autonomous verification framework, combining visual and pragmatic checks, that significantly improves the performance of AI agents on creative tasks.

Findings

01. Luban outperforms baselines by 33% to 100% in creative building tasks.

02. Extensive human studies validate the effectiveness of the verification approach.

03. Demonstrations show potential for physical-world robotic creation.

Abstract

Building open agents has always been the ultimate goal in AI research, and creative agents are even more enticing. Existing LLM agents excel at long-horizon tasks with well-defined goals (e.g., "mine diamonds" in Minecraft). However, they struggle with creative tasks that have open goals and abstract criteria, because they cannot bridge the gap between the two and therefore lack the feedback needed for self-improvement while solving the task. In this work, we introduce autonomous embodied verification techniques that let agents fill this gap, laying the groundwork for creative tasks. Specifically, we propose the Luban agent, which targets creative building tasks in Minecraft and is equipped with two-level autonomous embodied verification inspired by human design practices: (1) visual verification of 3D structural speculations, which come from agent-synthesized CAD modeling programs; (2) pragmatic verification of…

Taxonomy

Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · Human Motion and Animation