Wonderful Team: Zero-Shot Physical Task Planning with Visual LLMs

Zidan Wang; Rui Shen; Bradly Stadie

arXiv:2407.19094 · cs.AI · February 5, 2025 · 2 citations

TL;DR

Wonderful Team leverages Vision Large Language Models for zero-shot high-level robotic planning directly from environment images, outperforming previous methods by integrating perception, control, and planning.

Contribution

The paper introduces a novel multi-agent VLLM framework for zero-shot robotic planning that eliminates the need for separate vision systems, enabling more integrated and effective high-level task execution.

Findings

01. Average 40% success-rate improvement on VimaBench over prior methods such as NLaP

02. 30% improvement over Trajectory Generators on drawing and wiping tasks

03. 70% improvement on semantic reasoning tasks with linguistic constraints

Abstract

We introduce Wonderful Team, a multi-agent Vision Large Language Model (VLLM) framework for executing high-level robotic planning in a zero-shot regime. In our context, zero-shot high-level planning means that for a novel environment, we provide a VLLM with an image of the robot's surroundings and a task description, and the VLLM outputs the sequence of actions necessary for the robot to complete the task. Unlike previous methods for high-level visual planning for robotic manipulation, our method uses VLLMs for the entire planning process, enabling a more tightly integrated loop between perception, control, and planning. As a result, Wonderful Team's performance on real-world semantic and physical planning tasks often exceeds methods that rely on separate vision systems. For example, we see an average 40% success rate improvement on VimaBench over prior methods such as NLaP, an average…
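The zero-shot planning interface described in the abstract (an image of the scene plus a task description in, a sequence of actions out) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Action` fields, the JSON reply format, the prompt wording, and the `fake_vllm` stand-in are all assumptions made for illustration, and a single model call is used here rather than the multi-agent structure the paper describes.

```python
import json
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One high-level step. The fields are hypothetical, not the paper's schema."""
    name: str    # e.g. "pick" or "place" (assumed action vocabulary)
    target: str  # the object or location the step refers to


# Assumption: a VLLM is modeled as any callable that takes raw image bytes
# and a text prompt, and returns the model's text reply.
VLLM = Callable[[bytes, str], str]


def plan(image: bytes, task: str, vllm: VLLM) -> List[Action]:
    """Ask the model for a plan and parse its reply into Action objects."""
    prompt = (
        "You control a robot arm. Using the attached image of the scene, "
        "reply with a JSON list of steps, each with 'name' and 'target'.\n"
        f"Task: {task}"
    )
    reply = vllm(image, prompt)
    steps = json.loads(reply)  # assumes the reply is valid JSON
    return [Action(name=s["name"], target=s["target"]) for s in steps]


def fake_vllm(image: bytes, prompt: str) -> str:
    """Stand-in for a real model call so the sketch runs offline."""
    return json.dumps([
        {"name": "pick", "target": "red block"},
        {"name": "place", "target": "blue bowl"},
    ])


actions = plan(b"", "Put the red block in the blue bowl", fake_vllm)
for step in actions:
    print(step.name, step.target)
```

Swapping `fake_vllm` for a function that calls an actual vision-language model would keep the rest of the sketch unchanged; a real system would also need to validate the reply before acting on it.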

Code & Models

Repositories

wonderful-team-robotics/wonderful_team_robotics (official)

Taxonomy

Topics: Robotics and Automated Systems

Methods: Sparse Evolutionary Training