MineAnyBuild: Benchmarking Spatial Planning for Open-world AI Agents

Ziming Wei; Bingqian Lin; Zijian Jiao; Yunshuang Nie; Liang Ma; Yuecheng Liu; Yuzheng Zhuang; Xiaodan Liang

arXiv:2505.20148 · cs.AI · September 30, 2025

TL;DR

MineAnyBuild is a benchmark for evaluating the spatial planning abilities of open-world AI agents in Minecraft. It covers spatial understanding, reasoning, creativity, and commonsense, and it exposes the limitations of current agents along with room for improvement.

Contribution

The paper introduces MineAnyBuild, a novel, expandable benchmark for assessing the spatial planning of AI agents in Minecraft, a complex open-world environment.

Findings

01 Existing MLLM agents show significant limitations in spatial planning.

02 MineAnyBuild reveals the potential for improving AI spatial reasoning.

03 The benchmark supports large-scale, multi-modal, and diverse spatial tasks.

Abstract

Spatial planning is a crucial part of spatial intelligence; it requires understanding and planning the arrangement of objects in space. AI agents with spatial planning ability can better adapt to various real-world applications, including robotic manipulation, automatic assembly, and urban planning. Recent works have attempted to construct benchmarks for evaluating the spatial intelligence of Multimodal Large Language Models (MLLMs). Nevertheless, these benchmarks primarily focus on spatial reasoning in the typical Visual Question-Answering (VQA) form, which leaves a gap between abstract spatial understanding and concrete task execution. In this work, we take a step further and build a comprehensive benchmark called MineAnyBuild, which aims to evaluate the spatial planning ability of open-world AI agents in the Minecraft game. Specifically,…

Code & Models

Repositories

mineanybuild/mineanybuild (PyTorch, official)

Videos

MineAnyBuild: Benchmarking Spatial Planning for Open-world AI Agents (SlidesLive)

Taxonomy

Topics: Multi-Agent Systems and Negotiation

Methods: Focus