APT: Architectural Planning and Text-to-Blueprint Construction Using Large Language Models for Open-World Agents
Jun Yu Chen, Tao Gao

TL;DR
This paper introduces APT, a framework using large language models to enable autonomous agents to design and build complex structures in Minecraft, demonstrating advanced spatial reasoning, lifelong learning, and creative construction capabilities.
Contribution
The paper presents a novel LLM-driven approach to architectural planning and construction in open-world environments that combines multimodal inputs with memory modules, and it introduces a new benchmark for evaluation.
Findings
Agents accurately interpret complex instructions and build detailed structures.
Memory modules significantly improve construction performance.
Emergent scaffolding behaviors suggest advanced problem-solving abilities.
Abstract
We present APT, an advanced Large Language Model (LLM)-driven framework that enables autonomous agents to construct complex and creative structures within the Minecraft environment. Unlike previous approaches that primarily concentrate on skill-based open-world tasks or rely on image-based diffusion models for generating voxel-based structures, our method leverages the intrinsic spatial reasoning capabilities of LLMs. By employing chain-of-thought decomposition along with multimodal inputs, the framework generates detailed architectural layouts and blueprints that the agent can execute under zero-shot or few-shot learning scenarios. Our agent incorporates both memory and reflection modules to facilitate lifelong learning, adaptive refinement, and error correction throughout the building process. To rigorously evaluate the agent's performance in this emerging research area, we introduce…
Taxonomy
Topics: Semantic Web and Ontologies
Methods: Diffusion
