2.5-D Decomposition for LLM-Based Spatial Construction
Paul Whitten, Li-Jen Chen, Sharath Baddam

TL;DR
This paper introduces a 2.5-D decomposition approach for LLM-based autonomous construction: the LLM plans only in the horizontal plane while a deterministic executor computes vertical placement, which removes a class of coordinate errors and raises structural accuracy in building tasks.
Contribution
The paper proposes a neuro-symbolic pipeline that separates horizontal planning from vertical execution, substantially improving accuracy and robustness in 3D construction tasks using LLMs.
Findings
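GPT-4o-mini with the pipeline achieves 94.6% mean structural accuracy on the Build What I Mean benchmark (160 rounds, 12 independent runs).
Outperforms GPT-4o (90.3%) and the best competing system (76.3%), and comes within 3.0 percentage points of the 97.6% ceiling imposed by architect-agent errors.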
Demonstrates successful transfer to edge hardware with minimal performance loss.
Abstract
Autonomous systems that build structures from natural-language instructions need reliable spatial reasoning, yet large language models (LLMs) make systematic coordinate errors when generating three-dimensional block placements. We present a neuro-symbolic pipeline based on 2.5-D decomposition: the LLM plans in the two-dimensional horizontal plane while a deterministic executor computes all vertical placement from column occupancy, eliminating an entire class of errors. On the Build What I Mean benchmark (160 rounds), GPT-4o-mini with this pipeline achieves 94.6% mean structural accuracy across 12 independent runs, within 3.0 percentage points of the 97.6% ceiling imposed by architect-agent errors that no builder-side improvement can address. This outperforms both GPT-4o at 90.3% and the best competing system at 76.3%. A controlled ablation confirms that 2.5-D decomposition is…
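The division of labor described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `execute_plan`, the tuple layout, the axis convention (x and y horizontal, z vertical), and the ground level of 0 are all assumptions made for illustration. The planner (the LLM, in the paper) supplies only horizontal positions, and the executor derives each height from how many blocks already occupy that column.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical types: a 2-D plan entry is (x, y, color); the executor
# adds the vertical coordinate z to produce (x, y, z, color).
PlanEntry = Tuple[int, int, str]
Block = Tuple[int, int, int, str]


def execute_plan(plan: List[PlanEntry]) -> List[Block]:
    """Assign heights deterministically from column occupancy.

    Assumes blocks stack without gaps and the ground level is z = 0.
    """
    heights: Dict[Tuple[int, int], int] = defaultdict(int)  # blocks per column
    placed: List[Block] = []
    for x, y, color in plan:
        z = heights[(x, y)]          # next free cell in this column
        placed.append((x, y, z, color))
        heights[(x, y)] = z + 1      # the column is now one block taller
    return placed


# The plan never mentions a height; stacking follows from placement order.
plan = [(0, 0, "red"), (0, 0, "blue"), (1, 0, "red"), (0, 0, "green")]
blocks = execute_plan(plan)
for block in blocks:
    print(block)
```

Because the planner never emits a vertical coordinate in this sketch, it cannot place a block at the wrong height or leave one floating, which is the class of error the abstract says the decomposition eliminates.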
