From Building Blocks to Planning: Multi-Step Spatial Reasoning in LLMs with Reinforcement Learning

Amir Tahmasbi; Sadegh Majidi; Kazem Taram; Aniket Bera

arXiv:2512.24532 · cs.AI · January 1, 2026

TL;DR

This paper introduces a two-stage method for multi-step spatial reasoning in large language models: supervised fine-tuning on elementary spatial transformations, followed by reinforcement learning that composes them into multi-step plans. In puzzle environments, it outperforms baseline models and trains faster and more stably than end-to-end RL.

Contribution

The paper presents a two-stage approach that decomposes spatial reasoning into atomic building blocks (elementary transformations such as rotation, translation, and scaling) and their composition, improving LLMs' multi-step planning abilities.
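To make the idea of atomic blocks and their composition concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's code: the grid representation (a list of equal-length strings), the function names, and the `.` fill character are all hypothetical.

```python
from typing import Callable, List

# Assumption: an ASCII-art scene is a list of equal-length strings, one per row.
Grid = List[str]


def rotate_cw(grid: Grid) -> Grid:
    """Rotate the grid 90 degrees clockwise."""
    width = len(grid[0])
    return ["".join(row[c] for row in reversed(grid)) for c in range(width)]


def translate_right(grid: Grid, dx: int, fill: str = ".") -> Grid:
    """Shift every row dx >= 0 cells to the right, padding on the left and clipping."""
    width = len(grid[0])
    return [(fill * dx + row)[:width] for row in grid]


def scale(grid: Grid, k: int) -> Grid:
    """Enlarge the grid by an integer factor k along both axes."""
    return ["".join(ch * k for ch in row) for row in grid for _ in range(k)]


def compose(*steps: Callable[[Grid], Grid]) -> Callable[[Grid], Grid]:
    """Chain atomic transformations into one multi-step plan, applied left to right."""
    def run(grid: Grid) -> Grid:
        for step in steps:
            grid = step(grid)
        return grid
    return run


if __name__ == "__main__":
    start = ["#.", ".."]
    plan = compose(rotate_cw, lambda g: scale(g, 2))
    print("\n".join(plan(start)))
```

Each function plays the role of one building block; `compose` stands in for the multi-step plan that, in the paper's setup, a learned policy would have to choose.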

Findings

1. Outperforms baseline models in spatial reasoning tasks
2. Faster convergence and more stable training than end-to-end RL
3. Attention analysis shows improved spatial understanding

Abstract

Spatial reasoning in large language models (LLMs) has gained increasing attention due to applications in navigation and planning. Despite strong general language capabilities, LLMs still struggle with spatial transformations and multi-step planning in structured environments. We propose a two-stage approach that decomposes spatial reasoning into atomic building blocks and their composition. First, we apply supervised fine-tuning on elementary spatial transformations, such as rotation, translation, and scaling, to equip the model with basic spatial physics. We then freeze this physics-aware model and train lightweight LoRA adapters within the GRPO framework to learn policies that compose these building blocks for multi-step planning in puzzle-based environments, in a closed-loop manner. To support this pipeline, we synthesize an ASCII-art dataset and construct a corresponding ASCII-based…
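The abstract names GRPO as the reinforcement learning framework for the second stage. As a rough illustration of its central step, the sketch below computes group-relative advantages: several rollouts are sampled for the same prompt, and each reward is normalized against the group's mean and standard deviation. The function name, the `eps` term, the use of the population standard deviation, and the 0/1 "puzzle solved" reward are assumptions for illustration; the paper's reward design and training code are not shown in this excerpt.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize each rollout's reward against its own sampling group.

    rewards: one scalar reward per rollout sampled for the same prompt.
    eps: small constant (an assumption) that avoids division by zero
         when every rollout in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations may differ here
    return [(r - mu) / (sigma + eps) for r in rewards]


if __name__ == "__main__":
    # Hypothetical group of four rollouts: reward 1.0 if the puzzle was solved, else 0.0.
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```

Rollouts that beat their group's average receive a positive advantage and the rest a negative one, so no separate value network is needed to supply a baseline.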

Taxonomy

Topics: Multimodal Machine Learning Applications · Constraint Satisfaction and Optimization · AI-based Problem Solving and Planning