Learning to Draw ASCII Improves Spatial Reasoning in Language Models

Shiyuan Huang; Li Liu; Jincheng He; Leilani H. Gilpin

arXiv:2604.14641 · cs.AI · April 17, 2026

TL;DR

Training language models to construct explicit ASCII spatial layouts from descriptions enhances their spatial reasoning abilities, with benefits transferring to external benchmarks.

Contribution

We introduce Text2Space, a dataset for training models on ASCII layout construction, improving spatial reasoning in language models without requiring ASCII output at inference.

Findings

01 ASCII layout construction training improves spatial reasoning accuracy.

02 Models trained on layout construction transfer gains to external spatial benchmarks.

03 Learning to draw ASCII layouts enhances models' spatial understanding beyond training data.

Abstract

When faced with complex spatial problems, humans naturally sketch layouts to organize their thinking, and the act of drawing further sharpens their understanding. In this work, we ask whether a similar principle holds for Large Language Models (LLMs): can learning to construct explicit visual layouts from spatial descriptions instill genuine spatial understanding? We introduce Text2Space, a dataset that pairs natural language descriptions with ground-truth ASCII grid layouts and spatial QA pairs, enabling us to separate failures in constructing spatial representations from failures in reasoning over them. We adopt ASCII because it is human-readable, operates entirely within the token space of language models, and encodes spatial relations in a structurally verifiable form. Our evaluation reveals a pronounced "Read-Write Asymmetry": LLMs interpret ASCII representations effectively but…
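To make the idea of a "structurally verifiable" ASCII layout concrete, here is a minimal Python sketch. It is an illustration only, not the Text2Space format or the authors' code: the single-character object symbols, the `(row, col)` layout dictionary, the `.` filler, and the function names `render_ascii`, `parse_ascii`, and `holds` are all assumptions made for this example. The sketch writes a layout as an ASCII grid, reads it back, and checks a spatial relation directly from grid coordinates.

```python
from typing import Dict, Tuple

# Hypothetical layout format (an assumption, not the dataset's schema):
# each object is a single character mapped to its (row, col) grid cell.
Layout = Dict[str, Tuple[int, int]]


def render_ascii(layout: Layout, rows: int, cols: int, empty: str = ".") -> str:
    """Write step: draw the layout as a rows x cols ASCII grid."""
    grid = [[empty] * cols for _ in range(rows)]
    for symbol, (r, c) in layout.items():
        if not (0 <= r < rows and 0 <= c < cols):
            raise ValueError(f"{symbol!r} is outside the grid")
        if grid[r][c] != empty:
            raise ValueError(f"cell ({r}, {c}) is already occupied")
        grid[r][c] = symbol
    return "\n".join("".join(row) for row in grid)


def parse_ascii(text: str, empty: str = ".") -> Layout:
    """Read step: recover object positions from an ASCII grid."""
    layout: Layout = {}
    for r, line in enumerate(text.splitlines()):
        for c, ch in enumerate(line):
            if ch != empty:
                layout[ch] = (r, c)
    return layout


def holds(layout: Layout, a: str, relation: str, b: str) -> bool:
    """Check a spatial relation between two objects from their coordinates."""
    (ra, ca), (rb, cb) = layout[a], layout[b]
    checks = {
        "left_of": ca < cb,
        "right_of": ca > cb,
        "above": ra < rb,   # smaller row index is higher on the page
        "below": ra > rb,
    }
    return checks[relation]


# Example: "A is left of B; C is below B."
example = {"A": (0, 0), "B": (0, 2), "C": (2, 2)}
drawing = render_ascii(example, rows=3, cols=3)
print(drawing)
print(holds(parse_ascii(drawing), "A", "left_of", "B"))
```

Because the relation check runs on the parsed grid instead of the original description, a drawn layout can be scored separately from the answer to a question about it. This mirrors the separation the abstract describes between constructing a spatial representation and reasoning over it.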

Code & Models

Datasets

ShiyuanHuang/Text2Space
dataset · 109 downloads
