Tokenizing Buildings: A Transformer for Layout Synthesis

Manuel Ladron de Guevara; Jinmo Rhee; Ardavan Bidgoli; Vaidas Razgaitis; Michael Bergin

arXiv:2512.04832·cs.CV·April 8, 2026

TL;DR

This paper presents the Small Building Model (SBM), a Transformer-based model for layout synthesis in Building Information Modeling (BIM) scenes. It unifies heterogeneous features of architectural elements into token sequences that support both retrieval and generation of building layouts.

Contribution

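The paper introduces a tokenization method for architectural elements and a single Transformer backbone trained in two modes: an encoder-only pathway that produces room embeddings for semantic retrieval, and an encoder-decoder pipeline that predicts room entities autoregressively for layout synthesis.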

Findings

01. SBM learns compact, semantically meaningful room embeddings.

02. SBM outperforms baselines in generating functionally sound layouts.

03. SBM achieves fewer collisions and better navigability in generated layouts.

Abstract

We introduce Small Building Model (SBM), a Transformer-based architecture for layout synthesis in Building Information Modeling (BIM) scenes. We address the question of how to tokenize buildings by unifying heterogeneous feature sets of architectural elements into sequences while preserving compositional structure. Such feature sets are represented as a sparse attribute-feature matrix that captures room properties. We then design a unified embedding module that learns joint representations of categorical and possibly correlated continuous feature groups. Lastly, we train a single Transformer backbone in two modes: an encoder-only pathway that yields high-fidelity room embeddings, and an encoder-decoder pipeline for autoregressive prediction of residential room entities, referred to as Data-Driven Entity Prediction (DDEP). Experiments across retrieval and generative layout synthesis show…
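
To make the idea of a sparse attribute-feature matrix concrete, here is a minimal sketch of how heterogeneous elements might be laid out as rows over a shared set of feature columns. It is an illustration only, not the paper's implementation: the feature names (`FEATURES`), the category vocabulary (`CATEGORIES`), the function `tokenize_room`, and the sample room are all hypothetical.

```python
import math

# Hypothetical feature schema: the column order of the attribute-feature matrix.
FEATURES = ["category", "x", "y", "width", "height", "swing_angle"]
# Hypothetical categorical vocabulary for element types.
CATEGORIES = {"wall": 0, "door": 1, "bed": 2}


def tokenize_room(elements):
    """Return (matrix, mask) with one row per element and one column per feature.

    Features an element does not define are stored as NaN in the matrix and
    flagged False in the mask; these unfilled slots are what make the
    matrix sparse.
    """
    matrix, mask = [], []
    for element in elements:
        row, present = [], []
        for name in FEATURES:
            if name in element:
                value = element[name]
                if name == "category":
                    value = CATEGORIES[value]  # map the label to an integer id
                row.append(float(value))
                present.append(True)
            else:
                row.append(math.nan)
                present.append(False)
        matrix.append(row)
        mask.append(present)
    return matrix, mask


# Assumed sample data: three elements with different feature sets.
room = [
    {"category": "wall", "x": 0.0, "y": 0.0, "width": 4.0, "height": 0.2},
    {"category": "door", "x": 1.0, "y": 0.0, "width": 0.9, "swing_angle": 90.0},
    {"category": "bed", "x": 2.0, "y": 1.5, "width": 1.6, "height": 2.0},
]
matrix, mask = tokenize_room(room)
```

In a sketch like this, each row would become one token. A downstream embedding module could embed the categorical column with a lookup table, project the continuous columns, and use the mask to ignore unfilled slots. How SBM actually groups and embeds features is described in the paper, not here.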
