Spatial Reasoning with Denoising Models

Christopher Wewer; Bart Pogodzinski; Bernt Schiele; Jan Eric Lenssen

arXiv:2502.21075·cs.CV·June 12, 2025

TL;DR

This paper introduces Spatial Reasoning Models (SRMs), a framework that uses denoising generative models to reason over sets of continuous spatial variables. It targets the hallucination that diffusion and flow matching models exhibit on complex distributions, and it reports reasoning accuracy rising from less than 1% to over 50%.

Contribution

The paper presents SRMs as a new approach for spatial reasoning with generative models, including a benchmark for evaluating reasoning quality and insights into generation order and sampling strategies.

Findings

01 Order of generation can be predicted by the denoising network

02 Reasoning accuracy improved from less than 1% to over 50%

03 Benchmark datasets and tools are provided for future research

Abstract

We introduce Spatial Reasoning Models (SRMs), a framework to perform reasoning over sets of continuous variables via denoising generative models. SRMs infer continuous representations on a set of unobserved variables, given observations on observed variables. Current generative models on spatial domains, such as diffusion and flow matching models, often collapse to hallucination in the case of complex distributions. To measure this, we introduce a set of benchmark tasks that test the quality of complex reasoning in generative models and can quantify hallucination. The SRM framework allows us to report key findings about the importance of sequentialization in generation, the associated order, and the sampling strategies during training. It demonstrates, for the first time, that the order of generation can successfully be predicted by the denoising network itself. Using these findings, we can…
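To make the idea of sequentialized generation with a network-chosen order more concrete, here is a minimal toy sketch. It is not the authors' implementation. The names `toy_denoiser` and `sequential_denoise`, the one-step-per-variable schedule, and the distance-based uncertainty are all assumptions made for illustration. A real SRM would use a learned denoising network and a proper noise schedule. The sketch only shows the control flow: each variable carries its own noise level, observed variables start clean, and at every step the variable the denoiser is most certain about is resolved next.

```python
import numpy as np

def toy_denoiser(x, t):
    """Hypothetical stand-in for a learned denoising network.

    Returns a clean-value estimate and an uncertainty for every variable.
    Here the estimate is a linear interpolation between clean variables
    (t == 0) on a 1D chain, and the uncertainty is the distance to the
    nearest clean variable. Assumes at least one variable is clean.
    """
    known = np.flatnonzero(t == 0.0)            # indices of clean variables
    idx = np.arange(len(x))
    x0_hat = np.interp(idx, known, x[known])    # estimate of clean values
    uncertainty = np.abs(idx[:, None] - known[None, :]).min(axis=1).astype(float)
    return x0_hat, uncertainty

def sequential_denoise(x_obs, observed, denoiser, seed=0):
    """Greedy sequential generation: resolve the least uncertain variable first."""
    rng = np.random.default_rng(seed)
    # Unobserved variables start as pure noise, observed ones keep their value.
    x = np.where(observed, x_obs, rng.standard_normal(len(x_obs)))
    t = np.where(observed, 0.0, 1.0)            # per-variable noise level
    order = []
    while np.any(t > 0):
        x0_hat, unc = denoiser(x, t)
        # Only still-noisy variables are candidates for the next step.
        i = int(np.argmin(np.where(t > 0, unc, np.inf)))
        x[i] = x0_hat[i]                        # simplification: denoise in one step
        t[i] = 0.0
        order.append(i)
    return x, order

if __name__ == "__main__":
    values = np.array([0.0, 0.0, 0.0, 0.0, 4.0])   # placeholders at unobserved slots
    mask = np.array([True, False, False, False, True])
    result, order = sequential_denoise(values, mask, toy_denoiser)
    print(order, result)
```

In this toy setting the two endpoints are observed, so the variables adjacent to them are filled in before the middle one is needed as context. The order comes from the denoiser's own uncertainty output rather than from a fixed schedule, which is the aspect of the abstract the sketch is meant to illustrate.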
