SpaceDG: Benchmarking Spatial Intelligence under Visual Degradation

Xiaolong Zhou; Yifei Liu; Ziyang Gong; Jiarui Li; Qiyue Zhao; Muyao Niu; Yuanyuan Gao; Le Ma; Xue Yang; Hongjie Zhang; Zhihang Zhong

arXiv:2605.22536·cs.CV·May 22, 2026

TL;DR

This paper introduces SpaceDG, a large-scale dataset and benchmark for evaluating and improving the robustness of multimodal large language models in spatial reasoning under various visual degradations.

Contribution

It presents the first degradation-aware spatial understanding dataset and benchmark, along with findings on model robustness and the benefits of degradation-aware finetuning.

Findings

01

Visual degradations significantly impair spatial reasoning in MLLMs.

02

Finetuning on SpaceDG improves robustness and can surpass human performance on degraded images.

03

The dataset contains about 1 million QA pairs across nine degradation types.

Abstract

Multimodal Large Language Models (MLLMs) have made rapid progress in spatial intelligence, yet existing spatial reasoning benchmarks largely assume pristine visual inputs and overlook the degradations that commonly occur in real-world deployment, such as motion blur, low light, adverse weather, lens distortion, and compression artifacts. This raises a fundamental question: how robust is the spatial intelligence of current MLLMs when visual observations are imperfect? To answer this question, we introduce SpaceDG, the first large-scale dataset for degradation-aware spatial understanding. It is constructed with a physically grounded degradation synthesis engine that embeds the degradation formation process into 3D Gaussian Splatting (3DGS) rendering, enabling realistic simulation of nine degradation types. The resulting dataset contains approximately 1M QA pairs from nearly 1,000 indoor…
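The robustness question posed in the abstract can be made concrete with a small scoring sketch. The Python below is an illustration only, not the paper's evaluation protocol: the function names `accuracy_by_condition` and `relative_drop`, the condition labels, and the toy records are all hypothetical, and the relative accuracy drop is just one plausible way to summarize how much a degradation hurts a model compared with clean inputs.

```python
from collections import defaultdict


def accuracy_by_condition(records):
    """Compute accuracy per visual condition.

    `records` is an iterable of (condition, is_correct) pairs, where
    `condition` is a label such as "clean" or "motion_blur" (hypothetical
    labels, not taken from the SpaceDG release).
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for condition, is_correct in records:
        totals[condition] += 1
        hits[condition] += int(bool(is_correct))
    return {c: hits[c] / totals[c] for c in totals}


def relative_drop(acc, clean_key="clean"):
    """Return the fraction of clean accuracy lost under each degradation."""
    clean = acc[clean_key]
    if clean == 0:
        raise ValueError("clean accuracy is zero; relative drop is undefined")
    return {c: (clean - a) / clean for c, a in acc.items() if c != clean_key}


# Toy records made up for this illustration; they are not results from the paper.
records = [
    ("clean", True), ("clean", True), ("clean", True), ("clean", False),
    ("motion_blur", True), ("motion_blur", False),
    ("motion_blur", False), ("motion_blur", True),
    ("low_light", True), ("low_light", False),
    ("low_light", False), ("low_light", False),
]

acc = accuracy_by_condition(records)
drops = relative_drop(acc)

for condition in sorted(drops):
    print(f"{condition}: accuracy={acc[condition]:.2f}, relative drop={drops[condition]:.2f}")
```

Reporting the drop relative to clean accuracy, rather than the raw difference, keeps models with different clean baselines comparable; whether the benchmark itself reports results this way is not stated in the text above.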

Code & Models

Repositories

visionary-laboratory/SpaceDG
github

Datasets

xlzhou126/SpaceDG-Bench
dataset · 205 dl
