LoopNav: Benchmarking Spatial Consistency in World Models

Kewei Lian; Shaofei Cai; Yitao Liang; Anji Liu

arXiv:2505.22976 · cs.CV · May 11, 2026

TL;DR

LoopNav introduces a new dataset and benchmark for evaluating spatial consistency in world models, emphasizing long-range spatial coherence in open-world navigation scenarios.

Contribution

LoopNav provides a large-scale dataset and a novel Scene Graph Consistency Score for systematically assessing spatial consistency in world models.

Findings

1. 250 hours of Minecraft navigation videos collected

2. Scene Graph Consistency Score effectively measures spatial coherence

3. Open-sourced dataset, benchmark, and code for future research

Abstract

The ability to simulate the world in a spatially consistent manner is a crucial requirement for effective world models. Such a model enables high-quality visual generation and also ensures the reliability of world models for downstream tasks such as simulation and planning. It must not only retain long-horizon observational information, but also enable the construction of explicit or implicit internal spatial representations. However, existing datasets do not explicitly enforce spatial consistency constraints, limiting the ability both to evaluate this capability systematically and to learn it through data-driven approaches. Furthermore, most existing benchmarks primarily emphasize visual coherence or generation quality, neglecting the requirement of long-range spatial consistency. To bridge this gap, we propose LoopNav, a dataset and corresponding benchmark centered on loop-based…
