LoopNav: Benchmarking Spatial Consistency in World Models
Kewei Lian, Shaofei Cai, Yitao Liang, Anji Liu

TL;DR
LoopNav introduces a new dataset and benchmark for evaluating spatial consistency in world models, emphasizing long-range spatial coherence in open-world navigation scenarios.
Contribution
LoopNav provides a large-scale dataset and a novel Scene Graph Consistency Score for systematically assessing spatial consistency in world models.
Findings
250 hours of Minecraft navigation videos collected
Scene Graph Consistency Score effectively measures spatial coherence
Open-sourced dataset, benchmark, and code for future research
Abstract
The ability to simulate the world in a spatially consistent manner is a crucial requirement for effective world models. Such a model enables high-quality visual generation and also ensures the reliability of world models for downstream tasks such as simulation and planning. It must not only retain long-horizon observational information, but also enable the construction of explicit or implicit internal spatial representations. However, existing datasets do not explicitly enforce spatial consistency constraints, limiting the ability both to systematically evaluate this capability and to learn it through data-driven approaches. Furthermore, most existing benchmarks primarily emphasize visual coherence or generation quality, neglecting the requirement of long-range spatial consistency. To bridge this gap, we propose LoopNav, a dataset and corresponding benchmark centered on loop-based…
