NeoVerse: Enhancing 4D World Model with in-the-wild Monocular Videos

Yuxue Yang; Lue Fan; Ziqi Shi; Junran Peng; Feng Wang; Zhaoxiang Zhang

arXiv:2601.00393 · cs.CV · March 27, 2026

TL;DR

NeoVerse is a scalable 4D world model that performs 4D reconstruction and novel-trajectory video generation from in-the-wild monocular videos. It avoids the expensive multi-view 4D data and cumbersome training pre-processing that limit earlier methods, and reports state-of-the-art results on standard reconstruction and generation benchmarks.

Contribution

NeoVerse presents a framework whose full pipeline scales to diverse in-the-wild monocular videos, built on pose-free feed-forward 4D reconstruction and online simulation of monocular degradation patterns.

Findings

01. Achieves state-of-the-art performance on standard reconstruction and generation benchmarks.

02. Generalizes across diverse domains and supports a range of downstream applications.

03. Works directly on in-the-wild monocular videos, without camera poses as input or cumbersome training pre-processing.

Abstract

In this paper, we propose NeoVerse, a versatile 4D world model that is capable of 4D reconstruction, novel-trajectory video generation, and rich downstream applications. We first identify a common limitation of scalability in current 4D world modeling methods, caused either by expensive and specialized multi-view 4D data or by cumbersome training pre-processing. In contrast, our NeoVerse is built upon a core philosophy that makes the full pipeline scalable to diverse in-the-wild monocular videos. Specifically, NeoVerse features pose-free feed-forward 4D reconstruction, online monocular degradation pattern simulation, and other well-aligned techniques. These designs empower NeoVerse with versatility and generalization to various domains. Meanwhile, NeoVerse achieves state-of-the-art performance in standard reconstruction and generation benchmarks. Our project page is available at…

Code & Models

Models

Yuppie1204/NeoVerse (Hugging Face model · 1.5k downloads · 4 likes)

Taxonomy

Topics: Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · Advanced Image Processing Techniques