NuiWorld: Exploring a Scalable Framework for End-to-End Controllable World Generation
Han-Hung Lee, Cheng-Yu Yang, Yu-Lun Liu, Angel X. Chang

TL;DR
NuiWorld introduces a scalable, controllable framework for end-to-end world generation that synthesizes diverse scenes from limited data, leveraging 3D reconstruction, scene chunking, and pseudo sketch labels to improve fidelity and efficiency.
Contribution
The paper presents a novel bootstrapping strategy and scene representation that address data scarcity, scalability, and controllability in end-to-end world generation.
Findings
Effective scene synthesis from few input images.
Reduced token length enables scalable scene generation.
Demonstrated controllability via pseudo sketch labels.
Abstract
World generation is a fundamental capability for applications like video games, simulation, and robotics. However, existing approaches face three main obstacles: controllability, scalability, and efficiency. End-to-end scene generation models have been limited by data scarcity, while object-centric generation approaches rely on fixed-resolution representations, which degrades fidelity for larger scenes. Training-free approaches, while flexible, are often slow and computationally expensive at inference time. We present NuiWorld, a framework that attempts to address these challenges. To overcome data scarcity, we propose a generative bootstrapping strategy that starts from a few input images. Leveraging recent 3D reconstruction and expandable scene generation techniques, we synthesize scenes of varying sizes and layouts, producing enough data to train an end-to-end model. Furthermore, our…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · 3D Shape Modeling and Analysis · Human Motion and Animation
