SceneFoundry: Generating Interactive Infinite 3D Worlds

ChunTeng Chen; YiChen Hsu; YiWen Liu; WeiFang Sun; TsaiChing Ni; ChunYi Lee; Min Sun; and YuanFu Yang

arXiv:2601.05810·cs.CV·January 19, 2026

Open Access

TL;DR

SceneFoundry is a novel framework that automatically generates large, interactive 3D environments with articulated furniture and diverse layouts, facilitating robotic training and embodied AI research.

Contribution

SceneFoundry introduces a language-guided diffusion approach that uses differentiable guidance to create functionally realistic, navigable 3D worlds from natural language prompts.
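As a rough illustration of what "differentiable guidance" can mean in a sampling loop, the toy sketch below nudges each denoised layout estimate along the negative gradient of a differentiable overlap cost. Everything in it is an assumption made for illustration: the circular object footprints, the `overlap_cost_and_grad` penalty, the hand-written `toy_denoiser` (a stand-in for a trained diffusion model), the noise schedule, and the `guidance_scale` value. None of it is taken from SceneFoundry's implementation.

```python
import numpy as np

# Hypothetical "learned" layout that the stand-in denoiser pulls samples toward.
# The three footprints overlap on purpose so that guidance has something to fix.
PRIOR = np.array([[0.0, 0.0], [0.5, 0.0], [0.0, 0.5]])
RADII = np.array([0.4, 0.4, 0.4])


def overlap_cost_and_grad(centers, radii):
    """Sum of squared penetration depths between circles, and its gradient."""
    cost = 0.0
    grad = np.zeros_like(centers)
    n = len(centers)
    for i in range(n):
        for j in range(i + 1, n):
            diff = centers[i] - centers[j]
            dist = np.linalg.norm(diff) + 1e-9  # avoid division by zero
            pen = radii[i] + radii[j] - dist    # > 0 means the circles overlap
            if pen > 0:
                cost += pen ** 2
                # d(pen^2)/d(centers[i]) = -2 * pen * diff / dist
                g = -2.0 * pen * diff / dist
                grad[i] += g
                grad[j] -= g
    return cost, grad


def toy_denoiser(x, noise_level):
    """Stand-in for a trained model: blends the sample toward PRIOR."""
    return PRIOR + noise_level * (x - PRIOR)


def guided_sampling(denoise, radii, steps=200, guidance_scale=0.2, seed=0):
    """Toy reverse process with a gradient-guidance step on each estimate."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(len(radii), 2))        # start from pure noise
    for t in range(steps, 0, -1):
        x0_hat = denoise(x, t / steps)          # estimate of the clean layout
        _, grad = overlap_cost_and_grad(x0_hat, radii)
        x0_hat = x0_hat - guidance_scale * grad  # guidance: descend the cost
        next_level = (t - 1) / steps            # reaches 0 on the final step
        x = x0_hat + 0.1 * next_level * rng.normal(size=x.shape)
    return x


if __name__ == "__main__":
    for scale in (0.0, 0.2):
        layout = guided_sampling(toy_denoiser, RADII, guidance_scale=scale)
        cost, _ = overlap_cost_and_grad(layout, RADII)
        print(f"guidance_scale={scale}: overlap cost = {cost:.4f}")
```

Running the script compares an unguided run (`guidance_scale=0.0`) with a guided one. In this toy setup the guided layout ends with a lower overlap cost, because each estimate is pushed away from collisions before noise is re-added.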

Findings

01. Generates structurally valid and semantically coherent environments

02. Enables scalable robotic training in diverse scene types

03. Maintains physical usability with articulated objects and navigable spaces

Abstract

The ability to automatically generate large-scale, interactive, and physically realistic 3D environments is crucial for advancing robotic learning and embodied intelligence. However, existing generative approaches often fail to capture the functional complexity of real-world interiors, particularly those containing articulated objects with movable parts essential for manipulation and navigation. This paper presents SceneFoundry, a language-guided diffusion framework that generates apartment-scale 3D worlds with functionally articulated furniture and semantically diverse layouts for robotic training. From natural language prompts, an LLM module controls floor layout generation, while diffusion-based posterior sampling efficiently populates the scene with articulated assets from large-scale 3D repositories. To ensure physical usability, SceneFoundry employs differentiable guidance…

Taxonomy

Topics: Social Robot Interaction and HRI · Multimodal Machine Learning Applications · Robot Manipulation and Learning