ImmerseGen: Agent-Guided Immersive World Generation with Alpha-Textured Proxies
Jinyan Yuan, Bangbang Yang, Keke Wang, Panwang Pan, Lin Ma, Xuehai Zhang, Xiao Liu, Zhaopeng Cui, Yuewen Ma

TL;DR
ImmerseGen is an agent-guided framework that builds photorealistic VR worlds from lightweight alpha-textured proxies and multimodal scene enhancements, keeping scenes compact enough for real-time rendering on mobile VR headsets.
Contribution
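The paper introduces a hierarchical scene representation built from lightweight geometric proxies with synthesized RGBA textures, together with VLM-based agents that use semantic grid-based analysis for asset placement and add multimodal enhancements. This decouples realism from exhaustive geometric modeling and makes VR scene generation more efficient.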
Findings
Achieves higher photorealism and spatial coherence than existing methods.
Enables real-time rendering on mobile VR headsets.
Provides diverse, visually coherent virtual worlds with multimodal features.
Abstract
Automating immersive VR scene creation remains a primary research challenge. Existing methods typically rely on complex geometry with post-simplification, resulting in inefficient pipelines or limited realism. In this paper, we introduce ImmerseGen, a novel agent-guided framework for compact and photorealistic world generation that decouples realism from exhaustive geometric modeling. ImmerseGen represents scenes as hierarchical compositions of lightweight geometric proxies with synthesized RGBA textures, facilitating real-time rendering on mobile VR headsets. We propose terrain-conditioned texturing for base world generation, combined with context-aware texturing for scenery, to produce diverse and visually coherent worlds. VLM-based agents employ semantic grid-based analysis for precise asset placement and enrich scenes with multimodal enhancements such as visual dynamics and ambient…
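To illustrate why an RGBA-textured proxy can stand in for detailed geometry, here is a minimal sketch of standard back-to-front alpha compositing (the "over" operator) along a single view ray. It is not the authors' implementation: the function names `over` and `composite_back_to_front`, the straight (non-premultiplied) alpha convention, and the `(depth, rgba)` layer format are all assumptions made for this example.

```python
from typing import List, Tuple

# Straight (non-premultiplied) color, all channels in [0, 1]. Assumed convention.
RGBA = Tuple[float, float, float, float]
RGB = Tuple[float, float, float]


def over(src: RGBA, dst: RGB) -> RGB:
    """Composite one RGBA texel over an opaque color using the 'over' operator."""
    r, g, b, a = src
    return (
        r * a + dst[0] * (1.0 - a),
        g * a + dst[1] * (1.0 - a),
        b * a + dst[2] * (1.0 - a),
    )


def composite_back_to_front(layers: List[Tuple[float, RGBA]], background: RGB) -> RGB:
    """Blend the texels of all proxies hit by one view ray.

    `layers` holds hypothetical (depth, rgba) pairs, one per proxy hit.
    The farthest proxy is blended first so nearer proxies cover it.
    """
    color = background
    for _, texel in sorted(layers, key=lambda layer: layer[0], reverse=True):
        color = over(texel, color)
    return color


if __name__ == "__main__":
    sky = (0.0, 0.0, 1.0)
    # A half-transparent red texel (e.g. the soft edge of foliage) over blue sky.
    print(over((1.0, 0.0, 0.0, 0.5), sky))
    # A fully transparent texel leaves the background untouched.
    print(over((1.0, 1.0, 1.0, 0.0), sky))
```

Where the texture's alpha is zero the proxy is invisible, so a flat card can show a detailed silhouette without any mesh describing it. That is the general idea behind trading geometry for texture detail.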
Taxonomy
Topics: Human Motion and Animation · Artificial Intelligence in Games · Computer Graphics and Visualization Techniques
Methods: Focus · Balanced Selection
