PSGS: Text-driven Panorama Sliding Scene Generation via Gaussian Splatting
Xin Zhang, Shen Chen, Jiale Zhou, Lei Li

TL;DR
PSGS is a novel two-stage framework that generates high-fidelity panoramic 3D scenes from text, combining semantic reasoning and Gaussian Splatting to improve scene realism and consistency.
Contribution
PSGS introduces a two-layer optimization architecture for generating semantically coherent panoramas and a panorama sliding mechanism for initializing globally consistent 3D scenes from text.
Findings
Outperforms existing panorama generation methods.
Produces more detailed and visually appealing 3D scenes.
Enhances scene coherence with depth and semantic losses.
Abstract
Generating realistic 3D scenes from text is crucial for immersive applications like VR, AR, and gaming. While text-driven approaches promise efficiency, existing methods suffer from limited 3D-text data and inconsistent multi-view stitching, resulting in overly simplistic scenes. To address this, we propose PSGS, a two-stage framework for high-fidelity panoramic scene generation. First, a novel two-layer optimization architecture generates semantically coherent panoramas: a layout reasoning layer parses text into structured spatial relationships, while a self-optimization layer refines visual details via iterative MLLM feedback. Second, our panorama sliding mechanism initializes globally consistent 3D Gaussian Splatting point clouds by strategically sampling overlapping perspectives. By incorporating depth and semantic coherence losses during training, we greatly improve the quality and…
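The abstract describes the panorama sliding mechanism only as "strategically sampling overlapping perspectives", so the following is a minimal sketch of one way such sampling could look, not the authors' implementation. It assumes an equirectangular panorama, evenly spaced yaw angles, and a plain column crop in place of a true perspective reprojection. The function names `sliding_view_yaws` and `view_columns` and the parameters `fov_deg`, `overlap_deg`, and `pano_width` are hypothetical and chosen for illustration.

```python
import math
from typing import List


def sliding_view_yaws(fov_deg: float, overlap_deg: float) -> List[float]:
    """Return evenly spaced yaw centers (degrees) covering 360 degrees.

    Neighboring views overlap by at least `overlap_deg` degrees.
    Assumption: uniform spacing; the paper may sample views differently.
    """
    if not 0 < fov_deg <= 180:
        raise ValueError("fov_deg must be in (0, 180]")
    if not 0 <= overlap_deg < fov_deg:
        raise ValueError("overlap_deg must be in [0, fov_deg)")
    # Largest step between view centers that still keeps the requested overlap.
    max_stride = fov_deg - overlap_deg
    n_views = math.ceil(360.0 / max_stride)
    # Spread the views evenly so the last view overlaps the first one too.
    stride = 360.0 / n_views
    return [i * stride for i in range(n_views)]


def view_columns(yaw_deg: float, fov_deg: float, pano_width: int) -> List[int]:
    """Return panorama column indices covered by one view, wrapping at the seam.

    Assumption: a simple horizontal crop of the equirectangular image,
    not a perspective (gnomonic) reprojection.
    """
    center = yaw_deg / 360.0 * pano_width
    half = fov_deg / 360.0 * pano_width / 2.0
    start = int(round(center - half))
    width = int(round(2.0 * half))
    # The modulo lets a window slide across the left/right panorama boundary.
    return [(start + k) % pano_width for k in range(width)]


if __name__ == "__main__":
    yaws = sliding_view_yaws(fov_deg=90.0, overlap_deg=30.0)
    for yaw in yaws:
        cols = view_columns(yaw, fov_deg=90.0, pano_width=360)
        print(f"yaw={yaw:6.1f}  columns {cols[0]}..{cols[-1]}")
```

Because each window shares columns with its neighbors, points lifted from adjacent views land on common image content, which is the kind of redundancy a globally consistent point-cloud initialization would rely on. The wraparound in `view_columns` matters because the left and right edges of a 360-degree panorama are adjacent in the scene.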
Taxonomy
Topics: 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques
