Pano3DComposer: Feed-Forward Compositional 3D Scene Generation from Single Panoramic Image
Zidian Qiu, Ancong Wu

TL;DR
Pano3DComposer is a fast, feed-forward framework that generates complete 3D scenes from a single panoramic image. It replaces the time-consuming iterative layout optimization and inflexible joint object-layout generation of earlier compositional methods, improving both efficiency and geometric accuracy.
Contribution
The paper introduces a plug-and-play Object-World Transformation Predictor, which maps 3D objects generated by off-the-shelf image-to-3D models from local to world coordinates, and a Coarse-to-Fine alignment mechanism.
Findings
Achieves superior geometric accuracy on synthetic and real datasets.
Generates high-fidelity 3D scenes in approximately 20 seconds.
Effectively handles unseen domains with iterative refinement.
Abstract
Current compositional image-to-3D scene generation approaches construct 3D scenes by time-consuming iterative layout optimization or inflexible joint object-layout generation. Moreover, most methods rely on limited field-of-view perspective images, hindering the creation of complete 360-degree environments. To address these limitations, we design Pano3DComposer, an efficient feed-forward framework for panoramic images. To decouple object generation from layout estimation, we propose a plug-and-play Object-World Transformation Predictor. This module converts the 3D objects generated by off-the-shelf image-to-3D models from local to world coordinates. To achieve this, we adapt the VGGT architecture to Alignment-VGGT by using target object crop, multi-view object renderings and camera parameters to predict the transformation. The predictor is trained using pseudo-geometric supervision to…
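The local-to-world conversion described in the abstract can be illustrated with a small sketch. It assumes the predicted transformation is a similarity transform (uniform scale, rotation, translation); the excerpt does not state the paper's actual parameterization, and the names `rotation_about_y` and `object_to_world` are hypothetical helpers, not part of Pano3DComposer.

```python
import math

def rotation_about_y(angle_rad):
    """Return a 3x3 rotation matrix for a rotation about the y axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]

def object_to_world(points, scale, rotation, translation):
    """Map object-local points to world coordinates: x_world = s * R @ x + t.

    The similarity-transform form is an assumption made for illustration.
    """
    world = []
    for p in points:
        # Rotate the local point.
        rotated = [sum(rotation[i][j] * p[j] for j in range(3)) for i in range(3)]
        # Apply uniform scale, then translate into the world frame.
        world.append(tuple(scale * rotated[i] + translation[i] for i in range(3)))
    return world

# Three unit points in the object's local frame.
local_points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
R = rotation_about_y(math.pi / 2)
world_points = object_to_world(local_points, 2.0, R, (0.5, 0.0, -1.0))
```

Because each object carries its own transform, the object generator and the layout estimate stay decoupled: swapping the image-to-3D model changes `local_points` but not the placement step.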
Taxonomy
Topics: 3D Shape Modeling and Analysis · Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques
