SCube: Instant Large-Scale Scene Reconstruction using VoxSplats
Xuanchi Ren, Yifan Lu, Hanxue Liang, Zhangjie Wu, Huan Ling, Mike Chen, Sanja Fidler, Francis Williams, Jiahui Huang

TL;DR
SCube reconstructs large-scale, high-resolution 3D scenes from as few as three posed images in about 20 seconds. It combines a novel sparse-voxel Gaussian representation, VoxSplat, with a hierarchical latent diffusion model, and produces sharper results than prior approaches.
Contribution
Introduces VoxSplat, a scene representation made of 3D Gaussians supported on a high-resolution sparse-voxel scaffold, and a hierarchical diffusion-based pipeline that reconstructs large-scale scenes from a sparse set of posed images.
Findings
Reconstructs scenes with millions of Gaussians in 20 seconds from as few as 3 non-overlapping images
Produces sharper, higher-resolution 3D scenes than prior methods
Demonstrates applications in LiDAR simulation and text-to-scene generation
Abstract
We present SCube, a novel method for reconstructing large-scale 3D scenes (geometry, appearance, and semantics) from a sparse set of posed images. Our method encodes reconstructed scenes using a novel representation VoxSplat, which is a set of 3D Gaussians supported on a high-resolution sparse-voxel scaffold. To reconstruct a VoxSplat from images, we employ a hierarchical voxel latent diffusion model conditioned on the input images followed by a feedforward appearance prediction model. The diffusion model generates high-resolution grids progressively in a coarse-to-fine manner, and the appearance network predicts a set of Gaussians within each voxel. From as few as 3 non-overlapping input images, SCube can generate millions of Gaussians with a 1024^3 voxel grid spanning hundreds of meters in 20 seconds. Past works tackling scene reconstruction from images either rely on per-scene…
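To make the representation described in the abstract more concrete, the following is a minimal sketch of the general idea of Gaussians "supported on" a sparse voxel scaffold. It is not the authors' implementation. The names `Gaussian` and `build_voxsplat`, the 0.1 m voxel size, the four Gaussians per voxel, and the random placeholder attributes are all assumptions made for illustration; in SCube the occupied voxels come from the diffusion model and the Gaussian parameters are predicted by the appearance network.

```python
from dataclasses import dataclass
import random


@dataclass
class Gaussian:
    mean: tuple      # world-space center (x, y, z)
    scale: tuple     # per-axis extent
    rotation: tuple  # unit quaternion (w, x, y, z)
    opacity: float
    color: tuple     # RGB in [0, 1]


def build_voxsplat(occupied, voxel_size, per_voxel, seed=0):
    """Attach placeholder Gaussians to each occupied voxel (hypothetical helper).

    Only occupied voxels are stored, keyed by integer (i, j, k) index,
    so memory scales with occupancy rather than with the full grid.
    """
    rng = random.Random(seed)
    splat = {}
    for ijk in occupied:
        # Center of the voxel in world units.
        cx, cy, cz = ((c + 0.5) * voxel_size for c in ijk)
        gaussians = []
        for _ in range(per_voxel):
            # Offsets of at most half a voxel keep each mean inside its voxel.
            ox, oy, oz = (rng.uniform(-0.5, 0.5) * voxel_size for _ in range(3))
            gaussians.append(Gaussian(
                mean=(cx + ox, cy + oy, cz + oz),
                scale=(0.5 * voxel_size,) * 3,
                rotation=(1.0, 0.0, 0.0, 0.0),  # identity rotation
                opacity=0.5,
                color=(rng.random(), rng.random(), rng.random()),
            ))
        splat[ijk] = gaussians
    return splat


# Toy scene: a 64 x 64 patch of ground voxels inside a 1024^3 index space.
occupied = [(i, j, 0) for i in range(64) for j in range(64)]
splat = build_voxsplat(occupied, voxel_size=0.1, per_voxel=4)

num_gaussians = sum(len(g) for g in splat.values())
occupancy = len(splat) / 1024**3  # fraction of the dense grid actually stored
print(num_gaussians, occupancy)
```

The point of the sketch is the storage pattern: a dense 1024^3 grid has over a billion cells, while a sparse scaffold only keeps the voxels near surfaces, and each kept voxel anchors a small, fixed number of Gaussians.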
Taxonomy
Topics: Computer Graphics and Visualization Techniques · Medical Image Segmentation Techniques · Advanced Vision and Imaging
Methods: Diffusion · Sparse Evolutionary Training · Latent Diffusion Model
