Structured 3D Latents for Scalable and Versatile 3D Generation

Jianfeng Xiang; Zelong Lv; Sicheng Xu; Yu Deng; Ruicheng Wang; Bowen Zhang; Dong Chen; Xin Tong; Jiaolong Yang

arXiv:2412.01506 · cs.CV · June 2, 2025 · 5 cites

TL;DR

This paper presents a unified 3D generation framework using Structured LATent (SLAT) representations that enable high-quality, versatile 3D asset creation across multiple formats with flexible editing capabilities.

Contribution

The paper introduces SLAT, a unified 3D representation that decodes to multiple output formats, and pairs it with a large-scale transformer-based model for high-quality conditional 3D generation.

Findings

01. Outperforms existing 3D generation methods in quality and versatility.

02. Supports multiple output formats including radiance fields, 3D Gaussians, and meshes.

03. Enables local 3D editing not available in prior models.

Abstract

We introduce a novel 3D generation method for versatile and high-quality 3D asset creation. The cornerstone is a unified Structured LATent (SLAT) representation which allows decoding to different output formats, such as Radiance Fields, 3D Gaussians, and meshes. This is achieved by integrating a sparsely-populated 3D grid with dense multiview visual features extracted from a powerful vision foundation model, comprehensively capturing both structural (geometry) and textural (appearance) information while maintaining flexibility during decoding. We employ rectified flow transformers tailored for SLAT as our 3D generation models and train models with up to 2 billion parameters on a large 3D asset dataset of 500K diverse objects. Our model generates high-quality results with text or image conditions, significantly surpassing existing methods, including recent ones at similar scales. We…
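The abstract names two ingredients: latents attached to a sparsely populated 3D grid, and rectified flow models trained on them. The following is a minimal sketch of both ideas, not the authors' implementation. It assumes the standard rectified-flow formulation (a straight-line interpolation between data and Gaussian noise), uses random vectors as stand-ins for the encoded visual features, and picks an arbitrary 8x8x8 grid and feature dimension of 4. All function names are hypothetical.

```python
import numpy as np

def make_sparse_latent(occupancy, feature_dim, rng):
    """Return (coords, feats) for the active voxels of a boolean grid.

    Assumption: random features stand in for the encoded visual features.
    """
    coords = np.argwhere(occupancy)  # (L, 3) integer voxel indices
    feats = rng.standard_normal((coords.shape[0], feature_dim))
    return coords, feats

def rectified_flow_pair(x0, t, rng):
    """Noise x0 along a straight line and return (x_t, eps - x0).

    Assumption: standard rectified flow, x_t = (1 - t) * x0 + t * eps.
    """
    eps = rng.standard_normal(x0.shape)
    x_t = (1.0 - t) * x0 + t * eps
    v_target = eps - x0  # velocity a model would be trained to predict
    return x_t, v_target

def euler_sample(velocity_fn, x1, steps):
    """Integrate dx/dt = v(x, t) from t=1 (noise) down to t=0 (data)."""
    x = x1.copy()
    dt = 1.0 / steps
    for i in range(steps):
        t = 1.0 - i * dt
        x = x - dt * velocity_fn(x, t)
    return x

rng = np.random.default_rng(0)

# Toy occupancy: only a small block of the grid is active.
occupancy = np.zeros((8, 8, 8), dtype=bool)
occupancy[2:5, 3:6, 1:3] = True
coords, feats = make_sparse_latent(occupancy, feature_dim=4, rng=rng)

# Training-style pair at t = 0.5.
x_t, v = rectified_flow_pair(feats, 0.5, rng)

# Sampling with an oracle velocity (the true eps - x0) recovers the data.
x1, v1 = rectified_flow_pair(feats, 1.0, rng)
recovered = euler_sample(lambda x, t: v1, x1, steps=10)
```

In a real system the oracle velocity would be replaced by a trained network conditioned on text or an image, and only the features at active voxel coordinates would be stored and processed, which is what keeps the representation sparse.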

Taxonomy

Topics: Computer Graphics and Visualization Techniques · 3D Shape Modeling and Analysis · Image Processing and 3D Reconstruction