PrITTI: Primitive-based Generation of Controllable and Editable 3D Semantic Urban Scenes
Christina Ourania Tze, Daniel Dauner, Yiyi Liao, Dzmitry Tsishkou, Andreas Geiger

TL;DR
PrITTI is a primitive-based method for 3D urban scene generation built on a latent diffusion model. It enables controllable, editable, high-quality scene synthesis with lower memory use than voxel-based approaches.
Contribution
The paper presents PrITTI, a novel primitive-based representation and diffusion model for 3D urban scene generation, offering improved editability and efficiency over traditional voxel-based methods.
Findings
Achieves state-of-the-art 3D scene quality on KITTI-360
Requires less memory and inference time than voxel-based methods
Enables diverse scene editing and downstream applications
Abstract
Existing approaches to 3D semantic urban scene generation predominantly rely on voxel-based representations, which are bound by fixed resolution, challenging to edit, and memory-intensive in their dense form. In contrast, we advocate for a primitive-based paradigm where urban scenes are represented using compact, semantically meaningful 3D elements that are easy to manipulate and compose. To this end, we introduce PrITTI, a latent diffusion model that leverages vectorized object primitives and rasterized ground surfaces for generating diverse, controllable, and editable 3D semantic urban scenes. This hybrid representation yields a structured latent space that facilitates object- and ground-level manipulation. Experiments on KITTI-360 show that primitive-based representations unlock the full capabilities of diffusion transformers, achieving state-of-the-art 3D scene generation quality…
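To make the hybrid representation described in the abstract more concrete, here is a minimal Python sketch of a scene stored as vectorized object primitives plus a rasterized ground layer. It is an illustration under stated assumptions, not the authors' implementation: the `Cuboid` parameterization (center, size, yaw, label), the `Scene` class, the `move` edit, and all grid sizes are hypothetical choices made for this example and are not taken from the paper.

```python
from dataclasses import dataclass, field, replace
from typing import List


@dataclass(frozen=True)
class Cuboid:
    """Hypothetical object primitive: an oriented 3D box with a class label."""
    label: str
    center: tuple  # (x, y, z) in metres
    size: tuple    # (length, width, height) in metres
    yaw: float     # rotation about the vertical axis, in radians


@dataclass
class Scene:
    """Hypothetical hybrid scene: a list of primitives plus a ground raster."""
    objects: List[Cuboid] = field(default_factory=list)
    # Rasterized ground stored as a row-major 2D height map (assumed layout).
    ground_height: List[List[float]] = field(default_factory=list)

    def move(self, index: int, dx: float, dy: float) -> None:
        """Object-level edit: translate one primitive in the ground plane."""
        old = self.objects[index]
        x, y, z = old.center
        self.objects[index] = replace(old, center=(x + dx, y + dy, z))

    def num_values(self) -> int:
        """Count stored values: 8 per cuboid (3 center, 3 size, yaw, label)
        plus one per ground raster cell."""
        cells = sum(len(row) for row in self.ground_height)
        return 8 * len(self.objects) + cells


def dense_voxel_cells(nx: int, ny: int, nz: int) -> int:
    """Number of cells in a dense voxel grid of the given resolution."""
    return nx * ny * nz


# Illustrative scene: two objects on a flat 64 x 64 ground raster.
scene = Scene(
    objects=[
        Cuboid("car", (4.0, 2.0, 0.8), (4.5, 1.8, 1.6), 0.0),
        Cuboid("building", (20.0, -8.0, 5.0), (12.0, 10.0, 10.0), 1.57),
    ],
    ground_height=[[0.0] * 64 for _ in range(64)],
)

# Editing an object touches only its own parameters, not a voxel volume.
scene.move(0, dx=1.5, dy=0.0)

print(scene.objects[0].center)
print(scene.num_values(), "values vs", dense_voxel_cells(64, 64, 16), "dense voxel cells")
```

In this toy setup, moving the car changes a single tuple, whereas the same edit in a dense voxel grid would require clearing and refilling every cell the object occupies. The value counts only illustrate why a vectorized representation can be compact; they say nothing about the memory figures reported in the paper.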
Taxonomy
Topics: 3D Surveying and Cultural Heritage · Image Processing and 3D Reconstruction · 3D Shape Modeling and Analysis
