GaussianGPT: Towards Autoregressive 3D Gaussian Scene Generation

Nicolas von Lützow; Barbara Rössle; Katharina Schmid; Matthias Nießner

arXiv:2603.26661·cs.CV·March 30, 2026

TL;DR

GaussianGPT is an autoregressive transformer that generates 3D scenes by predicting sequences of discrete tokens encoding Gaussian primitives, enabling controllable, step-by-step scene synthesis.

Contribution

The paper presents an autoregressive approach to 3D scene generation that models Gaussian primitives with a transformer, in contrast to the diffusion and flow-matching formulations behind most recent 3D generative models.

Findings

01. Supports scene completion and outpainting.

02. Enables controllable sampling via temperature.

03. Operates efficiently on explicit 3D representations.
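
The temperature control listed in the findings can be illustrated with a generic next-token sampler. This is a minimal sketch of standard temperature-scaled softmax sampling, not the authors' code: the function names `softmax_with_temperature` and `sample_token` and the example logits are assumptions made for illustration.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw logits into probabilities after dividing by a temperature.

    Temperatures below 1 sharpen the distribution (more deterministic output);
    temperatures above 1 flatten it (more diverse output).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def sample_token(logits, temperature=1.0, rng=None):
    """Draw one token index from the temperature-scaled distribution."""
    rng = rng or random.Random()
    probs = softmax_with_temperature(logits, temperature)
    threshold = rng.random()
    cumulative = 0.0
    for index, p in enumerate(probs):
        cumulative += p
        if threshold < cumulative:
            return index
    return len(probs) - 1  # guard against floating-point rounding


if __name__ == "__main__":
    # Hypothetical logits over a three-entry vocabulary.
    logits = [2.0, 1.0, 0.0]
    for t in (0.1, 1.0, 10.0):
        print(t, [round(p, 3) for p in softmax_with_temperature(logits, t)])
```

In an autoregressive generator, a sampler of this kind would be called once per token, so a single scalar trades fidelity to the most likely scene against variety across samples.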

Abstract

Most recent advances in 3D generative modeling rely on diffusion or flow-matching formulations. We instead explore a fully autoregressive alternative and introduce GaussianGPT, a transformer-based model that directly generates 3D Gaussians via next-token prediction, thus facilitating full 3D scene generation. We first compress Gaussian primitives into a discrete latent grid using a sparse 3D convolutional autoencoder with vector quantization. The resulting tokens are serialized and modeled using a causal transformer with 3D rotary positional embedding, enabling sequential generation of spatial structure and appearance. Unlike diffusion-based methods that refine scenes holistically, our formulation constructs scenes step-by-step, naturally supporting completion, outpainting, controllable sampling via temperature, and flexible generation horizons. This formulation leverages the…
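
The tokenization pipeline described in the abstract (vector quantization, then serialization) can be sketched in a few lines. This is a toy illustration under stated assumptions, not the paper's implementation: the helpers `quantize` and `serialize` are hypothetical, the codebook is a hand-written list rather than a learned one, and the lexicographic (x, y, z) ordering is an assumption, since the abstract does not say which serialization order is used.

```python
def quantize(features, codebook):
    """Map each feature vector to the index of its nearest codebook entry.

    Nearest is measured by squared Euclidean distance, the usual choice in
    vector quantization; the resulting indices act as discrete tokens.
    """
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(range(len(codebook)), key=lambda k: squared_distance(f, codebook[k]))
        for f in features
    ]


def serialize(grid_tokens):
    """Flatten a sparse grid {(x, y, z): token} into a 1D token sequence.

    Assumption: cells are visited in lexicographic (x, y, z) order. The
    positions are returned alongside the tokens so that a position-aware
    model (for example one using 3D positional embeddings) can use them.
    """
    positions = sorted(grid_tokens)
    return positions, [grid_tokens[p] for p in positions]


if __name__ == "__main__":
    # Hypothetical two-entry codebook and two latent feature vectors.
    codebook = [[0.0, 0.0], [1.0, 1.0]]
    features = [[0.1, 0.0], [0.9, 1.2]]
    print(quantize(features, codebook))

    # Hypothetical sparse latent grid with three occupied cells.
    grid = {(1, 0, 0): 5, (0, 0, 1): 7, (0, 0, 0): 3}
    print(serialize(grid))
```

A causal transformer would then be trained to predict each token in the serialized sequence from the ones before it.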
