DVD: Discrete Voxel Diffusion for 3D Generation and Editing
Zhengrui Xiang, Jiaqi Wu, Fupeng Sun, Heliang Zheng, Yingzhen Li

TL;DR
DVD introduces a discrete diffusion framework for sparse voxel generation and editing, offering improved interpretability and uncertainty estimation in 3D generative pipelines.
Contribution
The paper presents a discrete diffusion method for sparse voxels that supports generation, editing, and uncertainty quantification in SLat-based 3D generative pipelines.
Findings
DVD reports quality gains over continuous diffusion when used as the first-stage prior for sparse voxel scaffolds.
Explicit categorical modeling enhances interpretability of voxel generation.
Predictive entropy flags ambiguous voxel regions and complicated samples, which supports data filtering and quality assessment.
Abstract
We introduce Discrete Voxel Diffusion (DVD), a discrete diffusion framework to generate, assess, and edit sparse voxels for SLat (Structured LATent) based 3D generative pipelines. Although discrete diffusion has not generally displaced continuous diffusion in image-like generation, we show that it can be an effective first-stage prior for sparse voxel scaffolds. By treating voxel occupancy as a native discrete variable, DVD avoids continuous-to-discrete thresholding and provides a simple framework for voxel generation, uncertainty estimation, and editing. Beyond quality gains, DVD provides more interpretable generation dynamics through explicit categorical modeling. Furthermore, we leverage the predictive entropy as a robust uncertainty metric to identify ambiguous voxel regions and complicated samples, facilitating tasks such as data filtering and quality assessment. Finally, we…
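To make the predictive-entropy idea concrete, here is a minimal sketch of how per-voxel occupancy probabilities could be turned into an uncertainty signal. It is not the authors' implementation: the abstract does not give the exact entropy definition, the aggregation rule, or any threshold, so the binary (occupied vs. empty) formulation, the mean aggregation, the 0.9-bit cutoff, and all function names below are assumptions chosen for illustration.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy, in bits, of a Bernoulli(p) occupancy prediction."""
    # By convention 0 * log(0) = 0, so fully confident predictions score 0.
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def voxel_entropies(probs):
    """Per-voxel entropy for a flat list of occupancy probabilities."""
    return [binary_entropy(p) for p in probs]


def ambiguous_voxels(probs, threshold=0.9):
    """Indices of voxels whose entropy meets a (hypothetical) threshold."""
    return [i for i, h in enumerate(voxel_entropies(probs)) if h >= threshold]


def sample_score(probs):
    """Mean per-voxel entropy, used here as an assumed sample-level score."""
    ents = voxel_entropies(probs)
    return sum(ents) / len(ents) if ents else 0.0


# Toy predictions: confident-empty, uncertain, confident-occupied, ...
probs = [0.01, 0.5, 0.99, 0.45, 0.0, 1.0]
flagged = ambiguous_voxels(probs)
score = sample_score(probs)
```

In this toy input only the voxels with probabilities near 0.5 are flagged. A filtering step could then rank samples by `sample_score` and drop or inspect the highest-scoring ones, though the criterion the paper actually uses may differ.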
