Diffusion Models in 3D Vision: A Survey
Zhen Wang, Dongyuan Li, Yaozu Wu, Tianyu He, Jiang Bian, Renhe Jiang

TL;DR
This survey reviews how diffusion models are increasingly applied to 3D vision tasks, highlighting their mathematical foundations, recent advancements, challenges, and future directions in the field.
Contribution
It provides a comprehensive overview of diffusion models in 3D vision, including mathematical principles, architectural innovations, and key challenges, serving as a foundation for future research.
Findings
Diffusion models improve 3D object generation and reconstruction.
Handling occlusions and point density variations remains challenging.
Large-scale pretraining has the potential to enhance 3D diffusion models.
Abstract
In recent years, 3D vision has become a crucial field within computer vision, powering a wide range of applications such as autonomous driving, robotics, augmented reality, and medical imaging. This field relies on accurate perception, understanding, and reconstruction of 3D scenes from 2D images or text data sources. Diffusion models, originally designed for 2D generative tasks, offer the potential for more flexible, probabilistic methods that can better capture the variability and uncertainty present in real-world 3D data. In this paper, we review the state-of-the-art methods that use diffusion models for 3D visual tasks, including but not limited to 3D object generation, shape completion, point-cloud reconstruction, and scene construction. We provide an in-depth discussion of the underlying mathematical principles of diffusion models, outlining their forward and reverse processes, as…
Taxonomy
Topics: Medical Image Segmentation Techniques
Methods: Diffusion
