Point-E: A System for Generating 3D Point Clouds from Complex Prompts
Alex Nichol, Heewoo Jun, Prafulla Dhariwal, Pamela Mishkin, Mark Chen

TL;DR
Point-E generates 3D point clouds from complex text prompts by chaining a text-to-image diffusion model with an image-conditioned point cloud diffusion model, cutting generation time from multiple GPU-hours to 1-2 minutes on a single GPU.
Contribution
The paper presents a two-step diffusion-based approach to 3D point cloud generation that samples one to two orders of magnitude faster than state-of-the-art methods, making it practical for use cases where speed matters more than sample quality.
Findings
Generates 3D point clouds in 1-2 minutes on a single GPU
Trades sample quality for speed: falls short of state-of-the-art quality but is one to two orders of magnitude faster to sample from
Provides open-source models and evaluation tools for the community
Abstract
While recent work on text-conditional 3D object generation has shown promising results, the state-of-the-art methods typically require multiple GPU-hours to produce a single sample. This is in stark contrast to state-of-the-art generative image models, which produce samples in a number of seconds or minutes. In this paper, we explore an alternative method for 3D object generation which produces 3D models in only 1-2 minutes on a single GPU. Our method first generates a single synthetic view using a text-to-image diffusion model, and then produces a 3D point cloud using a second diffusion model which conditions on the generated image. While our method still falls short of the state-of-the-art in terms of sample quality, it is one to two orders of magnitude faster to sample from, offering a practical trade-off for some use cases. We release our pre-trained point cloud diffusion models, as…
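The two-stage pipeline described in the abstract can be sketched as follows. This is a minimal illustration of the control flow only, not the authors' implementation: `text_to_image_stub`, `image_to_point_cloud_stub`, and `generate_point_cloud` are hypothetical names, and both stages are replaced with seeded random placeholders where the real system would run diffusion models.

```python
import random
from typing import Callable, List, Tuple

Image = List[List[float]]            # hypothetical stand-in: a grayscale H x W grid
Point = Tuple[float, float, float]   # one (x, y, z) coordinate


def text_to_image_stub(prompt: str, size: int = 8) -> Image:
    """Placeholder for stage 1. The real system would sample a single
    synthetic view from a text-to-image diffusion model."""
    rng = random.Random(prompt)  # seeded by the prompt so runs are repeatable
    return [[rng.random() for _ in range(size)] for _ in range(size)]


def image_to_point_cloud_stub(image: Image, num_points: int) -> List[Point]:
    """Placeholder for stage 2. The real system would sample a point cloud
    from a second diffusion model conditioned on the generated image."""
    # Toy "conditioning": mean brightness of the image scales the cloud.
    scale = sum(sum(row) for row in image) / (len(image) * len(image[0]))
    rng = random.Random(0)
    return [
        (scale * rng.gauss(0, 1), scale * rng.gauss(0, 1), scale * rng.gauss(0, 1))
        for _ in range(num_points)
    ]


def generate_point_cloud(
    prompt: str,
    num_points: int = 1024,
    text_to_image: Callable[[str], Image] = text_to_image_stub,
    image_to_points: Callable[[Image, int], List[Point]] = image_to_point_cloud_stub,
) -> List[Point]:
    """Chain the two stages: text -> single synthetic view -> 3D point cloud."""
    image = text_to_image(prompt)              # stage 1: text-conditioned
    return image_to_points(image, num_points)  # stage 2: image-conditioned


if __name__ == "__main__":
    cloud = generate_point_cloud("a red chair", num_points=16)
    print(len(cloud), "points; first point:", cloud[0])
```

The second stage never sees the text prompt, only the intermediate image, so either stage can be swapped out independently by passing a different callable.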
Taxonomy
Topics: 3D Shape Modeling and Analysis · Image Processing and 3D Reconstruction · 3D Surveying and Cultural Heritage
Methods: Diffusion
