Consistent Mesh Diffusion
Julian Knodt, Xifeng Gao

TL;DR
This paper introduces a fast, consistent method for generating textured 3D meshes from text prompts using a single Depth-to-Image diffusion network, improving over prior slow and inconsistent approaches.
Contribution
It proposes a novel approach that unifies multiple diffusion paths to produce consistent textures on 3D meshes efficiently.
Findings
Generates a texture in approximately 5 minutes per mesh.
Outperforms prior methods in CLIP-score and FID evaluations.
Demonstrates high-quality, view-consistent textures on 30 meshes.
Abstract
Given a 3D mesh with a UV parameterization, we introduce a novel approach to generating textures from text prompts. While prior work uses optimization from Text-to-Image Diffusion models to generate textures and geometry, this is slow and requires significant compute resources. Alternatively, there are projection-based approaches that use the same Text-to-Image models to paint images onto a mesh, but these lack consistency at different viewing angles. We propose a method that uses a single Depth-to-Image diffusion network and generates a single consistent texture when rendered on the 3D surface, by first unifying multiple 2D images' diffusion paths and then hoisting that to 3D with MultiDiffusion~\cite{multidiffusion}. We demonstrate our approach on a dataset containing 30 meshes, taking approximately 5 minutes per mesh. To evaluate the quality of our approach, we use…
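The idea of unifying several views' diffusion paths can be illustrated with a MultiDiffusion-style fusion step, in which the per-view predictions that land on the same texel are averaged so that every view continues denoising from one shared texture. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `fuse_view_updates`, the flattened texel indexing, and the optional per-pixel weights are all hypothetical, and the abstract does not specify how pixels are mapped to texels or how views are weighted.

```python
import numpy as np


def fuse_view_updates(texture, view_texels, view_values, view_weights=None):
    """Weighted average of per-view predictions into one shared texture.

    texture:      (T, C) array, current values of a flattened UV texture.
    view_texels:  list of (N_i,) int arrays; texel index hit by each view pixel
                  (hypothetical mapping, assumed to come from rasterizing the mesh).
    view_values:  list of (N_i, C) arrays; values predicted for those pixels.
    view_weights: optional list of (N_i,) non-negative weights (assumption).
    """
    num = np.zeros(texture.shape, dtype=np.float64)
    den = np.zeros(texture.shape[0], dtype=np.float64)

    for i, (idx, vals) in enumerate(zip(view_texels, view_values)):
        if view_weights is None:
            w = np.ones(len(idx), dtype=np.float64)
        else:
            w = np.asarray(view_weights[i], dtype=np.float64)
        # np.add.at accumulates correctly even when a texel index repeats.
        np.add.at(num, idx, np.asarray(vals, dtype=np.float64) * w[:, None])
        np.add.at(den, idx, w)

    # Texels that no view observed keep their previous value.
    fused = np.asarray(texture, dtype=np.float64).copy()
    seen = den > 0
    fused[seen] = num[seen] / den[seen][:, None]
    return fused


# Two views overlap on texel 1; texel 3 is seen by neither view.
texture = np.zeros((4, 1))
fused = fuse_view_updates(
    texture,
    view_texels=[np.array([0, 1]), np.array([1, 2])],
    view_values=[np.array([[1.0], [1.0]]), np.array([[3.0], [3.0]])],
)
```

In this toy case the overlapping texel receives the mean of the two views' values, while texels seen by one view take that view's value directly. In a full pipeline a step like this would presumably run once per denoising iteration, but that scheduling is an assumption here.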
Taxonomy
Topics: Computer Graphics and Visualization Techniques · 3D Shape Modeling and Analysis · Advanced Vision and Imaging
Methods: Diffusion
