TL;DR
TEXGen is a large diffusion model trained directly in UV texture space. It generates high-resolution texture maps for 3D assets in a feed-forward pass, guided by text prompts, instead of optimizing textures at test time against pre-trained 2D diffusion models.
Contribution
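The paper presents the first large diffusion model trained directly in UV texture space, built on a scalable architecture that interleaves convolutions on UV maps with attention layers on point clouds to generate high-resolution textures, and it supports several extended applications.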
Findings
Successfully trained a 700M-parameter diffusion model for UV textures.
The model supports text-guided texture inpainting, completion, and synthesis.
Achieves high-quality, high-resolution texture generation in UV space.
Abstract
While high-quality texture maps are essential for realistic 3D asset rendering, few studies have explored learning directly in the texture space, especially on large-scale datasets. In this work, we depart from the conventional approach of relying on pre-trained 2D diffusion models for test-time optimization of 3D textures. Instead, we focus on the fundamental problem of learning in the UV texture space itself. For the first time, we train a large diffusion model capable of directly generating high-resolution texture maps in a feed-forward manner. To facilitate efficient learning in high-resolution UV spaces, we propose a scalable network architecture that interleaves convolutions on UV maps with attention layers on point clouds. Leveraging this architectural design, we train a 700 million parameter diffusion model that can generate UV texture maps guided by text prompts and single-view…
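The interleaving of UV-map convolutions with point-cloud attention mentioned in the abstract can be pictured with a toy block. This is only a minimal NumPy sketch under stated assumptions, not the authors' architecture: the function names (conv3x3, self_attention, hybrid_block), the per-texel 3D position map pos, the validity mask, the linear position embedding w_pos, and the single-head attention with identity projections are all hypothetical choices made for illustration.

```python
import numpy as np

def conv3x3(x, w):
    """Naive 3x3 'same' convolution. x: (H, W, C_in), w: (3, 3, C_in, C_out)."""
    H, W, _ = x.shape
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)))  # zero-pad the two spatial axes
    out = np.zeros((H, W, w.shape[-1]))
    for dy in range(3):
        for dx in range(3):
            out += xp[dy:dy + H, dx:dx + W] @ w[dy, dx]
    return out

def self_attention(f):
    """Single-head self-attention with identity projections. f: (N, C)."""
    scores = f @ f.T / np.sqrt(f.shape[1])
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ f

def hybrid_block(uv_feat, pos, mask, w, w_pos):
    """Hypothetical block: convolve in UV space, then attend over surface points.

    uv_feat: (H, W, C) features on the UV map
    pos:     (H, W, 3) assumed 3D surface position of each texel
    mask:    (H, W) bool, True where a texel maps to the surface
    w:       (3, 3, C, C) convolution weights
    w_pos:   (3, C) assumed linear embedding of 3D positions
    """
    h = np.maximum(conv3x3(uv_feat, w), 0.0)  # local mixing between neighbouring texels
    pts = h[mask] + pos[mask] @ w_pos         # gather valid texels as a point set (N, C)
    out = np.zeros_like(h)
    out[mask] = pts + self_attention(pts)     # global mixing across points, residual add
    return out                                # texels outside the mask stay zero
```

The intent of such a split is that the convolution only sees neighbours within the 2D layout, whereas attention over the gathered points can relate texels that are far apart in the UV map but belong to the same surface. A real model would use learned projections, multiple heads, and some way to keep the number of attended points manageable at high resolution.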
Taxonomy
Methods: Softmax · Attention Is All You Need · Diffusion · Focus
