HiMat: DiT-based Ultra-High Resolution SVBRDF Generation
Zixiong Wang, Jian Yang, Yiwei Hu, Milos Hasan, Beibei Wang

TL;DR
HiMat is a diffusion-based framework that efficiently generates ultra-high-resolution 4K SVBRDFs with high fidelity, consistency, and diversity by operating in a compressed latent space and introducing a novel cross-map consistency module.
Contribution
The paper introduces HiMat, a diffusion-based approach that combines generation in a compressed latent space with a CrossStitch module for efficient, cross-map-consistent 4K SVBRDF synthesis, and reports that it surpasses prior methods.
Findings
Achieves high-fidelity 4K SVBRDF generation with superior efficiency.
Maintains strong pixel-level alignment across maps.
Demonstrates generalization to related tasks like intrinsic decomposition.
Abstract
Creating ultra-high-resolution spatially varying bidirectional reflectance distribution functions (SVBRDFs) is critical for photorealistic 3D content creation, as they faithfully represent the fine-scale surface details required for close-up rendering. However, achieving 4K generation faces two key challenges: (1) the need to synthesize multiple reflectance maps at full resolution, which multiplies the pixel budget and imposes prohibitive memory and computational cost, and (2) the requirement to maintain strong pixel-level alignment across maps at 4K, which is particularly difficult when adapting pretrained models designed for the RGB image domain. We introduce HiMat, a diffusion-based framework tailored for efficient and diverse 4K SVBRDF generation. To address the first challenge, HiMat performs generation in a high-compression latent space via DC-AE, and employs a pretrained diffusion transformer with…
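To make the first challenge concrete, the arithmetic behind the "pixel budget" can be sketched as follows. This is only an illustration: the number of reflectance maps (5), the interpretation of 4K as 4096 x 4096, and the spatial downsampling factor (32) are placeholder assumptions, not values taken from the paper, and the helper names are hypothetical.

```python
def pixel_budget(resolution: int, num_maps: int) -> int:
    """Total pixels to synthesize when every map is generated at full resolution."""
    return resolution * resolution * num_maps


def latent_positions(resolution: int, downsample: int) -> int:
    """Spatial positions in one latent grid after an autoencoder
    downsamples each side by `downsample`."""
    side = resolution // downsample
    return side * side


# All three constants below are assumptions chosen for illustration only.
RESOLUTION = 4096   # "4K" interpreted as 4096 x 4096
NUM_MAPS = 5        # hypothetical map count (e.g. base color, normal, roughness, ...)
DOWNSAMPLE = 32     # hypothetical per-side compression factor of the autoencoder

full_res_pixels = pixel_budget(RESOLUTION, NUM_MAPS)
per_map_latent = latent_positions(RESOLUTION, DOWNSAMPLE)

print(f"full-resolution pixels across maps: {full_res_pixels:,}")
print(f"latent positions per map:           {per_map_latent:,}")
print(f"spatial reduction per map:          {RESOLUTION * RESOLUTION // per_map_latent}x")
```

Under these assumed values, the pixel count grows linearly with the number of maps, while the latent grid shrinks with the square of the downsampling factor, which is why operating in a highly compressed latent space eases the memory and compute cost described above.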
