VideoMat: Extracting PBR Materials from Video Diffusion Models
Jacob Munkberg, Zian Wang, Ruofan Liang, Tianchang Shen, Jon Hasselgren

TL;DR
VideoMat combines finetuned video diffusion models, intrinsic decomposition of videos, and physically-based differentiable rendering to generate high-quality PBR materials for 3D models from a text prompt or a single image.
Contribution
It introduces a novel pipeline that conditions video diffusion models on geometry and lighting, enabling coherent multi-view material generation and extraction from videos.
Findings
Produces high-quality, view-consistent materials for 3D models
Enables extraction of PBR materials directly from videos
Integrates diffusion models with differentiable rendering for material synthesis
Abstract
We leverage finetuned video diffusion models, intrinsic decomposition of videos, and physically-based differentiable rendering to generate high-quality materials for 3D models given a text prompt or a single image. First, we condition a video diffusion model to respect the input geometry and lighting condition. This model produces multiple views of a given 3D model with coherent material properties. Second, we use a recent model to extract intrinsics (base color, roughness, metallic) from the generated video. Finally, we use the intrinsics alongside the generated video in a differentiable path tracer to robustly extract PBR materials directly compatible with common content creation tools.
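The final step of the abstract, fitting materials so that renders match the generated video while staying close to the predicted intrinsics, can be illustrated with a deliberately tiny sketch. This is not the paper's method: it assumes a single-channel Lambertian shading model (pixel = albedo × irradiance) in place of a differentiable path tracer, a single texel, and known per-view irradiance. The names `render`, `fit_albedo`, `prior`, and `prior_weight` are hypothetical and chosen only for illustration.

```python
def render(albedo, irradiance):
    """Toy Lambertian shading for one texel: pixel = albedo * irradiance.

    Assumption: stands in for a differentiable path tracer.
    """
    return albedo * irradiance


def fit_albedo(observations, irradiances, prior,
               prior_weight=0.1, lr=0.05, steps=500):
    """Fit one albedo value by gradient descent.

    Minimizes
        mean_v (albedo * E_v - I_v)^2 + prior_weight * (albedo - prior)^2
    where I_v is the observed pixel in view v (e.g. a video frame),
    E_v is the known irradiance in that view, and `prior` plays the role
    of the base color predicted by an intrinsic decomposition model.
    """
    albedo = prior  # start from the intrinsic estimate
    n = len(observations)
    for _ in range(steps):
        # Analytic gradient of the photometric term, averaged over views.
        grad = 0.0
        for obs, irr in zip(observations, irradiances):
            grad += 2.0 * (render(albedo, irr) - obs) * irr / n
        # Gradient of the prior term that pulls toward the intrinsic estimate.
        grad += 2.0 * prior_weight * (albedo - prior)
        albedo -= lr * grad
        # Keep the value in the physically meaningful range [0, 1].
        albedo = min(max(albedo, 0.0), 1.0)
    return albedo


# Usage: four views of the same texel under different irradiance.
irradiances = [0.5, 0.8, 1.0, 1.2]
observations = [render(0.6, e) for e in irradiances]  # synthetic "video" pixels
estimate = fit_albedo(observations, irradiances, prior=0.5)
```

With a nonzero `prior_weight` the estimate lands between the prior and the value that best explains the observations; setting `prior_weight=0.0` reduces the fit to pure photometric matching. A real system would optimize full texture maps for base color, roughness, and metallic under a physically-based renderer.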
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Digital Humanities and Scholarship
Methods: Diffusion
