RGB$\leftrightarrow$X: Image decomposition and synthesis using material- and lighting-aware diffusion models
Zheng Zeng, Valentin Deschaintre, Iliyan Georgiev, Yannick Hold-Geoffroy, Yiwei Hu, Fujun Luan, Ling-Qi Yan, Miloš Hašan

TL;DR
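This paper introduces diffusion models for both directions of the image decomposition and synthesis problem: estimating per-pixel intrinsic properties from an RGB image, and synthesizing realistic interior scene images from intrinsic channels. The models work from limited data and connect rendering, inverse rendering, and generative image synthesis across graphics and vision.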
Contribution
It presents an improved diffusion model for RGB to intrinsic channels, which also estimates lighting, and the first diffusion model for intrinsic channels to RGB, which synthesizes realistic interior scene images from full or partial intrinsic channels.
Findings
Improved intrinsic property estimation from limited data.
High realism in synthesized interior scene images.
Effective use of heterogeneous datasets for training.
Abstract
The three areas of realistic forward rendering, per-pixel inverse rendering, and generative image synthesis may seem like separate and unrelated sub-fields of graphics and vision. However, recent work has demonstrated improved estimation of per-pixel intrinsic channels (albedo, roughness, metallicity) based on a diffusion architecture; we call this the RGB$\rightarrow$X problem. We further show that the reverse problem of synthesizing realistic images given intrinsic channels, X$\rightarrow$RGB, can also be addressed in a diffusion framework. Focusing on the image domain of interior scenes, we introduce an improved diffusion model for RGB$\rightarrow$X, which also estimates lighting, as well as the first diffusion X$\rightarrow$RGB model capable of synthesizing realistic images from (full or partial) intrinsic channels. Our X$\rightarrow$RGB model explores a middle ground between…
Taxonomy
Methods: Diffusion
