Realiz3D: 3D Generation Made Photorealistic via Domain-Aware Learning
Ido Sobol, Kihyuk Sohn, Yoav Blum, Egor Zakharov, Max Bluvstein, Andrea Vedaldi, Or Litany

TL;DR
Realiz3D introduces a diffusion model framework that separates control signals from visual domain learning, enabling photorealistic, 3D-consistent image generation from synthetic controls.
Contribution
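The paper proposes a lightweight, domain-aware framework for training diffusion models. By learning the visual domain (real or synthetic) separately from the control signals, it avoids the loss of realism that usually follows fine-tuning on synthetic renders, while keeping the generated images controllable.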
Findings
Produces photorealistic, 3D-consistent images from synthetic controls.
Improves how well controls learned from synthetic renders transfer to real images.
Outperforms existing methods in text-to-multiview and 3D texturing tasks.
Abstract
We often aim to generate images that are both photorealistic and 3D-consistent, adhering to precise geometry, material, and viewpoint controls. Typically, this is achieved by fine-tuning an image generator, pre-trained on billions of real images, using renders of synthetic 3D assets, where annotations for control signals are available. While this approach can learn the desired controls, it often compromises the realism of the images due to the domain gap between photographs and renders. We observe that this issue largely arises from the model learning an unintended association between the presence of control signals and the synthetic appearance of the images. To address this, we introduce Realiz3D, a lightweight framework for training diffusion models that decouples controls and visual domain. The key idea is to explicitly learn the visual domain, real or synthetic, separately from other…
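The decoupling idea in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the abstract is truncated here and does not specify the mechanism. The sketch assumes one plausible reading, in which every training sample carries an explicit domain label, only synthetic renders carry control annotations, and sampling pairs a control with the real-domain label. All names (`Conditioning`, `training_conditioning`, `inference_conditioning`, `REAL`, `SYNTHETIC`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical domain label ids; not taken from the paper.
REAL, SYNTHETIC = 0, 1


@dataclass
class Conditioning:
    """Conditioning passed to a (hypothetical) denoiser for one sample."""
    domain: int                         # REAL or SYNTHETIC
    control: Optional[Sequence[float]]  # e.g. a flattened depth map, or None


def training_conditioning(is_synthetic: bool,
                          control: Optional[Sequence[float]]) -> Conditioning:
    """Build the conditioning for one training sample.

    Assumption: synthetic renders come with control annotations, real
    photographs do not, so the control slot of a real sample stays empty.
    The domain label is always present, so the model can attribute
    appearance to the label rather than to the presence of a control.
    """
    if is_synthetic:
        return Conditioning(domain=SYNTHETIC, control=control)
    return Conditioning(domain=REAL, control=None)


def inference_conditioning(control: Sequence[float]) -> Conditioning:
    """At sampling time, pair a control signal with the REAL domain label."""
    return Conditioning(domain=REAL, control=control)
```

Under these assumptions, `inference_conditioning([0.5])` requests the real domain while still supplying a control, a combination that never occurs in the training pairs above. Whether and how Realiz3D makes that combination work is what the full paper describes.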
