GeRM: A Generative Rendering Model From Physically Realistic to Photorealistic

Jiayuan Lu; Rengan Xie; Xuancheng Jin; Zhizhen Wu; Qi Ye; Tian Xie; Hujun Bao; Rui Wang; Yuchi Huo

arXiv:2604.09304·cs.CV·May 15, 2026

TL;DR

GeRM is a multimodal generative rendering model that converts physically-based rendering (PBR) images into photorealistic rendering (PRR) images. It learns a distribution transfer vector (DTV) field to direct generation, guided by G-buffers, text prompts, and cues for enhanced regions.

Contribution

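The paper introduces a PBR-to-PRR (P2P) transition framework with three components: a distribution transfer vector (DTV) field that directs the generative process, a multi-condition ControlNet, and a residual perceptual transfer mechanism that helps the model capture the image distribution shift driven by text prompts.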

Findings

01. Outperforms state-of-the-art methods in image synthesis and editing.

02. Synthesizes high-quality controllable images from PBR to PRR.

03. Demonstrates versatility across diverse rendering applications.

Abstract

While physically-based rendering (PBR) simulates light transport that guarantees physical realism, achieving true photorealistic rendering (PRR) demands prohibitive time and labor, and still struggles to capture the intractable richness of the real world. We propose GeRM, the first multimodal generative rendering model to bridge the gap from PBR to PRR (P2P). We formulate this P2P transition by learning a distribution transfer vector (DTV) field to direct the generative process. To achieve this, we introduce a multi-condition ControlNet that synthesizes PBR images and progressively transitions them into PRR images, guided by G-buffers, text prompts, and cues for enhanced regions. To improve the model's grasp of the image distribution shift driven by text prompts, we propose a residual perceptual transfer mechanism to associate text prompts with corresponding targeted modification…
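
The abstract does not say how the DTV field is defined or trained, so the following is only a toy illustration of the general idea of a vector field that moves samples from one distribution to another. It is not the paper's method. Everything in it is an assumption made for illustration: the 1-D Gaussian "source" and "target" distributions stand in for PBR and PRR image distributions, the closed-form straight-line field replaces a learned network, and the names `make_transfer_field` and `transport` are hypothetical.

```python
import random


def make_transfer_field(mu_src, sigma_src, mu_tgt, sigma_tgt):
    """Return v(x, t): a velocity field that carries N(mu_src, sigma_src^2)
    to N(mu_tgt, sigma_tgt^2) along straight lines (toy stand-in, not GeRM)."""
    a = sigma_tgt / sigma_src        # scale of the affine map T(x) = a*x + b
    b = mu_tgt - a * mu_src          # shift of the affine map

    def velocity(x, t):
        # Recover the start point x0 of the straight path passing through (x, t),
        # where the path is x_t = (1 - t) * x0 + t * T(x0).
        x0 = (x - t * b) / ((1.0 - t) + t * a)
        # Velocity is constant along each path: T(x0) - x0.
        return (a - 1.0) * x0 + b

    return velocity


def transport(x0, velocity, steps=10):
    """Move one sample from t=0 to t=1 with forward Euler steps."""
    x, h = x0, 1.0 / steps
    for k in range(steps):
        x = x + h * velocity(x, k * h)
    return x


random.seed(0)
field = make_transfer_field(0.0, 1.0, 3.0, 0.5)

# Samples from the toy "source" distribution, pushed through the field.
source = [random.gauss(0.0, 1.0) for _ in range(2000)]
moved = [transport(x, field) for x in source]

mean = sum(moved) / len(moved)
std = (sum((m - mean) ** 2 for m in moved) / len(moved)) ** 0.5
print(f"transported mean={mean:.3f}, std={std:.3f}")
```

In this toy setting the transported samples should have mean close to 3 and standard deviation close to 0.5, matching the target. A generative model would replace the closed-form `velocity` with a network conditioned on inputs such as G-buffers and text prompts, and how GeRM does so is specified in the paper rather than here.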
