RealGen: Photorealistic Text-to-Image Generation via Detector-Guided Rewards

Junyan Ye; Leiqi Zhu; Yuncheng Guo; Dongzhi Jiang; Zilong Huang; Yifan Zhang; Zhiyuan Yan; Haohuan Fu; Conghui He; Weijia Li

arXiv:2512.00473 · cs.CV · December 2, 2025

TL;DR

RealGen is a novel framework that enhances photorealistic text-to-image generation by integrating detector-guided rewards and prompt optimization, significantly improving image realism and detail over existing models.

Contribution

The paper introduces RealGen, combining a detector-guided reward mechanism with diffusion models and prompt optimization to achieve superior photorealism in text-to-image synthesis.

Findings

1. RealGen outperforms GPT-Image-1 and Qwen-Image in realism and detail.

2. The Detector Reward effectively quantifies and reduces artifacts.

3. RealBench provides a human-free, accurate photorealism evaluation.

Abstract

With the continuous advancement of image generation technology, advanced models such as GPT-Image-1 and Qwen-Image have achieved remarkable text-to-image consistency and world knowledge. However, these models still fall short in photorealistic image generation. Even on simple T2I tasks, they tend to produce "fake" images with distinct AI artifacts, often characterized by "overly smooth skin" and "oily facial sheens". To recapture the original goal of "indistinguishable-from-reality" generation, we propose RealGen, a photorealistic text-to-image framework. RealGen integrates an LLM component for prompt optimization and a diffusion model for realistic image generation. Inspired by adversarial generation, RealGen introduces a "Detector Reward" mechanism, which quantifies artifacts and assesses realism using both semantic-level and feature-level synthetic image detectors. We leverage this…
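The abstract describes the Detector Reward only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes each detector returns a probability that an image is synthetic, and it assumes the two scores are combined by a weighted average; the names `detector_reward`, `semantic_detector`, `feature_detector`, and `semantic_weight` are hypothetical and do not come from the paper.

```python
from typing import Callable

# Hypothetical detector interface: takes an image (any object) and returns
# the estimated probability, in [0, 1], that the image is synthetic.
Detector = Callable[[object], float]


def _clamp01(value: float) -> float:
    """Clamp a detector score into the valid probability range [0, 1]."""
    return max(0.0, min(1.0, value))


def detector_reward(
    image: object,
    semantic_detector: Detector,
    feature_detector: Detector,
    semantic_weight: float = 0.5,  # assumed mixing weight, not from the paper
) -> float:
    """Return a realism reward in [0, 1]; higher means 'looks more real'.

    Assumption: the two detector scores are mixed by a weighted average.
    The source does not state how the scores are actually combined.
    """
    if not 0.0 <= semantic_weight <= 1.0:
        raise ValueError("semantic_weight must lie in [0, 1]")

    p_fake_semantic = _clamp01(semantic_detector(image))
    p_fake_feature = _clamp01(feature_detector(image))

    # Weighted estimate that the image is synthetic.
    p_fake = (
        semantic_weight * p_fake_semantic
        + (1.0 - semantic_weight) * p_fake_feature
    )

    # An image that fools both detectors earns a reward close to 1.
    return 1.0 - p_fake


if __name__ == "__main__":
    # Stub detectors standing in for real models.
    reward = detector_reward("placeholder image", lambda _: 0.2, lambda _: 0.6)
    print(f"reward = {reward:.2f}")
```

In a reinforcement-style fine-tuning loop, a scalar like this would be computed per generated image and used as the signal to maximize, so the generator is pushed toward outputs the detectors cannot flag as synthetic.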

Code & Models

Models

🤗 lokiz666/Realgen-detection-models (model · 2 dl · ♡ 17)

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Artificial Intelligence in Games