Too Vivid to Be Real? Benchmarking and Calibrating Generative Color Fidelity
Zhengyao Fang, Zexi Jia, Yijia Zhong, Pengcheng Luo, Jinchao Zhang, Guangming Lu, Jun Yu, Wenjie Pei

TL;DR
This paper introduces a new dataset and metric for objectively evaluating and improving color fidelity in realistic text-to-image generation, addressing biases in existing evaluation methods that favor overly vivid images.
Contribution
It presents the Color Fidelity Dataset and Metric, a training-free refinement method, and a progressive framework for assessing and enhancing color realism in T2I generation.
Findings
CFD contains over 1.3 million images with varying color realism levels.
CFM employs a multimodal encoder to learn perceptual color fidelity.
CFR adaptively improves color authenticity without additional training.
Abstract
Recent advances in text-to-image (T2I) generation have greatly improved visual quality, yet producing images that look authentic to real-world photography remains challenging. This is partly due to biases in existing evaluation paradigms: human ratings and preference-trained metrics often favor vivid images with exaggerated saturation and contrast, so generations are often too vivid to be real even when the prompt asks for a realistic style. To address this issue, we present the Color Fidelity Dataset (CFD) and the Color Fidelity Metric (CFM) for objective evaluation of color fidelity in realistic-style generations. CFD contains over 1.3M real and synthetic images with ordered levels of color realism, while CFM employs a multimodal encoder to learn perceptual color fidelity. In addition, we propose a training-free Color Fidelity Refinement (CFR) that adaptively modulates…
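To make "exaggerated saturation and contrast" concrete, the following is a minimal sketch of two hand-crafted color statistics. It is not the paper's CFM, which the abstract describes as a learned metric built on a multimodal encoder. The function names (`mean_saturation`, `rms_contrast`, `boost_saturation`) and the sample pixel values are hypothetical, chosen only for illustration.

```python
import colorsys
import math

def mean_saturation(pixels):
    """Mean HSV saturation over (r, g, b) tuples with channels in 0..255."""
    sats = [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[1] for r, g, b in pixels]
    return sum(sats) / len(sats)

def rms_contrast(pixels):
    """Standard deviation of Rec. 601 luma, with luma scaled to 0..1."""
    lumas = [(0.299 * r + 0.587 * g + 0.114 * b) / 255 for r, g, b in pixels]
    mean = sum(lumas) / len(lumas)
    return math.sqrt(sum((y - mean) ** 2 for y in lumas) / len(lumas))

def boost_saturation(pixels, factor):
    """Scale HSV saturation by `factor` (clipped to 1.0) to mimic an over-vivid image."""
    out = []
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        nr, ng, nb = colorsys.hsv_to_rgb(h, min(1.0, s * factor), v)
        out.append((round(nr * 255), round(ng * 255), round(nb * 255)))
    return out

# Hypothetical muted, photo-like pixels and an artificially vivid variant.
muted = [(120, 110, 100), (90, 100, 95), (200, 190, 180)]
vivid = boost_saturation(muted, 2.0)

print("muted saturation:", mean_saturation(muted))
print("vivid saturation:", mean_saturation(vivid))
print("muted contrast:  ", rms_contrast(muted))
```

Statistics like these capture only global color intensity. They cannot tell whether a given level of saturation is plausible for the scene content, which is presumably part of the motivation for a learned, perceptual metric.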
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image Enhancement Techniques · Visual Attention and Saliency Detection
