D-Judge: How Far Are We? Assessing the Discrepancies Between AI-synthesized and Natural Images through Multimodal Guidance

Renyang Liu; Ziyu Lyu; Wei Zhou; See-Kiong Ng

arXiv:2412.17632·cs.AI·August 12, 2025

TL;DR

This paper introduces D-Judge, a benchmark and dataset to evaluate how closely AI-generated images resemble natural images across multiple dimensions, revealing significant discrepancies and emphasizing the need for better alignment with human perception.

Contribution

The paper presents a large-scale multimodal dataset and a comprehensive evaluation framework to systematically assess discrepancies between AI-synthesized and natural images.

Findings

1. Substantial differences between AI-generated and natural images were found across visual quality, semantics, and aesthetics.

2. Alignment with human judgment is crucial for accurate assessment of AI-generated images.

3. The dataset and benchmark facilitate future research on improving the realism of AI-generated images.

Abstract

In the rapidly evolving field of Artificial Intelligence Generated Content (AIGC), a central challenge is distinguishing AI-synthesized images from natural ones. Despite the impressive capabilities of advanced generative models in producing visually compelling images, significant discrepancies remain when compared to natural images. To systematically investigate and quantify these differences, we construct a large-scale multimodal dataset, D-ANI, comprising 5,000 natural images and over 440,000 AI-generated image (AIGI) samples produced by nine representative models using both unimodal and multimodal prompts, including Text-to-Image (T2I), Image-to-Image (I2I), and Text-and-Image-to-Image (TI2I). We then introduce an AI-Natural Image Discrepancy assessment benchmark (D-Judge) to address the critical question: how far are AIGIs from truly realistic images? Our fine-grained evaluation…

Code & Models

Repositories

ryliu68/anid (PyTorch, official)

Datasets

Renyang/DANI (dataset, 130 downloads)

Taxonomy

Topics: Knowledge Management and Technology · Artificial Intelligence in Healthcare and Education