DiffuSyn Bench: Evaluating Vision-Language Models on Real-World Complexities with Diffusion-Generated Synthetic Benchmarks

Haokun Zhou; Yipeng Hong

arXiv:2406.04470·cs.CV·November 21, 2025

TL;DR

This paper introduces DiffuSyn Bench, an automated, diffusion-generated synthetic benchmark for evaluating how well vision-language models distinguish AI-generated from human-created images, revealing their limitations and biases in real-world scenarios.

Contribution

It presents a novel automated benchmark construction method using diffusion models, enabling scalable evaluation of vision-language models on complex, real-world image datasets.

Findings

01 LVLMs can partially distinguish AI and human images

02 LVLMs perform worse than humans in this task

03 The benchmark construction method is scalable and automatic

Abstract

This study assesses the ability of Large Vision-Language Models (LVLMs) to differentiate between AI-generated and human-generated images. It introduces a new automated benchmark construction method for this evaluation. The experiment compared common LVLMs with human participants using a mixed dataset of AI-generated and human-created images. Results showed that LVLMs could distinguish between the image types to some extent but exhibited a rightward bias and performed significantly worse than humans. To build on these findings, we developed an automated benchmark construction process using AI. This process involved topic retrieval, narrative script generation, error embedding, and image generation, creating a diverse set of text-image pairs with intentional errors. We validated our method by constructing two comparable benchmarks. This study highlights the strengths and weaknesses of…
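
The four construction stages named in the abstract (topic retrieval, narrative script generation, error embedding, image generation) can be pictured with a minimal sketch. Everything below is an illustrative assumption, not the authors' implementation: the `BenchmarkItem` type, the function names, the fixed script template, and the `ERROR_SWAPS` table are hypothetical stand-ins for the language-model calls the paper would use, and the final stage only builds the text prompt that would be passed to a diffusion model rather than generating an image.

```python
import random
from dataclasses import dataclass


@dataclass
class BenchmarkItem:
    topic: str
    script: str         # original narrative script
    flawed_script: str  # script with one intentional error embedded
    error_note: str     # ground-truth description of the embedded error
    image_prompt: str   # text that would be sent to a diffusion model


# Hypothetical substitution table; a real pipeline would likely ask a
# language model to propose the error instead.
ERROR_SWAPS = {"morning": "midnight", "two": "five", "red": "green"}


def retrieve_topics(pool, k, rng):
    """Stage 1 (stand-in): sample k topics from a fixed pool."""
    return rng.sample(pool, k)


def generate_script(topic):
    """Stage 2 (stand-in): fill a fixed template instead of calling an LLM."""
    return f"On a quiet morning two people discuss {topic} beside a red car."


def embed_error(script, rng):
    """Stage 3: replace exactly one swappable word and record what changed."""
    words = script.split()
    indices = [i for i, w in enumerate(words) if w in ERROR_SWAPS]
    if not indices:
        raise ValueError("script contains no swappable word")
    i = rng.choice(indices)
    original = words[i]
    words[i] = ERROR_SWAPS[original]
    note = f"'{original}' replaced with '{words[i]}'"
    return " ".join(words), note


def build_benchmark(pool, k, seed=0):
    """Run all stages; stage 4 only prepares the image prompt."""
    rng = random.Random(seed)
    items = []
    for topic in retrieve_topics(pool, k, rng):
        script = generate_script(topic)
        flawed, note = embed_error(script, rng)
        prompt = f"Illustration of: {flawed}"
        items.append(BenchmarkItem(topic, script, flawed, note, prompt))
    return items


if __name__ == "__main__":
    for item in build_benchmark(["urban gardening", "space travel", "chess"], 2):
        print(item.topic, "|", item.error_note)
```

Keeping the error note alongside each prompt is the point of the sketch: the embedded error serves as ground truth, so a model's answer about the resulting text-image pair can be scored without manual labeling.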

Taxonomy

Topics: Semantic Web and Ontologies · AI-based Problem Solving and Planning · Cognitive Science and Mapping

Methods: Sparse Evolutionary Training