Seeing is not always believing: Benchmarking Human and Model Perception of AI-Generated Images

Zeyu Lu; Di Huang; Lei Bai; Jingjing Qu; Chengyue Wu; Xihui Liu; Wanli Ouyang

arXiv:2304.13023·cs.AI·September 26, 2023·31 cites

Open Access · 3 Datasets · 1 Video

TL;DR

This paper benchmarks how well humans and AI detectors distinguish real photos from AI-generated images. Humans misclassify 38.7% of images, while the top detection models still fail on 13%, which highlights the need for better fake image detection.

Contribution

It introduces a large-scale fake image dataset and benchmarks for human and AI detection capabilities, providing insights into current limitations and performance gaps.

Findings

1. Humans have a 38.7% misclassification rate in distinguishing real from AI-generated images.

2. Top AI detection models have a 13% failure rate under the same evaluation conditions.

3. The study raises awareness of risks associated with AI-generated fake images.
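
For readers unfamiliar with the metric, the sketch below shows how a misclassification rate of the kind quoted above is typically computed from binary real/fake judgments. It is an illustration only, not the paper's evaluation code: the function name `misclassification_rate`, the label strings, and the five-image toy data are all assumptions made for this example.

```python
from typing import Sequence


def misclassification_rate(truth: Sequence[str], guesses: Sequence[str]) -> float:
    """Fraction of items whose guessed label differs from the true label.

    Hypothetical helper for illustration; labels are assumed to be
    plain strings such as "real" and "fake".
    """
    if len(truth) != len(guesses):
        raise ValueError("truth and guesses must have the same length")
    if not truth:
        raise ValueError("at least one judgment is required")
    wrong = sum(t != g for t, g in zip(truth, guesses))
    return wrong / len(truth)


# Toy data (invented for this example, not taken from the study):
# five images with their true labels and one participant's guesses.
truth = ["real", "fake", "fake", "real", "fake"]
guesses = ["real", "real", "fake", "fake", "fake"]

rate = misclassification_rate(truth, guesses)
print(f"{rate:.1%}")  # 2 of 5 judgments are wrong -> 40.0%
```

The same function applies to a detector's predictions, so a human rate and a model rate computed this way are directly comparable when both are scored on the same images.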

Abstract

Photos serve as a way for humans to record what they experience in their daily lives, and they are often regarded as trustworthy sources of information. However, there is a growing concern that the advancement of artificial intelligence (AI) technology may produce fake photos, which can create confusion and diminish trust in photographs. This study aims to comprehensively evaluate agents for distinguishing state-of-the-art AI-generated visual content. Our study benchmarks both human capability and cutting-edge fake image detection AI algorithms, using a newly collected large-scale fake image dataset Fake2M. In our human perception evaluation, titled HPBench, we discovered that humans struggle significantly to distinguish real photos from AI-generated ones, with a misclassification rate of 38.7%. Along with this, we conduct the model capability of AI-Generated images detection evaluation…

Videos

Seeing is not always believing: Benchmarking Human and Model Perception of AI-Generated Images · SlidesLive

Taxonomy

Topics: Misinformation and Its Impacts · Generative Adversarial Networks and Image Synthesis · Digital Media Forensic Detection