Is it Cake or is it AI? A Systematic Review of Human Uncertainty in Distinguishing Generative Artificial Intelligence Content
Mark Louie F. Ramos

TL;DR
This systematic review examines how well humans can distinguish AI-generated content from human-created content across text, image, and voice modalities. It finds that humans are generally unreliable detectors, which has implications for how content is evaluated.
Contribution
It synthesizes empirical evidence from 30 studies, highlighting the unreliability of humans in detecting AI content and raising questions about trust and evaluation.
Findings
Detection accuracy varies widely across studies.
Human accuracy in identifying AI content generally clusters around chance level.
The findings raise the question of whether the ability to detect AI content should matter for how content is evaluated or trusted.
Abstract
This systematic review synthesized empirical evidence on human ability to distinguish generative artificial intelligence content from human-produced content across text, image, and voice modalities. A structured search of Scopus identified 22,541 records from 2025 to 2026, of which 1,200 were screened and 30 studies were included. Across these studies, human detection accuracy varied widely but generally clustered around chance performance. Overall, the literature shows that humans are generally unreliable detectors of generative AI content, raising broader questions about whether the ability to tell should matter for how we evaluate or trust content.
