People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text

Jenna Russell; Marzena Karpinska; Mohit Iyyer

arXiv:2501.15654 · cs.CL · May 21, 2025 · 3 citations

TL;DR

People who frequently use ChatGPT for writing tasks detect AI-generated text accurately and robustly by reading alone, outperforming most of the commercial and open-source detectors evaluated, even when the text has been paraphrased or humanized.

Contribution

This study shows that annotators who frequently use LLMs for writing tasks excel at identifying AI-generated text, even without any specialized training or feedback.

Findings

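01 A majority vote among five expert annotators misclassifies only 1 of 300 articles

02 Experts outperform most commercial and open-source detectors, even under evasion tactics such as paraphrasing and humanization

03 Experts rely heavily on lexical clues ('AI vocabulary') but also pick up on more complex text phenomena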

Abstract

In this paper, we study how well humans can detect text generated by commercial LLMs (GPT-4o, Claude, o1). We hire annotators to read 300 non-fiction English articles, label them as either human-written or AI-generated, and provide paragraph-length explanations for their decisions. Our experiments show that annotators who frequently use LLMs for writing tasks excel at detecting AI-generated text, even without any specialized training or feedback. In fact, the majority vote among five such "expert" annotators misclassifies only 1 of 300 articles, significantly outperforming most commercial and open-source detectors we evaluated even in the presence of evasion tactics like paraphrasing and humanization. Qualitative analysis of the experts' free-form explanations shows that while they rely heavily on specific lexical clues ('AI vocabulary'), they also pick up on more complex phenomena…
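The majority-vote aggregation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names (`majority_vote`, `count_errors`), the label strings, and the toy votes below are all hypothetical, chosen only to show how five binary labels per article reduce to one decision and how misclassifications would then be counted.

```python
from collections import Counter

# Hypothetical label set; the paper's task is binary (human-written vs. AI-generated).
LABELS = {"human", "ai"}


def majority_vote(votes):
    """Return the label chosen by most annotators for one article.

    Requires an odd number of binary votes so that ties cannot occur.
    """
    if len(votes) % 2 == 0:
        raise ValueError("need an odd number of votes to rule out ties")
    if not set(votes) <= LABELS:
        raise ValueError("votes must be 'human' or 'ai'")
    return Counter(votes).most_common(1)[0][0]


def count_errors(votes_per_article, gold_labels):
    """Count articles whose majority label disagrees with the gold label."""
    return sum(
        majority_vote(votes) != gold
        for votes, gold in zip(votes_per_article, gold_labels)
    )


# Toy data (illustrative only, not taken from the paper): 3 articles, 5 annotators each.
toy_votes = [
    ["ai", "ai", "human", "ai", "ai"],
    ["human", "human", "human", "ai", "human"],
    ["ai", "human", "human", "ai", "human"],
]
toy_gold = ["ai", "human", "ai"]

errors = count_errors(toy_votes, toy_gold)
print(f"{errors} of {len(toy_gold)} toy articles misclassified")
```

With an odd panel and two labels, a single dissenting annotator never changes the outcome, which is one reason a majority vote can be more reliable than any individual annotator.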

Code & Models

Repositories

jenna-russell/human_detectors (PyTorch, official)

Videos

People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text · underline

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education