Technical Report on the Pangram AI-Generated Text Classifier
Bradley Emi, Max Spero

TL;DR
This paper introduces Pangram Text, a transformer-based AI text classifier that significantly outperforms existing tools across diverse domains and models, with low false positive rates and broad generalization.
Contribution
The paper presents a novel transformer-based classifier trained with hard negative mining, achieving superior accuracy and generalization in AI-generated text detection.
Findings
Outperforms zero-shot detection methods and commercial tools
Achieves over 38 times lower error rates than those baselines on a comprehensive benchmark spanning multiple text domains and 8 large language models
Maintains low false positives and generalizes well to unseen domains and models
Abstract
We present Pangram Text, a transformer-based neural network trained to distinguish text written by large language models from text written by humans. Pangram Text outperforms zero-shot methods such as DetectGPT as well as leading commercial AI detection tools with over 38 times lower error rates on a comprehensive benchmark comprising 10 text domains (student writing, creative writing, scientific writing, books, encyclopedias, news, email, scientific papers, short-form Q&A) and 8 open- and closed-source large language models. We propose a training algorithm, hard negative mining with synthetic mirrors, that enables our classifier to achieve orders of magnitude lower false positive rates on high-data domains such as reviews. Finally, we show that Pangram Text is not biased against nonnative English speakers and generalizes to domains and models unseen during training.
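The abstract names hard negative mining with synthetic mirrors but does not spell out the procedure. The sketch below is one plausible reading, not the authors' implementation: train a classifier, find human texts it wrongly flags as AI (the hard negatives), pair each with an AI-written counterpart on the same content (the mirror), add both to the training set, and retrain. Everything here is an assumption made for illustration: the bag-of-words model stands in for the transformer, `make_mirror` is a hypothetical stub standing in for a call to a language model, and the threshold and round count are arbitrary.

```python
import math
from collections import Counter

HUMAN, AI = 0, 1


def train(examples):
    """Fit a toy bag-of-words model: per-class word counts."""
    counts = {HUMAN: Counter(), AI: Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts


def ai_score(model, text):
    """Return a score in (0, 1); higher means 'more likely AI-written'."""
    vocab = set(model[HUMAN]) | set(model[AI])
    # Add-one smoothing, with one extra slot for unseen words.
    totals = {c: sum(model[c].values()) + len(vocab) + 1 for c in (HUMAN, AI)}
    log_odds = 0.0
    for word in text.lower().split():
        log_odds += math.log((model[AI][word] + 1) / totals[AI])
        log_odds -= math.log((model[HUMAN][word] + 1) / totals[HUMAN])
    log_odds = max(-50.0, min(50.0, log_odds))  # keep exp() in range
    return 1.0 / (1.0 + math.exp(-log_odds))


def make_mirror(human_text):
    """Hypothetical stub: a real pipeline would presumably prompt an LLM
    to write a comparable text. Here we only prepend a stock phrase."""
    return "in conclusion it is important to note " + human_text


def train_with_mining(seed, human_pool, rounds=2, threshold=0.5):
    """Retrain on mined false positives plus their synthetic mirrors."""
    data = list(seed)
    model = train(data)
    for _ in range(rounds):
        seen = {text for text, _ in data}
        # Hard negatives: human texts the current model flags as AI.
        hard = [t for t in human_pool
                if t not in seen and ai_score(model, t) > threshold]
        if not hard:
            break
        for text in hard:
            data.append((text, HUMAN))
            data.append((make_mirror(text), AI))
        model = train(data)
    return model, data


seed = [
    ("the cat sat on the mat", HUMAN),
    ("it is important to note that results vary", AI),
]
pool = ["it is important to feed the cat"]

before = ai_score(train(seed), pool[0])
model, data = train_with_mining(seed, pool)
after = ai_score(model, pool[0])
print(f"score before mining: {before:.3f}, after: {after:.3f}")
```

In this toy run the pooled human sentence is initially flagged as AI, and its score falls once it and its mirror are added to the training data. Pairing each mined human text with an AI counterpart keeps the added examples balanced between the two classes, which is the intuition the term "mirror" suggests.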
Taxonomy
Topics: Text and Document Classification Technologies
