Detecting LLM-Generated Text with Performance Guarantees

Hongyi Zhou; Jin Zhu; Ying Yang; Chengchun Shi

arXiv:2601.06586·cs.CL·January 13, 2026

Open Access

TL;DR

This paper introduces a classifier that determines whether a piece of text was written by a large language model or a human. It reports higher accuracy than existing detectors, supports statistical inference, and needs no auxiliary information such as watermarks, addressing concerns over LLM misuse.

Contribution

The paper presents an LLM detection method that does not depend on watermarks or on knowledge of the specific model that generated the text, and that offers improved accuracy together with statistical inference.

Findings

01. Achieves higher classification accuracy than existing detectors

02. Maintains type-I error control and high statistical power

03. Operates efficiently on an online platform
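
As a rough illustration of what "type-I error control" means for a detector, the sketch below calibrates a decision cutoff on scores from known human-written text, so that a new human-written text is wrongly flagged with probability at most alpha. This is not the authors' procedure, which this page does not describe. It is a generic quantile-calibration sketch, and the names calibrate_threshold and flag_as_llm, the Gaussian scores, and the assumption that higher scores mean "more LLM-like" are all hypothetical.

```python
import math
import random


def calibrate_threshold(human_scores, alpha):
    """Pick a cutoff from scores of known human-written texts.

    Assumption: a new human-written text's score is exchangeable with the
    calibration scores. Under that assumption, the chance that it exceeds
    the returned cutoff (a type-I error) is at most alpha.
    """
    n = len(human_scores)
    # Rank of the order statistic used as the cutoff (1-indexed).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration texts for this alpha: never flag anything.
        return math.inf
    return sorted(human_scores)[k - 1]


def flag_as_llm(score, threshold):
    """Flag a text as LLM-generated when its score exceeds the cutoff."""
    return score > threshold


# Hypothetical scores: human text centred at 0, LLM text centred at 2.5.
rng = random.Random(0)
calibration = [rng.gauss(0.0, 1.0) for _ in range(2000)]
threshold = calibrate_threshold(calibration, alpha=0.05)

fresh_human = [rng.gauss(0.0, 1.0) for _ in range(5000)]
fresh_llm = [rng.gauss(2.5, 1.0) for _ in range(5000)]

# Empirical type-I error: share of human texts wrongly flagged.
type1_rate = sum(flag_as_llm(s, threshold) for s in fresh_human) / len(fresh_human)
# Empirical power: share of LLM texts correctly flagged.
power = sum(flag_as_llm(s, threshold) for s in fresh_llm) / len(fresh_llm)

print(f"threshold={threshold:.3f} type-I rate={type1_rate:.3f} power={power:.3f}")
```

In this toy setup the empirical type-I rate should land near the target of 0.05, while the power depends entirely on how well the scores separate the two classes.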

Abstract

Large language models (LLMs) such as GPT, Claude, Gemini, and Grok have been deeply integrated into our daily lives. They now support a wide range of tasks -- from dialogue and email drafting to assisting with teaching and coding, serving as search engines, and much more. However, their ability to produce highly human-like text raises serious concerns, including the spread of fake news, the generation of misleading governmental reports, and academic misconduct. To address this practical problem, we train a classifier to determine whether a piece of text is authored by an LLM or a human. Our detector is deployed on an online CPU-based platform (https://huggingface.co/spaces/stats-powered-ai/StatDetectLLM) and offers three novelties over existing detectors: (i) it does not rely on auxiliary information, such as watermarks or knowledge of the specific LLM used to generate the text; (ii) it…

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Topic Modeling · Academic integrity and plagiarism