Interpretable Text Classification Applied to the Detection of LLM-generated Creative Writing

Minerva Suvanto; Andrea McGlinchey; Mattias Wahde; Peter J Barclay

arXiv:2601.07368·cs.CL·January 13, 2026

Open Access

TL;DR

Machine-learning models distinguish human-written creative fiction from LLM-generated text with test accuracy of 0.93 to 0.98, a task on which human observers perform near chance. The paper then uses an interpretable linear classifier to identify the unigram features that make this detection possible.

Contribution

The study applies an inherently interpretable linear classifier (test accuracy 0.98) to the detection of LLM-generated text and identifies the specific unigram features that drive its decisions.

Findings

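01 A variety of machine-learning models reach accuracy of 0.93 to 0.98 on a previously unseen test set, even with short samples and unigram features only.

02 One of the most important interpretable features is synonym variety: the LLM tends to use a larger variety of synonyms, which skews the unigram probability distributions.

03 Detection features are robust and hard to evade.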

Abstract

We consider the problem of distinguishing human-written creative fiction (excerpts from novels) from similar text generated by an LLM. Our results show that, while human observers perform poorly (near chance levels) on this binary classification task, a variety of machine-learning models achieve accuracy in the range 0.93 - 0.98 over a previously unseen test set, even using only short samples and single-token (unigram) features. We therefore employ an inherently interpretable (linear) classifier (with a test accuracy of 0.98), in order to elucidate the underlying reasons for this high accuracy. In our analysis, we identify specific unigram features indicative of LLM-generated text, one of the most important being that the LLM tends to use a larger variety of synonyms, thereby skewing the probability distributions in a manner that is easy to detect for a machine learning classifier, yet…
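
The idea of an interpretable linear classifier over unigram features can be illustrated with a minimal sketch. This is not the authors' implementation: the tokenizer, the plain gradient-descent training loop, the hyperparameters, and the six toy sentences are all assumptions made up for illustration, and the names `unigram_counts`, `train_logreg`, and `predict` are hypothetical. The point is only that each token receives one coefficient whose sign and size can be read directly.

```python
import math
from collections import Counter


def unigram_counts(text):
    """Count lowercase whitespace-separated tokens (a deliberately simple tokenizer)."""
    return Counter(text.lower().split())


def train_logreg(samples, labels, epochs=200, lr=0.1):
    """Fit logistic regression on unigram counts with per-sample gradient steps.

    Returns (weights, bias), where weights maps each token to its coefficient.
    """
    feats = [unigram_counts(s) for s in samples]
    weights = {tok: 0.0 for f in feats for tok in f}
    bias = 0.0
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            z = bias + sum(weights[t] * c for t, c in f.items())
            p = 1.0 / (1.0 + math.exp(-z))  # predicted probability of label 1
            err = y - p
            bias += lr * err
            for t, c in f.items():
                weights[t] += lr * err * c
    return weights, bias


def predict(weights, bias, text):
    """Return 1 (LLM-like) if the linear score is positive, else 0 (human-like)."""
    counts = unigram_counts(text)
    z = bias + sum(weights.get(t, 0.0) * c for t, c in counts.items())
    return 1 if z > 0 else 0


# Toy data, invented for this sketch (NOT from the paper): label 0 = human-like,
# label 1 = LLM-like. The label-1 sentences swap "said" for varied synonyms.
samples = [
    "she said no and walked out",
    "she murmured no and strode out",
    "he said he was tired",
    "he whispered he was weary",
    "they said it was late",
    "they remarked it was late",
]
labels = [0, 1, 0, 1, 0, 1]

weights, bias = train_logreg(samples, labels)

# Interpretation step: list tokens from most human-like to most LLM-like.
for token, coef in sorted(weights.items(), key=lambda kv: kv[1]):
    print(f"{token:10s} {coef:+.3f}")
```

In this toy setting, "said" ends up with a negative coefficient and its varied replacements with positive ones, which mirrors in miniature the synonym-variety effect the abstract describes. A real experiment would need a proper corpus, a held-out test set, and regularization.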

Taxonomy

Topics: Authorship Attribution and Profiling · Text Readability and Simplification · Artificial Intelligence in Games