A Lightweight Approach to Detection of AI-Generated Texts Using Stylometric Features
Sergey K. Aityan, William Claster, Karthik Sai Emani, Sohni Rais, Thy Tran

TL;DR
This paper presents NEULIF, a lightweight, computationally efficient method for detecting AI-generated texts using stylometric and readability features, achieving high accuracy comparable to complex models.
Contribution
Introduces NEULIF, a novel lightweight detection approach that combines stylometric features with simple classifiers, outperforming existing lightweight methods in accuracy and efficiency.
Findings
Achieves 97% accuracy with a CNN on the Kaggle AI vs. Human corpus
Models are significantly smaller and faster than transformer-based methods
High potential for cross-language and real-time applications
Abstract
The growing volume of AI-generated text raises serious concerns. Most existing approaches to AI-generated text detection rely on fine-tuning large transformer models or building ensembles, which are computationally expensive and often offer limited generalization across domains. Existing lightweight alternatives achieve significantly lower accuracy on large datasets. We introduce NEULIF, a lightweight approach that achieves the best performance in the lightweight detector class: it does not require extensive computational power, yet it provides high detection accuracy. In our approach, a text is first decomposed into stylometric and readability features, which are then used for classification by a compact Convolutional Neural Network (CNN) or Random Forest (RF). Evaluated on the Kaggle AI vs. Human corpus, our models achieve 97% accuracy (~0.95 F1) for CNN and 95% accuracy (~0.94…
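The feature-decomposition step described in the abstract can be illustrated with a minimal sketch. The excerpt here does not list NEULIF's actual feature set, so the four features below (average sentence length, average word length, type-token ratio, and Flesch Reading Ease) are assumptions chosen as common stylometric and readability measures. The function names and the vowel-group syllable heuristic are likewise hypothetical, not taken from the paper.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per vowel group, minimum 1."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def stylometric_features(text: str) -> dict:
    """Map a text to a small dictionary of illustrative style features.

    These features are examples only; they are not NEULIF's documented set.
    """
    # Naive sentence split on terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Naive word tokenization: runs of letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")

    n_words = len(words)
    n_sents = len(sentences)
    n_syll = sum(count_syllables(w) for w in words)

    return {
        "avg_sentence_len": n_words / n_sents,
        "avg_word_len": sum(len(w) for w in words) / n_words,
        "type_token_ratio": len({w.lower() for w in words}) / n_words,
        # Standard Flesch Reading Ease formula.
        "flesch_reading_ease": 206.835
        - 1.015 * (n_words / n_sents)
        - 84.6 * (n_syll / n_words),
    }


if __name__ == "__main__":
    print(stylometric_features("The cat sat. The dog ran fast."))
```

A fixed-length vector built from such a dictionary could then be passed to a compact classifier, for example a Random Forest, in place of raw text. That is the general idea behind keeping the detector small; the paper's own architecture and hyperparameters are not reproduced here.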
Taxonomy
Topics: Authorship Attribution and Profiling · Text Readability and Simplification · Topic Modeling
