Human vs. Machine Deception: Distinguishing AI-Generated and Human-Written Fake News Using Ensemble Learning
Samuel Jaeger, Calvin Ibeneye, Aya Vera-Jimenez, Dhrubajyoti Ghosh

TL;DR
This paper investigates linguistic, structural, and emotional differences between AI-generated and human-written fake news, demonstrating that ensemble learning models can reliably distinguish between the two based on stylistic features.
Contribution
It introduces a comprehensive feature set and ensemble classification framework to effectively differentiate AI-generated fake news from human-written misinformation.
Findings
Readability features are the most informative predictors.
Ensemble models outperform individual classifiers.
AI-generated text shows more stylistic uniformity.
Abstract
The rapid adoption of large language models has introduced a new class of AI-generated fake news that coexists with traditional human-written misinformation, raising important questions about how these two forms of deceptive content differ and how reliably they can be distinguished. This study examines linguistic, structural, and emotional differences between human-written and AI-generated fake news and evaluates machine learning and ensemble-based methods for distinguishing these content types. A document-level feature representation is constructed using sentence structure, lexical diversity, punctuation patterns, readability indices, and emotion-based features capturing affective dimensions such as fear, anger, joy, sadness, trust, and anticipation. Multiple classification models, including logistic regression, random forest, support vector machines, extreme gradient boosting, and a…
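The document-level features named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the regex tokenization, the vowel-group syllable heuristic, and the choice of Flesch Reading Ease as the readability index are all assumptions made for illustration. Emotion features are left out because they would need an external emotion lexicon.

```python
import re
import string


def count_syllables(word: str) -> int:
    """Rough heuristic (assumption): count runs of vowels, at least one per word."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def extract_features(text: str) -> dict:
    """Hypothetical helper: compute a few stylistic features for one document."""
    # Naive sentence split on terminal punctuation; empty fragments are dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)

    # Guard against division by zero on empty input.
    n_sent = max(1, len(sentences))
    n_words = max(1, len(words))

    words_per_sentence = len(words) / n_sent
    syllables_per_word = sum(count_syllables(w) for w in words) / n_words
    punct_count = sum(ch in string.punctuation for ch in text)

    return {
        # Sentence structure: mean sentence length in words.
        "avg_sentence_length": words_per_sentence,
        # Lexical diversity: type-token ratio over lowercased words.
        "type_token_ratio": len({w.lower() for w in words}) / n_words,
        # Punctuation pattern: punctuation marks per word.
        "punctuation_per_word": punct_count / n_words,
        # Readability: Flesch Reading Ease (standard published formula).
        "flesch_reading_ease": 206.835
        - 1.015 * words_per_sentence
        - 84.6 * syllables_per_word,
    }


if __name__ == "__main__":
    for name, value in extract_features("The cat sat. The dog ran!").items():
        print(f"{name}: {value:.3f}")
```

Each document maps to one fixed-length feature dictionary, so the output can be stacked into a matrix and passed to any of the classifiers the abstract lists.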
