An Optimized Machine Learning Classifier for Detecting Fake Reviews Using Extracted Features

Shabbir Anees; Anshuman; Ayush Chaurasia; Prathmesh Bogar

arXiv:2511.21716·cs.CL·December 1, 2025

TL;DR

This paper presents a machine learning system that detects AI-generated fake reviews by combining text preprocessing, feature selection with Harris Hawks Optimization, and stacking ensemble classification, reaching 95.40% accuracy on a public dataset of 40,432 reviews.

Contribution

It introduces a novel combination of feature extraction, bio-inspired optimization, and ensemble learning for improved detection of computer-generated reviews.

Findings

01. Achieved 95.40% accuracy in detecting AI-generated reviews.

02. Reduced the feature set by 89.9% using Harris Hawks Optimization.

03. Demonstrated effectiveness of ensemble learning combined with bio-inspired optimization.

Abstract

It is well known that fraudulent reviews undermine the legitimacy and dependability of online purchases. The most recent development misleading customers is the appearance of computer-generated (CG) reviews that pass as human-written ones. In this work, we present a machine-learning-based system that identifies these AI-produced reviews with high precision. Our method integrates advanced text preprocessing, multi-modal feature extraction, Harris Hawks Optimization (HHO) for feature selection, and a stacking ensemble classifier. We applied this methodology to a public dataset of 40,432 Original (OR) and Computer-Generated (CG) reviews. From an initial set of 13,539 features, HHO selected the 1,368 most relevant features, an 89.9% dimensionality reduction. Our final stacking model achieved 95.40% accuracy, 92.81% precision, 95.01% recall, and a 93.90%…
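The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is not from the paper; the helper names `reduction_percent` and `harmonic_mean_percent` are hypothetical and exist only for illustration. It recomputes the dimensionality reduction from the two feature counts, and it also computes the harmonic mean of the reported precision and recall. That harmonic mean comes out at roughly 93.90%, which agrees with the truncated final figure, although the abstract as shown here is cut off before naming the metric.

```python
def reduction_percent(n_before: int, n_after: int) -> float:
    """Percentage of features removed when going from n_before to n_after."""
    if n_before <= 0 or not 0 <= n_after <= n_before:
        raise ValueError("expected 0 <= n_after <= n_before and n_before > 0")
    return 100.0 * (n_before - n_after) / n_before


def harmonic_mean_percent(precision: float, recall: float) -> float:
    """Harmonic mean of two percentages (the usual F1 formula)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Values quoted in the abstract.
initial_features = 13_539
selected_features = 1_368
precision = 92.81
recall = 95.01

reduction = reduction_percent(initial_features, selected_features)
f1_like = harmonic_mean_percent(precision, recall)

print(f"dimensionality reduction: {reduction:.1f}%")
print(f"harmonic mean of precision and recall: {f1_like:.2f}%")
```

The first value rounds to the 89.9% stated in the abstract, so the feature counts and the reported reduction are mutually consistent.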

Taxonomy

Topics: Spam and Phishing Detection · Misinformation and Its Impacts · Hate Speech and Cyberbullying Detection