An Explainable XGBoost-based Approach on Assessing Detection of Deception and Disinformation
Alex V Mbaziira, Maha F Sabir

TL;DR
This paper develops hybrid machine learning models using XGBoost to detect deception and disinformation across various online content types, analyzing feature impacts with SHAP for interpretability.
Contribution
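It introduces four novel hybrid XGBoost models, each trained on disinformation combined with one other deception dataset (fraud, scams, favorable fake reviews, or unfavorable fake reviews), reports their detection accuracy, and provides insight into feature importance.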
Findings
Models achieved 75-85% accuracy in deception detection.
SHAP analysis identified key features influencing predictions.
Hybrid models outperform single-source models in detecting disinformation.
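To make the SHAP finding more concrete, the sketch below computes exact Shapley values, the quantity that SHAP attributions are based on, for a single prediction. It is only an illustration under stated assumptions: the feature names, the `toy_score` function, and the all-zero baseline are hypothetical stand-ins and are not taken from the paper. With a trained XGBoost model one would normally rely on the `shap` library rather than brute-force enumeration, which is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names, chosen for illustration only (not from the paper).
FEATURES = ["negative_emotion", "word_count", "pronoun_rate"]


def toy_score(x):
    """Hypothetical stand-in for a trained model's output (a deception score)."""
    return (
        0.5 * x["negative_emotion"]
        + 0.1 * x["word_count"]
        + 0.3 * x["negative_emotion"] * x["pronoun_rate"]
    )


def value(subset, x, baseline):
    """Score when features in `subset` come from x and the rest from baseline."""
    mixed = {f: (x[f] if f in subset else baseline[f]) for f in FEATURES}
    return toy_score(mixed)


def shapley_values(x, baseline):
    """Exact Shapley value of each feature for the prediction on x."""
    n = len(FEATURES)
    phi = {}
    for f in FEATURES:
        others = [g for g in FEATURES if g != f]
        total = 0.0
        # Average the marginal contribution of f over all subsets of the others.
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {f}, x, baseline) - value(s, x, baseline))
        phi[f] = total
    return phi


# Assumed example instance and baseline (both hypothetical).
example = {"negative_emotion": 1.0, "word_count": 2.0, "pronoun_rate": 1.0}
baseline = {f: 0.0 for f in FEATURES}

for name, contribution in shapley_values(example, baseline).items():
    print(f"{name}: {contribution:.3f}")
```

The contributions sum to the difference between the score for the example and the score for the baseline. That additivity is what allows a per-feature reading of an individual prediction.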
Abstract
Threat actors continue to exploit geopolitical and global public events to launch aggressive campaigns propagating disinformation over the Internet. In this paper we extend our prior research in detecting disinformation using psycholinguistic and computational linguistic processes linked to deception and cybercrime to gain an understanding of the features that impact the predictive outcome of machine learning models. We attempt to determine patterns of deception in disinformation in hybrid models trained on disinformation and scams, fake positive and negative online reviews, or fraud using the eXtreme Gradient Boosting machine learning algorithm. Four hybrid models are generated: models trained on disinformation and fraud (DIS+EN), disinformation and scams (DIS+FB), disinformation and favorable fake reviews (DIS+POS) and disinformation and unfavorable fake reviews…
Taxonomy
Topics: Information and Cyber Security · Deception detection and forensic psychology · Network Security and Intrusion Detection
