Detection of ChatGPT Fake Science with the xFakeSci Learning Algorithm

Ahmed Abdeen Hamed; Xindong Wu

arXiv:2308.11767·cs.CL·April 16, 2024·2 cites

Open Access

TL;DR

This paper introduces xFakeSci, a novel learning algorithm that distinguishes ChatGPT-generated scientific articles from authentic publications. It achieves F1 scores of 80-94% and outperforms classical data mining methods across multiple diseases and publication periods.

Contribution

The study presents xFakeSci, a new algorithm that effectively detects AI-generated scientific content, demonstrating superior performance over traditional methods in various conditions and datasets.

Findings

01. xFakeSci achieved F1 scores of 80-94%.

02. It outperformed classical algorithms such as SVM, Regression, and Naive Bayes.

03. Calibration and proximity heuristics enhanced detection accuracy.
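
For readers unfamiliar with the metric behind these findings, the following is a minimal sketch of how an F1 score is computed for a binary "generated vs. authentic" labeling task. The label names, the `f1_score` helper, and the prediction counts are hypothetical and chosen only for illustration; they are not results from the paper.

```python
def f1_score(y_true, y_pred, positive="generated"):
    """Return the F1 score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    if tp == 0:
        return 0.0  # precision and recall are both zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical test set: 50 generated and 50 authentic documents.
y_true = ["generated"] * 50 + ["authentic"] * 50
# Hypothetical predictions: 45 generated documents caught, 5 missed,
# and 10 authentic documents wrongly flagged as generated.
y_pred = (["generated"] * 45 + ["authentic"] * 5
          + ["generated"] * 10 + ["authentic"] * 40)

score = f1_score(y_true, y_pred)
print(f"F1 = {score:.3f}")
```

With these made-up counts, precision is 45/55 and recall is 45/50, so the score falls inside the 80-94% range quoted above purely by construction.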

Abstract

Generative AI tools exemplified by ChatGPT are becoming a new reality. This study is motivated by the premise that "AI-generated content may exhibit a distinctive behavior that can be separated from scientific articles". In this study, we show how articles can be generated by means of prompt engineering for various diseases and conditions. We then show how we tested this premise in two phases and proved its validity. Subsequently, we introduce xFakeSci, a novel learning algorithm that is capable of distinguishing ChatGPT-generated articles from publications produced by scientists. The algorithm is trained using network models derived from both sources. The classification step was performed using 300 articles per condition. The actual labeling step took place against an equal mix of 50 generated articles and 50 authentic PubMed abstracts. The testing also spanned publication…
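
The abstract says only that training uses network models built from both sources. As a rough illustration of that general idea, and explicitly not the authors' xFakeSci algorithm, the sketch below assumes each network is a set of word-bigram edges and labels a document by which network shares more edges with it. The bigram construction, the overlap rule, the function names, and the toy sentences are all assumptions made for this example.

```python
import re


def bigram_edges(text):
    """Return the set of adjacent-word pairs (edges) in a text."""
    words = re.findall(r"[a-z]+", text.lower())
    return set(zip(words, words[1:]))


def build_network(docs):
    """Union the bigram edges of all training documents from one source."""
    edges = set()
    for doc in docs:
        edges |= bigram_edges(doc)
    return edges


def classify(text, generated_net, authentic_net):
    """Label a document by which training network shares more edges with it."""
    edges = bigram_edges(text)
    gen_overlap = len(edges & generated_net)
    auth_overlap = len(edges & authentic_net)
    # Ties fall to "authentic"; this tie-break is an arbitrary choice.
    return "generated" if gen_overlap > auth_overlap else "authentic"


# Toy training data, invented for illustration only.
generated_docs = [
    "alzheimer disease is a progressive disorder",
    "the disease is a common disorder",
]
authentic_docs = [
    "we measured amyloid plaques in patient cohorts",
    "patient cohorts showed reduced amyloid burden",
]

generated_net = build_network(generated_docs)
authentic_net = build_network(authentic_docs)

label_a = classify("this disease is a rare disorder", generated_net, authentic_net)
label_b = classify("amyloid plaques in patient cohorts", generated_net, authentic_net)
print(label_a, label_b)
```

A raw edge-overlap count is about the simplest possible decision rule; the calibration and proximity heuristics mentioned in the findings are not modeled here.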

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Academic integrity and plagiarism · Misinformation and Its Impacts