Detecting AI-Generated Text in Educational Content: Leveraging Machine Learning and Explainable AI for Academic Integrity

Ayat A. Najjar, Huthaifa I. Ashqar, Omar A. Darwish, and Eman Hammad

arXiv:2501.03203 · cs.CL · January 7, 2025 · 2 cites

Open Access

TL;DR

This paper develops machine learning and explainable AI tools to detect AI-generated educational content, introduces a new dataset, and demonstrates improved accuracy over existing systems like GPTZero.

Contribution

The creation of the CyberHumanAI dataset and the development of a fine-tuned ML model that outperforms GPTZero in classifying AI-generated and human-written educational texts.

Findings

01

XGBoost and Random Forest achieve 83% and 81% accuracy, respectively.

02

Shorter texts are more challenging to classify.

03

Explainable AI reveals linguistic features distinguishing human and AI texts.

Abstract

This study seeks to enhance academic integrity by providing tools to detect AI-generated content in student work using advanced technologies. The findings promote transparency and accountability, helping educators maintain ethical standards and supporting the responsible integration of AI in education. A key contribution of this work is the CyberHumanAI dataset, which contains 1,000 observations: 500 written by humans and 500 produced by ChatGPT. We evaluate various machine learning (ML) and deep learning (DL) algorithms on the CyberHumanAI dataset, comparing human-written content with content generated by a Large Language Model (LLM), namely ChatGPT. Results demonstrate that traditional ML algorithms, specifically XGBoost and Random Forest, achieve high performance (83% and 81% accuracy, respectively). Results also show that classifying shorter content…
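The classification setup described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not say which features were used, so the TF-IDF representation, the hyperparameters, the helper name `build_detector`, and the toy sentences are all assumptions. The toy sentences are hypothetical and are not drawn from the CyberHumanAI dataset. scikit-learn's Random Forest stands in for the tree-based models the paper evaluates.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline

# Hypothetical toy corpus for illustration only (NOT from CyberHumanAI).
texts = [
    "honestly i just patched the router last night and hoped for the best",
    "we got phished twice this year, so now everyone double checks links",
    "my old laptop still runs the firewall script i wrote in college",
    "Phishing is a type of cyberattack that aims to deceive users into revealing data.",
    "A firewall is a security system that monitors and controls network traffic.",
    "Encryption is the process of converting information into a secure format.",
]
labels = [0, 0, 0, 1, 1, 1]  # 0 = human-written, 1 = AI-generated


def build_detector(seed: int = 0):
    """Return an untrained TF-IDF + Random Forest text classifier (assumed setup)."""
    return make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2)),  # word unigrams and bigrams
        RandomForestClassifier(n_estimators=100, random_state=seed),
    )


detector = build_detector()
detector.fit(texts, labels)

# Classify an unseen snippet and inspect the per-class probabilities.
sample = ["Malware is software that is designed to damage or disrupt systems."]
predictions = detector.predict(sample)
probabilities = detector.predict_proba(sample)
print(predictions, probabilities)
```

On a corpus this small the output says nothing about real detection quality. With a real dataset, accuracy would be measured on a held-out split rather than on the training texts.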

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI)