Stylometry Analysis of Human and Machine Text for Academic Integrity
Hezam Albaqami, Muhammad Asif Ayub, Nasir Ahmad, Yaseen Ahmad, Mohammed M. Alqahtani, Abdullah M. Algamdi, Almoaid A. Owaidah, Kashif Ahmad

TL;DR
This paper presents a comprehensive NLP-based framework for distinguishing human from machine-generated text and detecting author changes to uphold academic integrity, evaluated on datasets with varying prompt strictness.
Contribution
It introduces a multi-task analysis of human and machine text, including author attribution and style change detection, with publicly available datasets and code for future research.
Findings
Performance drops on strict prompt datasets highlight detection challenges.
Proposed methods effectively differentiate human and machine text.
Datasets and tools are publicly accessible for benchmarking.
Abstract
This work addresses critical challenges to academic integrity, including plagiarism, fabrication, and verification of authorship of educational content, by proposing a Natural Language Processing (NLP)-based framework for authenticating students' content through author attribution and style change detection. Despite some initial efforts, several aspects of the topic are yet to be explored. In contrast to existing solutions, the paper provides a comprehensive analysis of the topic by targeting four relevant tasks: (i) classification of human and machine text, (ii) differentiation between single- and multi-authored documents, (iii) author change detection within multi-authored documents, and (iv) author recognition in collaboratively produced documents. The solutions proposed for the tasks are evaluated on two datasets generated with Gemini using two different prompts, including a…
Taxonomy
Topics: Academic integrity and plagiarism · Authorship Attribution and Profiling · Hate Speech and Cyberbullying Detection
