Game of Tones: Faculty detection of GPT-4 generated content in university assessments

Mike Perkins (1), Jasper Roe (2), Darius Postma (1), James McGaughran (1), Don Hickerson (1) ((1) British University Vietnam, Vietnam; (2) James Cook University Singapore, Singapore)

arXiv:2305.18081·cs.CY·November 2, 2023·25 cites

Open Access

TL;DR

This study assesses how well faculty can detect GPT-4 generated content in university assessments, revealing limitations of current AI detection tools and suggesting strategies to maintain academic integrity.

Contribution

It provides empirical evidence on the effectiveness and limitations of AI detection tools in academic settings and highlights the need for improved detection methods and assessment strategies.

Findings

01

The Turnitin AI detection tool flagged 91% of the GPT-4 generated submissions as containing some AI-generated content, but detected only 54.8% of that content overall

02

Faculty reported 54.5% of AI-generated submissions as misconduct

03

AI-generated content scored similarly to genuine submissions in assessments

Abstract

This study explores the robustness of university assessments against the use of OpenAI's Generative Pre-trained Transformer 4 (GPT-4) generated content and evaluates the ability of academic staff to detect its use when supported by the Turnitin Artificial Intelligence (AI) detection tool. Twenty-two GPT-4 generated submissions were created and included in the assessment process, to be marked by fifteen different faculty members. The study reveals that although the detection tool identified 91% of the experimental submissions as containing some AI-generated content, the total detected content was only 54.8%. This suggests that adversarial prompt engineering techniques are an effective means of evading AI detection tools and highlights that improvements to AI detection software are needed. Using the Turnitin AI detection tool, faculty reported…

Taxonomy

Topics: Academic integrity and plagiarism

Methods: Multi-Head Attention · Attention Is All You Need · Softmax · Layer Normalization · Byte Pair Encoding · Dropout · Linear Layer · Label Smoothing · Residual Connection · Position-Wise Feed-Forward Layer