Can AI Recognize Its Own Reflection? Self-Detection Performance of LLMs in Computing Education
Christopher Burger, Karmece Talley, Christina Trotter

TL;DR
This paper evaluates the ability of prominent LLMs to detect AI-generated text in computing education, revealing significant instability and susceptibility to deception that undermine their reliability for academic integrity enforcement.
Contribution
It provides an empirical assessment of LLMs' self-detection capabilities in realistic and deceptive scenarios, highlighting current limitations.
Findings
Detection accuracy drops with deceptive prompts
Models often fail to identify human-written text correctly
Deceptive prompts can completely fool detection models
Abstract
The rapid advancement of Large Language Models (LLMs) presents a significant challenge to academic integrity within computing education. As educators seek reliable detection methods, this paper evaluates the capacity of three prominent LLMs (GPT-4, Claude, and Gemini) to identify AI-generated text in computing-specific contexts. We test their performance under both standard and 'deceptive' prompt conditions, where the models were instructed to evade detection. Our findings reveal a significant instability: while default AI-generated text was easily identified, all models struggled to correctly classify human-written work (with error rates up to 32%). Furthermore, the models were highly susceptible to deceptive prompts, with Gemini's output completely fooling GPT-4. Given that simple prompt alterations significantly degrade detection efficacy, our results demonstrate that these LLMs are…
Taxonomy
Topics: Academic integrity and plagiarism · Artificial Intelligence in Healthcare and Education · Online Learning and Analytics
