Generative AI Usage and Exam Performance
Janik Ole Wecks, Johannes Voshaar, Benedikt Jost Plate, Jochen Zimmermann

TL;DR
This study investigates how students' use of generative AI tools such as ChatGPT relates to their exam scores. Usage correlates with lower performance, especially among high-potential students, which raises concerns for educational policy.
Contribution
It provides empirical evidence on the negative impact of GenAI tools on exam performance and explores the mechanisms behind this effect, informing educational policy debates.
Findings
GenAI users score on average 6.71 points (out of 100) lower than non-users.
The negative effect is stronger among high-potential students.
GenAI usage may hinder learning rather than enhance it.
Abstract
This study evaluates the impact of students' usage of generative artificial intelligence (GenAI) tools such as ChatGPT on their exam performance. We analyse student essays using GenAI detection systems to identify GenAI users among the cohort. Employing multivariate regression analysis, we find that students using GenAI tools score on average 6.71 (out of 100) points lower than non-users. While GenAI may offer benefits for learning and engagement, the way students actually use it correlates with diminished exam outcomes. Exploring the underlying mechanism, additional analyses show that the effect is particularly detrimental to students with high learning potential, suggesting an effect whereby GenAI tool usage hinders learning. Our findings provide important empirical evidence for the ongoing debate on the integration of GenAI in higher education and underscore the necessity for…
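To make the regression setup in the abstract concrete, here is a minimal sketch of how a usage indicator enters a multivariate regression. It does not reproduce the paper's data or model. All data are synthetic, the control variable `prior_grade` is a hypothetical stand-in for whatever controls the authors use, and the reported estimate of -6.71 is plugged in only as the simulated "true" effect so that the fitted coefficient has something to recover.

```python
import numpy as np


def fit_ols(X, y):
    """Return ordinary least squares coefficients for y = X @ beta + error."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def simulate_and_fit(n=2000, true_effect=-6.71, seed=0):
    """Simulate exam scores and regress them on a GenAI-user dummy plus one control.

    Everything here is synthetic: the intercept, the control, and the noise
    level are assumptions chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    genai_user = rng.integers(0, 2, size=n).astype(float)  # 1 = flagged as GenAI user
    prior_grade = rng.normal(0.0, 1.0, size=n)             # hypothetical control variable
    noise = rng.normal(0.0, 5.0, size=n)
    score = 60.0 + true_effect * genai_user + 4.0 * prior_grade + noise

    # Design matrix: intercept, usage dummy, control.
    X = np.column_stack([np.ones(n), genai_user, prior_grade])
    return fit_ols(X, score)


beta = simulate_and_fit()
print("intercept, GenAI-user coefficient, control coefficient:", beta)
```

The coefficient on the dummy is the average score difference between users and non-users, holding the control fixed. That is how a figure such as "6.71 points lower" is read off a regression of this kind.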
Taxonomy
Topics: Online Learning and Analytics
