Experiences with Remote Examination Formats in Light of GPT-4
Felix Dobslaw, Peter Bergh

TL;DR
This study evaluates the viability of open-book and oral remote exams in the era of GPT-4, analyzing their effectiveness, workload, and impact on grade distribution in a Software Engineering program.
Contribution
It provides an empirical comparison of exam formats before and after GPT-4, highlighting the vulnerability of the current open-book exams to GPT-4 and the challenges of designing GPT-proof assessments.
Findings
Open-book exams are not GPT-4 proof.
Grade distributions are similar across formats.
Open-book exams have higher throughput but also higher fail rates.
Abstract
Sudden access to the rapidly improving large language model GPT by OpenAI forces educational institutions worldwide to revisit their exam procedures. In the pre-GPT era, we successfully applied oral and open-book home exams for two courses in the third year of our predominantly remote Software Engineering BSc program. We ask in this paper whether our current open-book exams are still viable or whether a move back to a legally compliant but less scalable oral exam is the only workable alternative. We further compare work-effort estimates between oral and open-book exams and report on differences in throughput and grade distribution over eight years to better understand the impact of examination format on the outcome. Examining GPT-4 on the most recent open-book exams showed that our current Artificial Intelligence and Reactive Programming exams are not GPT-4 proof. Three potential…
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education
