Impeding LLM-assisted Cheating in Introductory Programming Assignments via Adversarial Perturbation
Saiful Islam Salim, Rubin Yuchan Yang, Alexander Cooper, Suryashree Ray, Saumya Debray, Sazzadur Rahaman

TL;DR
This paper explores methods to hinder large language models from effectively assisting in cheating on introductory programming assignments by using adversarial perturbations, aiming to preserve academic integrity.
Contribution
It introduces adversarial perturbation techniques to significantly reduce LLMs' ability to generate correct code for educational tasks.
Findings
In the user study, the perturbations in combination reduced the average correctness score by 77%.
Detectability of perturbations influences their effectiveness.
Baseline performance of 5 LLMs on programming problems was established.
Abstract
While large language model (LLM)-based programming assistants such as Copilot and ChatGPT can help improve the productivity of professional software developers, they can also facilitate cheating in introductory computer programming courses. Assuming instructors have limited control over the industrial-strength models, this paper investigates the baseline performance of 5 widely used LLMs on a collection of introductory programming problems, examines adversarial perturbations to degrade their performance, and describes the results of a user study aimed at understanding the efficacy of such perturbations in hindering actual code generation for introductory programming assignments. The user study suggests that i) the perturbations in combination reduced the average correctness score by 77%, and ii) the drop in correctness caused by these perturbations depended on how detectable they were.
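To make the idea of perturbing an assignment prompt concrete, here is a minimal sketch of one possible perturbation: swapping a few letters in a key word for visually similar Unicode characters. The abstract does not say which perturbations the authors used, so the technique, the `HOMOGLYPHS` table, and the `perturb` function are all assumptions made for illustration, not the paper's method.

```python
# Hypothetical illustration only: the abstract does not specify the
# perturbations studied. This swaps selected ASCII letters for Cyrillic
# look-alikes inside chosen target words of a prompt.
HOMOGLYPHS = {
    "a": "\u0430",  # Cyrillic small a
    "e": "\u0435",  # Cyrillic small ie
    "o": "\u043e",  # Cyrillic small o
    "c": "\u0441",  # Cyrillic small es
    "p": "\u0440",  # Cyrillic small er
}


def perturb(prompt, targets):
    """Return the prompt with look-alike letters substituted in target words."""
    out = []
    for word in prompt.split(" "):
        # Compare without trailing punctuation so "median." still matches.
        if word.lower().strip(".,;:") in targets:
            word = "".join(HOMOGLYPHS.get(ch, ch) for ch in word)
        out.append(word)
    return " ".join(out)


original = "Write a function that returns the median of a list."
perturbed = perturb(original, {"median"})
```

The perturbed prompt looks nearly identical to a human reader but no longer contains the ASCII string `median`. This also hints at why detectability matters: a student who notices and reverts the substitution recovers the original prompt.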
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Academic integrity and plagiarism
