Academic Source Code Plagiarism Detection by Measuring Program Behavioural Similarity
Hayden Cheers, Yuqing Lin, Shamus P. Smith

TL;DR
BPlag is a behavioural approach to source code plagiarism detection. It uses symbolic execution and graph comparison to remain robust and accurate when transformations are applied to hide plagiarised content.
Contribution
The paper introduces BPlag, a new plagiarism detection method that analyzes program behavior with symbolic execution and graph similarity, improving robustness and accuracy over existing tools.
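To illustrate why comparing behaviour graphs can survive surface-level disguises, here is a minimal toy sketch. It is not BPlag's algorithm and performs no symbolic execution: the graphs are written by hand, and the similarity measure (a multiset Jaccard overlap of labelled edges) is an assumption chosen for brevity. The names `edge_profile` and `similarity` and the node labels are all hypothetical.

```python
from collections import Counter


def edge_profile(graph):
    """Multiset of (source label, target label) pairs; node ids are discarded."""
    labels, edges = graph
    return Counter((labels[src], labels[dst]) for src, dst in edges)


def similarity(graph_a, graph_b):
    """Multiset Jaccard overlap of the two edge profiles, in [0, 1]."""
    a, b = edge_profile(graph_a), edge_profile(graph_b)
    union = sum((a | b).values())
    if union == 0:
        return 1.0  # two empty graphs are treated as identical
    return sum((a & b).values()) / union


# Hand-written graph for a program like: total = x + y; print(total * 2)
original = (
    {"n1": "input", "n2": "input", "n3": "add",
     "n4": "const", "n5": "mul", "n6": "output"},
    [("n1", "n3"), ("n2", "n3"), ("n3", "n5"), ("n4", "n5"), ("n5", "n6")],
)

# Same behaviour after renaming identifiers and reordering statements:
# node ids and edge order change, operation labels and dependencies do not.
disguised = (
    {"a": "const", "b": "input", "c": "input",
     "d": "add", "e": "mul", "f": "output"},
    [("a", "e"), ("d", "e"), ("e", "f"), ("b", "d"), ("c", "d")],
)

# A different program: read one value, negate it, print it.
unrelated = (
    {"p": "input", "q": "neg", "r": "output"},
    [("p", "q"), ("q", "r")],
)

print(similarity(original, disguised))
print(similarity(original, unrelated))
```

Because the comparison ignores node identifiers and edge order, renaming and reordering leave the score unchanged, whereas a token-based comparison of the source text would be affected. A real behavioural detector would need to derive such graphs automatically and tolerate partial matches.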
Findings
BPlag outperforms existing tools in robustness to hiding transformations.
BPlag achieves higher accuracy in detecting plagiarized code.
BPlag is less efficient than existing tools but more reliable.
Abstract
Source code plagiarism is a long-standing issue in tertiary computer science education. Many source code plagiarism detection tools have been proposed to aid in the detection of source code plagiarism. However, existing detection tools are not robust to pervasive plagiarism-hiding transformations, and as a result can be inaccurate in the detection of plagiarised source code. This article presents BPlag, a behavioural approach to source code plagiarism detection. BPlag is designed to be both robust to pervasive plagiarism-hiding transformations and accurate in the detection of plagiarised source code. Greater robustness and accuracy are afforded by analysing the behaviour of a program, as behaviour is perceived to be the aspect of a program least susceptible to plagiarism-hiding transformations. BPlag applies symbolic execution to analyse execution behaviour and represent a…
