Mossad: Defeating Software Plagiarism Detection
Breanna Devore-McDonald, Emery D. Berger

TL;DR
This paper introduces Mossad, an automatic program transformation framework that effectively defeats popular software plagiarism detection tools, raising concerns about current detection robustness and emphasizing the need for improved methods.
Contribution
Mossad is a novel, non-deterministic framework combining genetic programming and domain knowledge to evade multiple plagiarism detectors efficiently.
Findings
Mossad defeats four major plagiarism detection tools including Moss and JPlag.
It can generate dozens of variants from a single program in minutes.
Generated code is rated as readable as authentic student submissions.
Abstract
Automatic software plagiarism detection tools are widely used in educational settings to ensure that submitted work was not copied. These tools have grown in use together with the rise in enrollments in computer science programs and the widespread availability of code on-line. Educators rely on the robustness of plagiarism detection tools; the working assumption is that the effort required to evade detection is as high as that required to actually do the assigned work. This paper shows this is not the case. It presents an entirely automatic program transformation approach, Mossad, that defeats popular software plagiarism detection tools. Mossad comprises a framework that couples techniques inspired by genetic programming with domain-specific knowledge to effectively undermine plagiarism detectors. Mossad is effective at defeating four plagiarism detectors, including Moss and JPlag.…
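The abstract describes a randomized search that repeatedly transforms a program until detectors stop flagging it. The sketch below illustrates that general idea only and is not the authors' implementation. Everything in it is an assumption made for illustration: the `similarity` function is a toy Jaccard score over token 3-grams standing in for a real detector, `INERT_STATEMENTS` is a hypothetical pool of statements presumed not to change program behavior, and the acceptance rule is plain randomized hill climbing rather than whatever search Mossad actually uses.

```python
import random

def kgram_set(lines, k=3):
    """Set of k-grams over the whitespace-token stream of a program."""
    tokens = " ".join(lines).split()
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}

def similarity(a, b, k=3):
    """Jaccard similarity of k-gram sets (toy stand-in for a detector score)."""
    ga, gb = kgram_set(a, k), kgram_set(b, k)
    if not ga and not gb:
        return 1.0
    return len(ga & gb) / len(ga | gb)

# Hypothetical pool of statements assumed to leave behavior unchanged.
INERT_STATEMENTS = ["int pad0 = 0;", "int pad1 = 1;", ";"]

def mutate(lines, rng):
    """Insert one inert statement after a randomly chosen line."""
    pos = rng.randint(1, len(lines))
    return lines[:pos] + [rng.choice(INERT_STATEMENTS)] + lines[pos:]

def search(original, threshold=0.5, max_steps=200, seed=0):
    """Mutate until the toy score drops below threshold or steps run out."""
    rng = random.Random(seed)
    variant = list(original)
    for _ in range(max_steps):
        current = similarity(original, variant)
        if current < threshold:
            break
        candidate = mutate(variant, rng)
        # Keep the candidate only if it does not look more similar.
        if similarity(original, candidate) <= current:
            variant = candidate
    return variant

if __name__ == "__main__":
    program = [
        "int total = 0;",
        "for (int i = 0; i < n; i++) {",
        "total += i;",
        "}",
        "return total;",
    ]
    result = search(program)
    print(len(result), round(similarity(program, result), 3))
```

Changing `seed` yields a different variant from the same input, which mirrors the non-determinism the summary mentions. A real system would also have to verify that each transformation preserves program semantics; this sketch simply assumes it.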
