Detecting Plagiarism based on the Creation Process
Johannes Schneider, Avi Bernstein, Jan Vom Brocke, Kostadin Damevski, David C. Shepherd

TL;DR
This paper introduces a novel plagiarism detection method that analyzes authors' interaction logs with software, comparing command usage histograms to identify suspicious similarities or deviations indicative of plagiarism.
Contribution
It proposes a new approach that considers creation process logs rather than final outputs, enabling detection of plagiarism in both unique and identical task scenarios.
Findings
Effective in detecting plagiarism using interaction logs
Works well with programming assignments from over sixty students
Supports detection in both unique and identical task contexts
Abstract
All methodologies for detecting plagiarism to date have focused on the final digital "outcome", such as a document or source code. Our novel approach takes the creation process into account using logged events collected by special software or by the macro recorders found in most office applications. We look at an author's interaction logs with the software used to create the work. Detection relies on comparing the histograms of multiple logs' command use. A work is classified as plagiarism if its log deviates too much from logs of "honestly created" works or if its log is too similar to another log. The technique supports the detection of plagiarism for digital outcomes that stem from unique tasks, such as theses, and equal tasks, such as assignments for which the same problem sets are solved by multiple students. Focusing on the latter case, we evaluate this approach using…
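The two decision rules in the abstract (a log that is too similar to another log, or one that deviates too much from honestly created logs) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify a distance measure or thresholds, so the Euclidean distance over relative command frequencies, the threshold values `too_similar` and `too_different`, and all function names here are assumptions chosen for illustration.

```python
from collections import Counter
from math import sqrt


def histogram(log):
    """Relative frequency of each command in one interaction log."""
    counts = Counter(log)
    total = sum(counts.values())
    return {cmd: n / total for cmd, n in counts.items()} if total else {}


def distance(h1, h2):
    """Euclidean distance between two command histograms (an assumed metric)."""
    commands = set(h1) | set(h2)
    return sqrt(sum((h1.get(c, 0.0) - h2.get(c, 0.0)) ** 2 for c in commands))


def classify(log, honest_logs, other_logs, too_similar=0.05, too_different=0.5):
    """Flag a log using the two rules; both thresholds are hypothetical values."""
    h = histogram(log)
    # Rule 1: the log is nearly identical to some other submitted log.
    if any(distance(h, histogram(o)) < too_similar for o in other_logs):
        return "suspicious: too similar to another log"
    # Rule 2: the log is far, on average, from logs of honestly created works.
    if honest_logs:
        mean_dist = sum(distance(h, histogram(x)) for x in honest_logs) / len(honest_logs)
        if mean_dist > too_different:
            return "suspicious: deviates from honest logs"
    return "no flag"


# Toy logs with made-up command names.
honest = [
    ["type", "type", "save", "compile"],
    ["type", "compile", "type", "save"],
]
print(classify(["paste", "save"], honest, []))
print(classify(honest[0], honest, [honest[1]]))
```

In this toy setup, a log dominated by a paste command lands far from the honest histograms, while a log with the same command mix as another submission trips the similarity rule. A real system would need thresholds calibrated on actual logs.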
