The Struggle with Academic Plagiarism: Approaches based on Semantic Similarity
Tedo Vrbanec, Ana Mestrovic

TL;DR
This paper discusses the use of semantic similarity measures to improve academic plagiarism detection, addressing challenges like paraphrasing that current software struggles to identify.
Contribution
It introduces the application of semantic similarity techniques to enhance plagiarism detection, focusing on paraphrasing and obfuscation issues.
Findings
Semantic similarity measures can detect paraphrased content
Current software is effective but struggles with obfuscation
Semantic approaches offer promising improvements
Abstract
Academic plagiarism is a serious problem. Because digital information sources are practically inexhaustible, it is easier to plagiarize today than ever before. On the positive side, plagiarism detection techniques have improved and are now powerful enough to detect plagiarism attempts in education. Efficient plagiarism detection software, such as Turnitin, iThenticate, or SafeAssign, is already in use. In the introduction, we survey the software used within the Croatian academic community for plagiarism detection in universities and/or scientific journals. The question is: is this enough? Current software has proven successful; however, the problem of identifying paraphrasing or obfuscation plagiarism remains unresolved. In this paper, we report on how semantic similarity measures can be used in the plagiarism detection task.
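To make the paraphrasing problem concrete, the following minimal Python sketch contrasts a purely lexical measure (cosine similarity over word counts) with a simple embedding-based one (cosine similarity over averaged word vectors). It is an illustration only, not the authors' method: the `TOY_VECTORS` table is a hypothetical, hand-made stand-in for real pretrained word embeddings, and the function names are assumptions introduced here.

```python
import math
from collections import Counter

# Hypothetical hand-made 3-dimensional "embeddings", for illustration only.
# A real system would load pretrained vectors instead.
TOY_VECTORS = {
    "students": (1.0, 0.0, 0.0), "pupils": (0.9, 0.1, 0.0),
    "copy": (0.0, 1.0, 0.0), "duplicate": (0.1, 0.9, 0.0),
    "text": (0.0, 0.0, 1.0), "passages": (0.0, 0.1, 0.9),
}


def cosine(u, v):
    """Cosine similarity of two equal-length numeric sequences."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def lexical_similarity(a, b):
    """Cosine similarity over raw word counts (no semantics involved)."""
    count_a, count_b = Counter(a.lower().split()), Counter(b.lower().split())
    vocab = sorted(set(count_a) | set(count_b))
    return cosine([count_a[w] for w in vocab], [count_b[w] for w in vocab])


def embedding_similarity(a, b, vectors=TOY_VECTORS):
    """Cosine similarity of averaged word vectors; unknown words are skipped."""
    def mean_vector(text):
        known = [vectors[w] for w in text.lower().split() if w in vectors]
        if not known:
            return None
        dim = len(known[0])
        return [sum(v[i] for v in known) / len(known) for i in range(dim)]

    mean_a, mean_b = mean_vector(a), mean_vector(b)
    if mean_a is None or mean_b is None:
        return 0.0
    return cosine(mean_a, mean_b)


if __name__ == "__main__":
    original = "students copy text"
    paraphrase = "pupils duplicate passages"
    print("lexical:  ", lexical_similarity(original, paraphrase))
    print("embedding:", embedding_similarity(original, paraphrase))
```

The two sentences share no words, so the lexical score is zero, while the embedding-based score stays high because each word is close to its synonym in the toy vector space. Averaging word vectors is only one of many possible semantic similarity measures and is used here purely for brevity.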
Taxonomy
Topics: Topic Modeling · Academic integrity and plagiarism · Natural Language Processing Techniques
