The Struggle with Academic Plagiarism: Approaches based on Semantic Similarity

Tedo Vrbanec; Ana Mestrovic

arXiv:2106.04404·cs.IR·June 9, 2021·1 cites

Open Access

TL;DR

This paper discusses the use of semantic similarity measures to improve academic plagiarism detection, addressing challenges like paraphrasing that current software struggles to identify.

Contribution

It introduces the application of semantic similarity techniques to enhance plagiarism detection, focusing on paraphrasing and obfuscation issues.

Findings

01 Semantic similarity measures can detect paraphrased content

02 Current software is effective but struggles with obfuscation

03 Semantic approaches offer promising improvements

Abstract

Academic plagiarism is a serious problem today. With inexhaustible sources of digital information available, it is easier than ever to plagiarize. Fortunately, plagiarism detection techniques have improved and are powerful enough to detect attempts at plagiarism in education. Efficient plagiarism detection software such as Turnitin, iThenticate, and SafeAssign is now in use. In the introduction we survey the software used within the Croatian academic community for plagiarism detection in universities and/or scientific journals. The question is whether this is enough. Current software has proven successful; however, the problem of identifying paraphrasing or obfuscation plagiarism remains unresolved. In this paper we report on how semantic similarity measures can be used in the plagiarism detection task.
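To make the paraphrasing problem concrete, the sketch below contrasts a purely lexical similarity score with a crude concept-level one. It is an illustration only, not the method reported in the paper: the `CONCEPTS` map, the function names, and the example sentences are all hypothetical, and the hand-made synonym map merely stands in for a real semantic resource such as word embeddings or a thesaurus.

```python
import math
import re
from collections import Counter

# Hypothetical, hand-made synonym map (an assumption for illustration only).
# A real system would use a proper semantic resource instead.
CONCEPTS = {
    "car": "vehicle", "automobile": "vehicle",
    "quick": "fast", "rapid": "fast",
    "bought": "purchase", "purchased": "purchase",
}

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def lexical_similarity(x, y):
    """Compare texts by their surface words only."""
    return cosine(Counter(tokenize(x)), Counter(tokenize(y)))

def concept_similarity(x, y):
    """Compare texts after mapping each word to a shared concept label."""
    def to_concepts(text):
        return Counter(CONCEPTS.get(t, t) for t in tokenize(text))
    return cosine(to_concepts(x), to_concepts(y))

original = "She bought a quick car"
paraphrase = "She purchased a rapid automobile"

lexical = lexical_similarity(original, paraphrase)
conceptual = concept_similarity(original, paraphrase)
print(f"lexical: {lexical:.2f}, concept-level: {conceptual:.2f}")
```

On this toy pair the word-overlap score stays low because only "she" and "a" are shared, while the concept-level score is close to 1. That gap is the kind of paraphrase that string matching tends to miss and that semantic similarity measures aim to capture.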

Taxonomy

Topics: Topic Modeling · Academic integrity and plagiarism · Natural Language Processing Techniques