Methods for Detecting Paraphrase Plagiarism
Victor Thompson

TL;DR
This paper presents new methods for detecting paraphrase plagiarism by addressing lexical, syntactic, and semantic alterations, and combines them into a model that outperforms both a baseline and previously reported approaches.
Contribution
It introduces a combined paraphrase detection model that effectively identifies various paraphrasing phenomena, improving detection accuracy over existing methods.
Findings
Significant performance improvement with combined methods
Outperforms the baseline Greedy String Tiling approach
Achieves better results than previous studies
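For readers unfamiliar with the baseline named above, the following is a minimal sketch of a simplified Greedy String Tiling matcher over word tokens. It is not the paper's implementation: the function names, the `min_match` default, and the Dice-style similarity score are assumptions made for illustration, and the quadratic scan omits the hashing speed-ups used in practical versions of the algorithm.

```python
def greedy_string_tiling(a, b, min_match=2):
    """Return tiles (start_a, start_b, length) of matched, non-overlapping token runs.

    Simplified sketch: each round finds the longest runs of unmarked tokens
    shared by a and b, marks them as tiles, and repeats until no run of at
    least min_match tokens remains. min_match is an illustrative parameter.
    """
    marked_a = [False] * len(a)
    marked_b = [False] * len(b)
    tiles = []
    while True:
        max_len = min_match
        matches = []
        for i in range(len(a)):
            for j in range(len(b)):
                k = 0
                # Extend the match while tokens agree and are not yet tiled.
                while (i + k < len(a) and j + k < len(b)
                       and a[i + k] == b[j + k]
                       and not marked_a[i + k] and not marked_b[j + k]):
                    k += 1
                if k > max_len:
                    matches = [(i, j, k)]
                    max_len = k
                elif k == max_len:
                    matches.append((i, j, k))
        if not matches:
            break
        for i, j, k in matches:
            # Skip a match that overlaps a tile placed earlier in this round.
            if any(marked_a[i:i + k]) or any(marked_b[j:j + k]):
                continue
            for off in range(k):
                marked_a[i + off] = True
                marked_b[j + off] = True
            tiles.append((i, j, k))
    return tiles


def tiling_similarity(a, b, min_match=2):
    """Dice-style score: share of tokens in both texts covered by tiles."""
    if not a and not b:
        return 0.0
    covered = sum(length for _, _, length in greedy_string_tiling(a, b, min_match))
    return 2 * covered / (len(a) + len(b))


source = "the quick brown fox jumps over the lazy dog".split()
suspect = "over the lazy dog the quick brown fox leaps".split()
print(greedy_string_tiling(source, suspect))
print(round(tiling_similarity(source, suspect), 3))
```

Because tiles are exact token runs, this kind of matcher copes with reordered phrases but scores a passage lower as soon as words are substituted, which is the weakness the abstract attributes to co-occurrence-based systems.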
Abstract
Paraphrase plagiarism is one of the difficult challenges facing plagiarism detection systems. Paraphrasing occurs when texts are lexically or syntactically altered to look different while retaining their original meaning. Most plagiarism detection systems (many of which are commercial) are designed to detect word co-occurrences and light modifications, but cannot detect severe semantic and structural alterations such as those seen in many academic documents. Hence, many cases of paraphrase plagiarism go undetected. In this paper, we approach the problem of paraphrase plagiarism by proposing methods for detecting the most common techniques (phenomena) used in paraphrasing texts (namely lexical substitution, insertion/deletion, and word and phrase reordering) and combining the methods into a paraphrase detection model. We evaluated our proposed methods and model on collections…
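To make the paraphrase phenomena listed in the abstract concrete, here is a small sketch that labels the token-level differences between two sentences as substitutions, insertions, or deletions using Python's standard `difflib`. It only illustrates the phenomena and is not the detection method proposed in the paper; the function name `label_edits` and the output layout are assumptions, and reordering is not handled.

```python
from difflib import SequenceMatcher


def label_edits(source, suspect):
    """Group token-level differences by the kind of alteration (illustrative only)."""
    ops = {"substitution": [], "insertion": [], "deletion": []}
    matcher = SequenceMatcher(None, source, suspect, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace":
            # Tokens in the source were swapped for different tokens.
            ops["substitution"].append((source[i1:i2], suspect[j1:j2]))
        elif tag == "insert":
            # Tokens appear only in the suspect text.
            ops["insertion"].append(suspect[j1:j2])
        elif tag == "delete":
            # Tokens appear only in the source text.
            ops["deletion"].append(source[i1:i2])
    return ops


source = "the cat sat on the mat".split()
suspect = "the cat rested on the mat".split()
print(label_edits(source, suspect))
```

A surface alignment like this shows where two texts differ but not whether the replacement words mean the same thing, so judging a substitution as a paraphrase would still need a lexical or semantic resource.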
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
