Methods for Detecting Paraphrase Plagiarism

Victor Thompson

arXiv:1712.10309 · cs.IR · January 1, 2018 · 1 citation

TL;DR

This paper presents new methods for detecting paraphrase plagiarism that address lexical, syntactic, and semantic alterations, and combines them into a model that outperforms both the baseline and previous approaches.

Contribution

It introduces a combined paraphrase detection model that effectively identifies various paraphrasing phenomena, improving detection accuracy over existing methods.

Findings

01 Significant performance improvement with combined methods

02 Outperforms the baseline Greedy String Tiling approach

03 Achieves better results than previous studies

Abstract

Paraphrase plagiarism is one of the difficult challenges facing plagiarism detection systems. Paraphrasing occurs when texts are lexically or syntactically altered to look different while retaining their original meaning. Most plagiarism detection systems (many of them commercial) are designed to detect word co-occurrences and light modifications, but cannot detect severe semantic and structural alterations such as those seen in many academic documents. Hence, many cases of paraphrase plagiarism go undetected. In this paper, we approach the problem of paraphrase plagiarism by proposing methods for detecting the most common techniques (phenomena) used in paraphrasing texts (namely lexical substitution, insertion/deletion, and word and phrase reordering), and combine the methods into a paraphrase detection model. We evaluated our proposed methods and model on collections…
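For orientation, the baseline named in the findings, Greedy String Tiling, can be sketched in a few lines. This is a generic textbook-style version without the usual hashing speed-up, not the paper's proposed methods and not necessarily the exact baseline configuration the author used. The function names, the `min_match` parameter, and the similarity formula (twice the covered tokens divided by the total token count) are assumptions made for illustration.

```python
def greedy_string_tiling(a, b, min_match=2):
    """Return tiles (i, j, length) such that a[i:i+length] == b[j:j+length].

    Illustrative sketch: `min_match` (hypothetical name) is the shortest
    common run of tokens that is accepted as a tile.
    """
    marked_a = [False] * len(a)
    marked_b = [False] * len(b)
    tiles = []
    while True:
        max_len = min_match
        matches = []
        # Find all longest common runs over tokens not yet covered by a tile.
        for i in range(len(a)):
            if marked_a[i]:
                continue
            for j in range(len(b)):
                if marked_b[j]:
                    continue
                k = 0
                while (i + k < len(a) and j + k < len(b)
                       and a[i + k] == b[j + k]
                       and not marked_a[i + k] and not marked_b[j + k]):
                    k += 1
                if k > max_len:
                    matches = [(i, j, k)]
                    max_len = k
                elif k == max_len:
                    matches.append((i, j, k))
        if not matches:
            break
        # Turn the longest runs into tiles, skipping any run that overlaps
        # a tile already placed earlier in this round.
        for i, j, k in matches:
            if any(marked_a[i:i + k]) or any(marked_b[j:j + k]):
                continue
            for t in range(k):
                marked_a[i + t] = True
                marked_b[j + t] = True
            tiles.append((i, j, k))
    return tiles


def tiling_similarity(a, b, min_match=2):
    """Share of tokens covered by tiles: 2 * covered / (len(a) + len(b))."""
    total = len(a) + len(b)
    if total == 0:
        return 0.0
    covered = sum(length for _, _, length in greedy_string_tiling(a, b, min_match))
    return 2 * covered / total


source = "the quick brown fox jumps over the lazy dog".split()
suspect = "over the lazy dog the quick brown fox leaps".split()
print(greedy_string_tiling(source, suspect))
print(tiling_similarity(source, suspect))
```

In this example the two reordered phrases are still recovered as tiles, while the single substituted word ("jumps" versus "leaps") is left uncovered. Heavier lexical substitution would break the runs into pieces shorter than `min_match`, which is the kind of alteration the abstract says such matching fails to detect.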

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification