Plagiarism Detection using ROUGE and WordNet
Chien-Ying Chen, Jen-Yuan Yeh, Hao-Ren Ke

TL;DR
This paper proposes a novel plagiarism detection method combining ROUGE metrics and WordNet to improve detection of modified and paraphrased content, addressing limitations of traditional fingerprinting approaches.
Contribution
It introduces a hybrid approach using ROUGE and WordNet, enhancing detection of various types of text modifications in plagiarism cases.
Findings
ROUGE-based metrics detect verbatim and sentence modifications
WordNet improves detection of word substitutions
The combined method handles simple text additions or deletions effectively
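As a rough illustration of the word-substitution finding, the sketch below maps each word to a representative of its synonym set before measuring overlap. The thesaurus `TOY_SYNSETS`, the helper `canonical`, and the score `unigram_recall` are hypothetical stand-ins invented for this example; the paper uses WordNet itself, and its exact matching procedure is not reproduced here.

```python
from collections import Counter

# Toy thesaurus standing in for WordNet synonym sets (hypothetical data,
# not taken from the paper or from WordNet).
TOY_SYNSETS = [
    {"big", "large", "huge"},
    {"car", "automobile"},
    {"buy", "purchase"},
]


def canonical(word, synsets=TOY_SYNSETS):
    """Map a word to a fixed representative of its synonym set, or to itself."""
    w = word.lower()
    for synset in synsets:
        if w in synset:
            return min(synset)  # alphabetically first member as representative
    return w


def unigram_recall(suspect, source, normalize=False):
    """Fraction of source tokens that also occur in the suspect text."""
    prep = canonical if normalize else str.lower
    src = Counter(prep(t) for t in source.split())
    sus = Counter(prep(t) for t in suspect.split())
    total = sum(src.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, sus[tok]) for tok, count in src.items())
    return overlap / total


source = "he wants to buy a large car"
suspect = "he wants to purchase a big automobile"
plain = unigram_recall(suspect, source)                     # exact matching only
with_synonyms = unigram_recall(suspect, source, normalize=True)
```

In this toy case the synonym-aware score is higher than the exact-match score, because the three substituted words collapse onto the same representatives.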
Abstract
With the arrival of the digital era and the Internet, the lack of information control provides an incentive for people to freely use any content available to them. Plagiarism occurs when users fail to credit the original owner of the content they refer to, and such behavior violates intellectual property rights. Two main approaches to plagiarism detection are fingerprinting and term occurrence; however, one common weakness shared by both approaches, especially fingerprinting, is the inability to detect plagiarism of modified text. This study proposes adopting ROUGE and WordNet for plagiarism detection. The former includes n-gram co-occurrence statistics, skip-bigram, and longest common subsequence (LCS), while the latter acts as a thesaurus and provides semantic information. N-gram co-occurrence statistics can detect verbatim copying and certain sentence modifications, skip-bigram and LCS are…
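The three ROUGE components named in the abstract can be sketched as follows. This is a simplified, recall-only illustration over whitespace tokens, not the official ROUGE toolkit and not the authors' implementation; the function names (`rouge_n`, `rouge_s`, `rouge_l`) and the example sentences are assumptions made for this sketch.

```python
from collections import Counter
from itertools import combinations


def ngrams(tokens, n):
    """Contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def overlap_recall(suspect_units, source_units):
    """Clipped count of shared units divided by the number of source units."""
    src, sus = Counter(source_units), Counter(suspect_units)
    total = sum(src.values())
    if total == 0:
        return 0.0
    return sum(min(count, sus[u]) for u, count in src.items()) / total


def rouge_n(suspect, source, n=2):
    """N-gram co-occurrence recall."""
    return overlap_recall(ngrams(suspect, n), ngrams(source, n))


def rouge_s(suspect, source):
    """Skip-bigram recall: ordered word pairs with any gap between them."""
    return overlap_recall(list(combinations(suspect, 2)),
                          list(combinations(source, 2)))


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, 1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(suspect, source):
    """LCS-based recall."""
    return lcs_length(suspect, source) / len(source) if source else 0.0


source = "the cat sat on the mat".split()
suspect = "the cat quietly sat on the mat".split()  # one inserted word
scores = (rouge_n(suspect, source), rouge_s(suspect, source), rouge_l(suspect, source))
```

With one word inserted into the source sentence, the bigram score drops because a contiguous pair is broken, while the skip-bigram and LCS scores in this sketch stay at 1.0, since both tolerate gaps between matched words.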
Taxonomy
Topics: Topic Modeling · Authorship Attribution and Profiling · Academic integrity and plagiarism
