Plagiarism Detection on Electronic Text based Assignments using Vector Space Model (ICIAfS14)
MAC Jiffriya, MAC Akmal Jahan, and Roshan G. Ragel

TL;DR
This paper presents a plagiarism detection method for text-based assignments using a trigram vector space model with cosine similarity, demonstrating improved accuracy over sequence matching techniques.
Contribution
The paper introduces a novel plagiarism detection approach that combines a trigram vector space model with cosine similarity to identify intra-corpal plagiarism more accurately.
Findings
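The trigram vector space model yields better detection results than the unigram and bigram models on the labelled dataset, at the cost of more processing time.
Cosine similarity outperforms the Jaccard measure in this context.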
The proposed tool effectively minimizes student plagiarism.
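For reference, the two measures compared in the findings have the following standard definitions. Here $A$ and $B$ are assumed to be n-gram frequency vectors with components $a_i$ and $b_i$, and $S_A$ and $S_B$ are the sets of distinct n-grams in each document. The notation is illustrative, and the term weighting the paper actually uses is not stated in this summary.

```latex
\[
\operatorname{cos}(A, B) = \frac{\sum_i a_i b_i}{\sqrt{\sum_i a_i^2}\,\sqrt{\sum_i b_i^2}},
\qquad
J(A, B) = \frac{|S_A \cap S_B|}{|S_A \cup S_B|}
\]
```

In this set-based form, the Jaccard measure ignores how often an n-gram repeats, whereas cosine similarity takes the counts into account.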
Abstract
Plagiarism is the illegal use of part or all of another's work as one's own in any field, such as art, poetry, literature, cinema, research and other creative forms of study. It is an important issue in academic and research fields and is receiving growing attention in academic systems. The situation is made worse by the availability of ample resources on the web. This paper focuses on building an effective plagiarism detection tool by identifying a suitable intra-corpal plagiarism detection method for text-based assignments, comparing unigram, bigram and trigram vector space models with the cosine similarity measure. A manually evaluated, labelled dataset was tested using unigram, bigram and trigram vectors. Even though the trigram vector consumes comparatively more time, it shows better results with the labelled data. In addition, the selected trigram vector space model with cosine similarity…
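The comparison described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes word-level n-grams, lowercase whitespace tokenization and raw term-frequency weights, none of which are confirmed by this summary. The names `ngram_vector`, `cosine`, `pairwise_similarity` and the toy `corpus` are hypothetical.

```python
from collections import Counter
from itertools import combinations
import math


def ngram_vector(text, n):
    """Term-frequency vector of word n-grams.

    Assumption: lowercase, whitespace-split tokens; the paper's
    preprocessing is not described in this summary.
    """
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def pairwise_similarity(corpus, n):
    """Compare every pair of documents inside one corpus (intra-corpal)."""
    vectors = {name: ngram_vector(text, n) for name, text in corpus.items()}
    return {
        (x, y): cosine(vectors[x], vectors[y])
        for x, y in combinations(sorted(vectors), 2)
    }


# Hypothetical toy submissions: s2 changes one word of s1, s3 is unrelated.
corpus = {
    "s1": "the quick brown fox jumps over the lazy dog",
    "s2": "the quick brown fox leaps over the lazy dog",
    "s3": "plagiarism detection compares every pair of submissions",
}

# Scores for unigram (n=1), bigram (n=2) and trigram (n=3) vectors.
scores = {n: pairwise_similarity(corpus, n) for n in (1, 2, 3)}

for n, pairs in scores.items():
    for pair, value in pairs.items():
        print(n, pair, round(value, 3))
```

On this toy input, the score for the lightly edited pair drops as n grows, which suggests why higher-order n-grams are stricter about matching word order. Any decision threshold applied to these scores would be a further assumption.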
Taxonomy
Topics: Academic integrity and plagiarism · Imbalanced Data Classification Techniques · Topic Modeling
