Web Based Cross Language Plagiarism Detection

Chow Kok Kent; Naomie Salim

arXiv:0912.3959·cs.OH·December 22, 2009·2 cites


Open Access

TL;DR

This paper presents a web-based system for detecting cross-language plagiarism, specifically translation plagiarism, by combining translation, text preprocessing, fingerprint matching, and information retrieval techniques.

Contribution

It introduces a novel system that integrates Google Translate, Google Search, and fingerprint matching for effective translation plagiarism detection.

Findings

01

Uses 4-gram fingerprint matching for comparison

02

Translates Malay to English for the detection process

03

Achieves effective discrimination between similar and dissimilar texts

Abstract

As the Internet helps us cross language and cultural borders by providing different types of translation tools, cross-language plagiarism, also known as translation plagiarism, is bound to arise. In academic work especially, this issue will definitely affect students' work, including the quality of their assignments and papers. In this paper, we propose a new approach to detecting cross-language plagiarism. Our web-based cross-language plagiarism detection system is specially tuned to detect translation plagiarism by implementing different techniques and tools to assist the detection process. The Google Translate API is used as our translation tool, and the Google Search API is used in our information retrieval process. Our system also integrates fingerprint matching, which is a widely used plagiarism detection technique. In general, our proposed…
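
The fingerprint matching step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes word-level 4-grams, a truncated SHA-1 hash as the fingerprint, and Jaccard overlap as the similarity score, none of which are specified in the text above. The function names `tokenize`, `fingerprints`, and `similarity` are hypothetical, and the sketch assumes the suspicious document has already been translated into English.

```python
import hashlib
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def fingerprints(text, n=4):
    """Return the set of hashed word n-grams for a text.

    Assumption: word-level n-grams hashed with truncated SHA-1; the paper
    summary only says "4-gram fingerprint matching".
    """
    tokens = tokenize(text)
    grams = (" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return {hashlib.sha1(g.encode("utf-8")).hexdigest()[:16] for g in grams}


def similarity(text_a, text_b, n=4):
    """Jaccard overlap of the two fingerprint sets, in the range [0, 1].

    Assumption: Jaccard is used here for illustration only.
    """
    fp_a, fp_b = fingerprints(text_a, n), fingerprints(text_b, n)
    if not fp_a or not fp_b:
        # Texts shorter than n words produce no fingerprints.
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


if __name__ == "__main__":
    translated = "the quick brown fox jumps over the lazy dog"
    candidate = "the quick brown fox jumps over the sleepy cat"
    print(similarity(translated, candidate))
```

In a pipeline like the one described, `translated` would stand for the output of the translation step and `candidate` for a document returned by the search step; a higher score indicates more shared 4-grams.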

Taxonomy

Topics: Imbalanced Data Classification Techniques · Academic integrity and plagiarism · Topic Modeling