Web Based Cross Language Plagiarism Detection
Chow Kok Kent, Naomie Salim

TL;DR
This paper presents a web-based system for detecting cross-language plagiarism, specifically translation plagiarism, by combining translation, text preprocessing, fingerprint matching, and information retrieval techniques.
Contribution
It introduces a novel system that integrates Google Translate, Google Search, and fingerprint matching for effective translation plagiarism detection.
Findings
Uses 4-gram fingerprint matching for comparison
Translates Malay to English for detection process
Achieves effective discrimination between similar and dissimilar texts
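The 4-gram fingerprint comparison listed above can be illustrated with a minimal sketch. The summary does not say whether the n-grams are word-level or character-level, how fingerprints are selected, or which similarity measure is used. The word-level 4-grams, the Jaccard score, and the function names `word_ngrams` and `fingerprint_similarity` below are therefore assumptions made for illustration, not the authors' implementation.

```python
import re


def word_ngrams(text, n=4):
    """Return the set of word-level n-grams in `text` (assumed fingerprint unit)."""
    # Simple stand-in for preprocessing: lowercase, keep alphanumeric tokens only.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def fingerprint_similarity(suspect, source, n=4):
    """Jaccard overlap of the two n-gram sets (an assumed similarity measure)."""
    a = word_ngrams(suspect, n)
    b = word_ngrams(source, n)
    if not a or not b:
        # Texts shorter than n words produce no fingerprints.
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    translated = "the internet helps us cross language and cultural borders"
    candidate = "the internet helps us cross language barriers every day"
    print(fingerprint_similarity(translated, candidate))
```

In a pipeline like the one the paper describes, the first argument would be the suspect document after translation to English, and the second would be a candidate source retrieved by the search step. A higher score means more shared 4-grams.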
Abstract
As the Internet helps us cross language and cultural borders by providing different types of translation tools, cross-language plagiarism, also known as translation plagiarism, is bound to arise. Especially in academic work, such an issue will definitely affect students' work, including the quality of their assignments and papers. In this paper, we propose a new approach to detecting cross-language plagiarism. Our web-based cross-language plagiarism detection system is specially tuned to detect translation plagiarism by implementing different techniques and tools to assist the detection process. The Google Translate API is used as our translation tool, and the Google Search API is used in our information retrieval process. Our system is also integrated with the fingerprint matching technique, which is a widely used plagiarism detection technique. In general, our proposed…
Taxonomy
Topics: Imbalanced Data Classification Techniques · Academic integrity and plagiarism · Topic Modeling
