Analyzing Non-Textual Content Elements to Detect Academic Plagiarism
Norman Meuschke

TL;DR
This thesis introduces an approach to academic plagiarism detection that analyzes non-textual content, specifically citations, images, and mathematical content, to detect disguised forms of plagiarism that evade traditional text-based methods.
Contribution
It proposes and validates a new multi-modal plagiarism detection system that combines non-textual content analysis with traditional text similarity measures, improving detection effectiveness.
Findings
Non-textual content carries a high degree of semantic information.
Non-textual content is language-independent and resistant to concealment.
Combining non-textual and text-based methods improves detection accuracy.
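To make the third finding concrete, the sketch below combines a citation-order similarity with a simple word-overlap text similarity. It is an illustration only, not the thesis's method: the function names, the use of a longest common subsequence over citation identifiers, the Jaccard word overlap, and the equal default weighting are all assumptions chosen for brevity.

```python
# Illustrative sketch only; the names, measures, and weighting are assumptions,
# not the detection approaches evaluated in the thesis.

def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def citation_similarity(cites_a, cites_b):
    """Share of the shorter citation sequence that recurs in the same order."""
    if not cites_a or not cites_b:
        return 0.0
    return lcs_length(cites_a, cites_b) / min(len(cites_a), len(cites_b))


def text_similarity(text_a, text_b):
    """Jaccard overlap of lowercased word sets (a stand-in for a text-based method)."""
    words_a = set(text_a.lower().split())
    words_b = set(text_b.lower().split())
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def combined_score(text_a, cites_a, text_b, cites_b, weight=0.5):
    """Weighted mix of citation-based and text-based similarity (weight is hypothetical)."""
    return (weight * citation_similarity(cites_a, cites_b)
            + (1 - weight) * text_similarity(text_a, text_b))
```

In this toy setup, two passages with no words in common but the same citations in the same order still receive a nonzero combined score, which is the intuition behind pairing the two kinds of signal.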
Abstract
Identifying academic plagiarism is a pressing problem, among others, for research institutions, publishers, and funding organizations. Detection approaches proposed so far analyze lexical, syntactical, and semantic text similarity. These approaches find copied, moderately reworded, and literally translated text. However, reliably detecting disguised plagiarism, such as strong paraphrases, sense-for-sense translations, and the reuse of non-textual content and ideas, is an open research problem. The thesis addresses this problem by proposing plagiarism detection approaches that implement a different concept: analyzing non-textual content in academic documents, specifically citations, images, and mathematical content. To validate the effectiveness of the proposed detection approaches, the thesis presents five evaluations that use real cases of academic plagiarism and exploratory…
Taxonomy
Topics: Topic Modeling · Academic integrity and plagiarism · Software Engineering Research
