BERT-Enhanced Retrieval Tool for Homework Plagiarism Detection System
Jiarong Xian, Jibao Yuan, Peiwei Zheng, Dexian Chen, Nie yuntao

TL;DR
This paper introduces a BERT-enhanced retrieval system for homework plagiarism detection, utilizing GPT-3.5 generated datasets and achieving high accuracy, precision, recall, and F1 scores in identifying plagiarized texts.
Contribution
It presents a novel dataset generation method using GPT-3.5 and a high-efficiency plagiarism detection approach based on BERT and Faiss, improving detection performance.
Findings
Achieved over 98.8% accuracy and F1 score in plagiarism detection.
Generated a plagiarism detection dataset of 32,927 text pairs with GPT-3.5, covering a wide range of plagiarism methods.
Developed a user-friendly demo platform for plagiarism analysis.
Abstract
Text plagiarism detection is a common natural language processing task that aims to determine whether a given text plagiarizes or copies from other texts. In existing research, detecting high-level plagiarism remains a challenge due to the lack of high-quality datasets. In this paper, we propose a plagiarized-text generation method based on GPT-3.5, which produces a plagiarism detection dataset of 32,927 text pairs covering a wide range of plagiarism methods, helping to fill this gap. We also propose a plagiarism identification method that combines Faiss with BERT for high efficiency and high accuracy. Our experiments show that this model outperforms other models on several metrics, reaching 98.86%, 98.90%, 98.86%, and 0.9888 for Accuracy, Precision, Recall, and F1 Score, respectively. Finally, we also provide a user-friendly…
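The reported F1 score can be checked against the reported precision and recall, since F1 is the harmonic mean of the two. The sketch below is only a consistency check on the abstract's numbers. It assumes a single precision/recall pair rather than values averaged per class, and the function and variable names are illustrative, not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract (percentages converted to fractions).
reported_precision = 0.9890
reported_recall = 0.9886

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.4f}")
```

Rounded to four decimal places, this gives 0.9888, which matches the F1 score stated in the abstract.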
Taxonomy
Topics: Academic integrity and plagiarism
Methods: Attention Is All You Need · Adam · Cosine Annealing · Byte Pair Encoding · Softmax · Linear Layer · Dense Connections
