An LSTM-based Plagiarism Detection via Attention Mechanism and a Population-based Approach for Pre-Training Parameters with imbalanced Classes

Seyed Vahid Moravvej; Seyed Jalaleddin Mousavirad; Mahshid Helali Moghadam; Mehrdad Saadatmand

arXiv:2110.08771·cs.LG·October 19, 2021

Open Access

TL;DR

This paper introduces an LSTM-based plagiarism detection model enhanced with an attention mechanism and a population-based approach for pre-training, improving initialization and performance in class-imbalanced scenarios.

Contribution

It proposes a novel combination of LSTM, attention, and artificial bee colony algorithms for better parameter initialization in plagiarism detection models.
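As a rough illustration of the attention component, the sketch below pools a sequence of LSTM hidden states into one context vector. The additive (tanh-based) scoring form and the names `attention_pool`, `H`, `W`, and `v` are assumptions made for this example; this listing does not state the exact attention formulation used in the paper.

```python
import numpy as np

def attention_pool(H, W, v):
    """Additive attention pooling over a sequence of hidden states.

    H: (T, d) hidden states, W: (d, a) projection, v: (a,) scoring vector.
    All three are hypothetical stand-ins, not the paper's parameters.
    """
    scores = np.tanh(H @ W) @ v            # one scalar score per time step, shape (T,)
    scores = scores - scores.max()         # shift for a numerically stable softmax
    alpha = np.exp(scores) / np.exp(scores).sum()  # attention weights, sum to 1
    context = alpha @ H                    # weighted average of hidden states, shape (d,)
    return context, alpha

rng = np.random.default_rng(0)
H = rng.normal(size=(6, 4))   # stand-in for the outputs of an LSTM over 6 tokens
W = rng.normal(size=(4, 3))
v = rng.normal(size=3)
context, alpha = attention_pool(H, W, v)
```

Because the weights are a softmax, the context vector is a convex combination of the hidden states, so time steps with higher scores contribute more to the pooled representation.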

Findings

01. The method achieves competitive performance compared to traditional approaches.

02. Population-based initialization improves convergence and detection accuracy.

03. The approach effectively handles class imbalance in plagiarism detection tasks.

Abstract

Plagiarism is one of the leading problems in academic and industrial environments; plagiarism detection aims to find similar items in a document or source code. This paper proposes an architecture based on Long Short-Term Memory (LSTM) and an attention mechanism, called LSTM-AM-ABC, boosted by a population-based approach for parameter initialization. Gradient-based optimization algorithms such as back-propagation (BP) are widely used in the literature for the learning process in LSTM, attention mechanism, and feed-forward neural network components, but they suffer from problems such as getting stuck in local optima. To tackle this problem, population-based metaheuristic (PBMH) algorithms can be used. To this end, this paper employs a PBMH algorithm, artificial bee colony (ABC), to mitigate the problem. Our proposed algorithm can find the initial values for model learning in all LSTM, attention…
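The idea of using ABC to pick starting parameters before gradient-based training can be sketched as follows. This is a simplified, generic artificial bee colony loop under stated assumptions, not the authors' implementation: the function name `abc_init`, its hyperparameters, the search bounds, and the toy quadratic loss are all hypothetical. In the paper's setting, the vector being searched would presumably be the flattened model weights and the loss would be the training loss.

```python
import numpy as np

def abc_init(loss, dim, n_sources=10, n_iters=50, limit=5, bound=1.0, seed=0):
    """Search for a low-loss starting vector with a simplified bee colony."""
    rng = np.random.default_rng(seed)
    # Each food source is one candidate parameter vector.
    sources = rng.uniform(-bound, bound, size=(n_sources, dim))
    costs = np.array([loss(s) for s in sources])
    trials = np.zeros(n_sources, dtype=int)   # failed improvement attempts per source
    j = int(np.argmin(costs))
    best, best_cost = sources[j].copy(), float(costs[j])

    def try_neighbor(i):
        # Perturb one coordinate relative to a randomly chosen other source.
        k = int(rng.choice([s for s in range(n_sources) if s != i]))
        d = int(rng.integers(dim))
        cand = sources[i].copy()
        cand[d] += rng.uniform(-1.0, 1.0) * (sources[i, d] - sources[k, d])
        cand[d] = np.clip(cand[d], -bound, bound)
        c = loss(cand)
        if c < costs[i]:                      # greedy selection: keep only improvements
            sources[i], costs[i], trials[i] = cand, c, 0
        else:
            trials[i] += 1

    for _ in range(n_iters):
        for i in range(n_sources):            # employed bees: one attempt per source
            try_neighbor(i)
        fitness = 1.0 / (1.0 + costs)         # assumes a non-negative loss
        probs = fitness / fitness.sum()
        for i in rng.choice(n_sources, size=n_sources, p=probs):
            try_neighbor(int(i))              # onlooker bees favour better sources
        j = int(np.argmin(costs))
        if costs[j] < best_cost:              # remember the best source seen so far
            best, best_cost = sources[j].copy(), float(costs[j])
        for i in np.where(trials > limit)[0]: # scout bees: restart exhausted sources
            sources[i] = rng.uniform(-bound, bound, size=dim)
            costs[i] = loss(sources[i])
            trials[i] = 0
    return best, best_cost

# Toy stand-in for a training loss; the returned vector would seed gradient training.
toy_loss = lambda w: float(np.sum((w - 0.3) ** 2))
w0, w0_cost = abc_init(toy_loss, dim=5)
```

The returned vector is only a starting point: gradient-based training such as BP would then continue from `w0` rather than from a purely random initialization.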

Taxonomy

Topics: Academic integrity and plagiarism · Imbalanced Data Classification Techniques · Machine Learning and Data Classification

Methods: Tanh Activation · Sigmoid Activation · Artificial Bee Colony · Long Short-Term Memory