Multi-split Optimized Bagging Ensemble Model Selection for Multi-class Educational Data Mining
MohammadNoor Injadat, Abdallah Moubayed, Ali Bou Nassif, Abdallah Shami

TL;DR
This paper introduces a multi-split optimized bagging ensemble approach for predicting student performance in educational data mining, demonstrating high accuracy across datasets and stages of course delivery.
Contribution
It proposes a systematic multi-split ensemble selection method using Gini index and p-value for optimizing machine learning models in educational data mining.
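As a rough illustration of the bagging idea behind such an ensemble, the sketch below trains several copies of a simple base learner on bootstrap resamples and combines them by majority vote on a three-class problem. It is a minimal, generic sketch and not the authors' method. The nearest-centroid base learner, the per-class (stratified) bootstrap, the toy marks, the class labels, and all function names are assumptions made for this example. The multi-split procedure and the Gini index and p-value based model selection are not shown, because the summary does not describe how they are computed.

```python
import random
from collections import Counter, defaultdict


def fit_centroids(samples):
    """Hypothetical base learner: one mean feature vector per class label."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in grouped.items()
    }


def predict_centroid(centroids, features):
    """Return the label whose centroid is nearest in squared Euclidean distance."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(centroids[label], features))
    return min(centroids, key=dist)


def stratified_bootstrap(samples, rng):
    """Resample with replacement inside each class so no class disappears.

    Assumption: plain bagging resamples the whole training set; the per-class
    variant is used here only to keep the toy example stable.
    """
    grouped = defaultdict(list)
    for sample in samples:
        grouped[sample[1]].append(sample)
    resampled = []
    for rows in grouped.values():
        resampled.extend(rng.choices(rows, k=len(rows)))
    return resampled


def fit_bagging(samples, n_learners=15, seed=0):
    """Train n_learners base learners, each on its own bootstrap resample."""
    rng = random.Random(seed)
    return [
        fit_centroids(stratified_bootstrap(samples, rng))
        for _ in range(n_learners)
    ]


def predict_bagging(ensemble, features):
    """Combine the base learners' predictions by majority vote."""
    votes = Counter(predict_centroid(c, features) for c in ensemble)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: (assignment %, quiz %) -> performance class.
TOY = [
    ((20.0, 25.0), "weak"), ((25.0, 20.0), "weak"), ((30.0, 30.0), "weak"),
    ((55.0, 60.0), "fair"), ((60.0, 55.0), "fair"), ((65.0, 65.0), "fair"),
    ((85.0, 90.0), "good"), ((90.0, 85.0), "good"), ((95.0, 95.0), "good"),
]

ensemble = fit_bagging(TOY)
print(predict_bagging(ensemble, (22.0, 24.0)))
```

In practice a library implementation of bagging with stronger base learners would likely replace this hand-rolled version. The point here is only the resample, train, and vote structure.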
Findings
High accuracy achieved in predicting student performance
Effective ensemble model selection improves prediction reliability
Method applicable at different course delivery stages
Abstract
Predicting students' academic performance has been a research area of interest in recent years, with many institutions focusing on improving student performance and education quality. The analysis and prediction of students' performance can be achieved using various data mining techniques. Moreover, such techniques allow instructors to determine possible factors that may affect the students' final marks. To that end, this work analyzes two different undergraduate datasets from two different universities. Furthermore, this work aims to predict the students' performance at two stages of course delivery (20% and 50%, respectively). This analysis allows for properly choosing the appropriate machine learning algorithms to use as well as optimizing the algorithms' parameters. Furthermore, this work adopts a systematic multi-split approach based on Gini index and p-value. This is done by…
