Why Do Students Drop Out? University Dropout Prediction and Associated Factor Analysis Using Machine Learning Techniques
Sean Kim, Eliot Yoo, Samuel Kim

TL;DR
This study uses machine learning to predict whether university students will graduate or drop out, drawing on academic, demographic, socioeconomic, and macroeconomic data. Academic data contributes most to model performance, and the classifiers reach an average ROC-AUC score of 0.935.
Contribution
It trains four binary classifiers to predict graduation versus dropout and, through associated factor analysis, identifies academic data as the data type most influential on model performance.
Findings
Average ROC-AUC score of 0.935 across four binary classifiers for dropout prediction
Academic data is the data type most influential on model performance
Dropout prediction models can effectively identify at-risk students
Abstract
Graduation and dropout rates have always been a serious consideration for educational institutions and students. High dropout rates negatively impact both the lives of individual students and institutions. To address this problem, this study examined university dropout prediction using academic, demographic, socioeconomic, and macroeconomic data types. Additionally, we performed associated factor analysis to analyze which type of data would be most influential on the performance of machine learning models in predicting graduation and dropout status. These features were used to train four binary classifiers to determine if students would graduate or drop out. The overall performance of the classifiers in predicting dropout status had an average ROC-AUC score of 0.935. The data type most influential to the model performance was found to be academic data, with the average ROC-AUC score…
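The headline metric above is an ROC-AUC score averaged over several classifiers. As a minimal sketch of how such a figure can be computed, the Python below implements ROC-AUC as the probability that a randomly chosen dropout case is scored higher than a randomly chosen graduate, then averages it across models. The function names, the model names, and the toy labels and scores are all hypothetical and chosen only for illustration; they are not the paper's data, classifiers, or code.

```python
def roc_auc(labels, scores):
    """ROC-AUC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive has the higher score; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_roc_auc(labels, scores_by_model):
    """Return per-model ROC-AUC scores and their unweighted mean."""
    aucs = {name: roc_auc(labels, s) for name, s in scores_by_model.items()}
    return aucs, sum(aucs.values()) / len(aucs)


# Toy data (assumption, not from the paper): 1 = dropout, 0 = graduate.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = {
    "model_a": [0.9, 0.8, 0.4, 0.5, 0.2, 0.1],  # one dropout ranked below a graduate
    "model_b": [0.9, 0.7, 0.6, 0.3, 0.2, 0.1],  # perfect ranking
}

per_model, average = mean_roc_auc(toy_labels, toy_scores)
print(per_model, average)
```

The pairwise form is quadratic in the number of students, which is fine for illustration; a library routine such as scikit-learn's roc_auc_score would be the usual choice on real data. The same averaging could be repeated with models trained on one data type at a time to compare their influence, though how the authors structured that comparison is not stated in the text available here.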
Taxonomy
Topics: Online Learning and Analytics
Methods: Dropout
