Early Detection of At-Risk Students Using Machine Learning

Azucena L. Jimenez Martinez; Kanika Sood; Rakeshkumar Mahto

arXiv:2412.09483·cs.LG·July 16, 2025

Open Access

TL;DR

This study explores machine learning techniques for the early identification of at-risk students from diverse data sources, with the aim of improving retention and student success at a university.

Contribution

The paper introduces a multi-data approach and compares several machine learning models for predicting at-risk students, reporting Naive Bayes as the most effective.

Findings

01. Naive Bayes outperforms other models in prediction accuracy.

02. All models provide acceptable results for identifying at-risk students.

03. Critical periods of vulnerability during the semester are identified.

Abstract

This research presents preliminary work on identifying at-risk students using supervised machine learning and three distinct data categories: engagement, demographics, and performance data, collected in Fall 2023 from Canvas and the California State University, Fullerton dashboard. We aim to tackle the persistent challenges of higher education retention and student dropout rates by screening for at-risk students and building a high-risk identification system. By considering previously overlooked behavioral factors alongside traditional metrics, this work aims to address educational gaps, enhance student outcomes, and boost student success across disciplines at the University. Pre-processing establishes a target variable, anonymizes student information, manages missing data, and identifies the most significant features. Given the mixed…
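Three of the pre-processing steps named in the abstract (establishing a target variable, anonymizing student information, and managing missing data) could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the field names, the sample records, the 70-point cutoff, the salt, and the choice of median imputation are all assumptions made for the example, and feature selection is left out.

```python
import hashlib
from statistics import median

# Hypothetical records; field names and values are illustrative, not from the study.
RECORDS = [
    {"student_id": "s001", "page_views": 120, "final_score": 88.0},
    {"student_id": "s002", "page_views": None, "final_score": 54.0},
    {"student_id": "s003", "page_views": 40, "final_score": 61.5},
    {"student_id": "s004", "page_views": 75, "final_score": 72.0},
]

AT_RISK_THRESHOLD = 70.0  # assumed cutoff; the source does not state one
SALT = "example-salt"     # assumed; a real deployment would keep this secret


def anonymize(student_id: str, salt: str = SALT) -> str:
    """Replace an identifier with a salted SHA-256 digest, truncated to 12 hex chars."""
    return hashlib.sha256((salt + student_id).encode("utf-8")).hexdigest()[:12]


def preprocess(records):
    """Return anonymized rows with imputed engagement and a binary at_risk target."""
    observed = [r["page_views"] for r in records if r["page_views"] is not None]
    fill = median(observed)  # median imputation: one simple choice among many
    rows = []
    for r in records:
        views = r["page_views"] if r["page_views"] is not None else fill
        rows.append({
            "pseudonym": anonymize(r["student_id"]),  # raw ID is not carried over
            "page_views": views,
            "at_risk": int(r["final_score"] < AT_RISK_THRESHOLD),
        })
    return rows


rows = preprocess(RECORDS)
```

The resulting `rows` carry only a pseudonym, the imputed engagement value, and the derived label, which is the shape a downstream classifier would consume.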

Taxonomy

Topics: Anomaly Detection Techniques and Applications

Methods: Dropout · Logistic Regression