An empirical comparison of machine learning models for student's mental health illness assessment
Prathamesh Muzumdar, Ganga Prasad Basyal, and Piyush Vyas

TL;DR
This study empirically compares multiple machine learning models to assess their effectiveness in classifying students' mental health issues, highlighting XGBoost's superior performance and identifying key social and environmental factors.
Contribution
It evaluates various ML models on educational data for mental health classification and identifies important features influencing student mental health.
Findings
XGBoost outperforms other models in classification accuracy
Social support, learning environment, and childhood adversities are key factors
Tree-based models show significant predictive performance
Abstract
Students' mental health problems have been explored previously in the higher education literature in various contexts, including empirical work involving quantitative and qualitative methods. Nevertheless, comparatively little research has aimed at computational methods that learn information directly from data without relying on set parameters for a predetermined equation as an analytical method. This study investigates the performance of machine learning (ML) models used in higher education. The ML models considered are Naive Bayes, Support Vector Machine, K-Nearest Neighbor, Logistic Regression, Stochastic Gradient Descent, Decision Tree, Random Forest, XGBoost (Extreme Gradient Boosting Decision Tree), and NGBoost (Natural Gradient Boosting). Considering the factors of mental health illness among students, we follow three phases of data processing: segmentation, feature extraction,…
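As a rough illustration of what an empirical model comparison of this kind involves, the sketch below scores two of the listed model families (K-Nearest Neighbor and Naive Bayes) with k-fold cross-validated accuracy. It is not the authors' pipeline: the data are synthetic, the helper names (`make_synthetic_data`, `cross_val_accuracy`) and all hyperparameters are assumptions made for illustration, and the classifiers are minimal pure-Python versions so that the example needs no third-party packages.

```python
import math
import random
from collections import Counter, defaultdict


def make_synthetic_data(n=200, seed=0):
    """Two-feature, two-class toy data (an assumption; NOT the study's data)."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        label = i % 2
        center = 0.0 if label == 0 else 2.0
        rows.append(([rng.gauss(center, 1.0), rng.gauss(center, 1.0)], label))
    rng.shuffle(rows)
    return rows


class KNearestNeighbors:
    """Majority vote among the k closest training points (Euclidean distance)."""

    def __init__(self, k=5):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = X, y
        return self

    def predict(self, X):
        preds = []
        for x in X:
            nearest = sorted(zip(self.X, self.y),
                             key=lambda pair: math.dist(pair[0], x))[: self.k]
            preds.append(Counter(lbl for _, lbl in nearest).most_common(1)[0][0])
        return preds


class GaussianNaiveBayes:
    """Per-class independent Gaussians over each feature."""

    def fit(self, X, y):
        by_class = defaultdict(list)
        for x, label in zip(X, y):
            by_class[label].append(x)
        self.stats = {}
        for label, rows in by_class.items():
            cols = list(zip(*rows))
            means = [sum(c) / len(c) for c in cols]
            # Small constant keeps the variance strictly positive.
            variances = [sum((v - m) ** 2 for v in c) / len(c) + 1e-9
                         for c, m in zip(cols, means)]
            self.stats[label] = (math.log(len(rows) / len(X)), means, variances)
        return self

    def _log_posterior(self, x, label):
        log_prior, means, variances = self.stats[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )

    def predict(self, X):
        return [max(self.stats, key=lambda lbl: self._log_posterior(x, lbl))
                for x in X]


def cross_val_accuracy(make_model, data, folds=5):
    """Mean accuracy over `folds` interleaved train/test splits."""
    scores = []
    for f in range(folds):
        test = data[f::folds]
        train = [row for i, row in enumerate(data) if i % folds != f]
        model = make_model().fit([x for x, _ in train], [y for _, y in train])
        preds = model.predict([x for x, _ in test])
        scores.append(sum(p == y for p, (_, y) in zip(preds, test)) / len(test))
    return sum(scores) / len(scores)


data = make_synthetic_data()
results = {
    "K-Nearest Neighbor": cross_val_accuracy(lambda: KNearestNeighbors(k=5), data),
    "Naive Bayes": cross_val_accuracy(GaussianNaiveBayes, data),
}
# Rank the models from highest to lowest mean accuracy.
for name, accuracy in sorted(results.items(), key=lambda item: -item[1]):
    print(f"{name}: {accuracy:.3f}")
```

In practice, a comparison across all nine listed models would more likely rely on established library implementations; the structure stays the same, with each model fitted and scored on identical folds so that the accuracies are directly comparable.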
