Modelling higher education dropouts using sparse and interpretable post-clustering logistic regression
Andrea Nigri, Massimo Bilancia, Barbara Cafarelli, Samuele Magro

TL;DR
This paper introduces a sparse, interpretable logistic regression model with clustering to identify and characterize student subgroups at risk of dropout, enhancing understanding and policy relevance.
Contribution
It extends traditional logistic regression by integrating clustering and sparsity, enabling unsupervised subgroup identification and interpretability in dropout analysis.
Findings
Effective identification of student subgroups at risk
Enhanced interpretability through sparsity and clustering
Successful application to Italian university data
Abstract
Higher education dropout constitutes a critical challenge for tertiary education systems worldwide. While machine learning techniques can achieve high predictive accuracy on selected datasets, their adoption by policymakers remains limited and unsatisfactory, particularly when the objective is the unsupervised identification and characterization of student subgroups at elevated risk of dropout. The model introduced in this paper is a specialized form of logistic regression, specifically adapted to the context of university dropout analysis. Logistic regression continues to serve as a foundational tool among reliable statistical models, primarily due to the ease with which its parameters can be interpreted in terms of odds ratios. Our approach significantly extends this framework by incorporating heterogeneity within the student population. This is achieved through the application of a…
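The odds-ratio reading of logistic regression mentioned in the abstract can be illustrated with a minimal sketch. The coefficient values, covariate vectors, and function names below are hypothetical and are not taken from the paper; the zero coefficient only mimics what a sparse fit might produce, and the clustering step is not shown.

```python
import math


def dropout_probability(intercept, coefs, x):
    """Logistic model: P(dropout | x) = 1 / (1 + exp(-(b0 + sum_j b_j * x_j)))."""
    z = intercept + sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    """Convert a probability into odds p / (1 - p)."""
    return p / (1.0 - p)


def odds_ratios(coefs):
    """Odds ratio of each covariate: exp(b_j) per one-unit increase."""
    return [math.exp(b) for b in coefs]


# Hypothetical coefficients (assumption, not estimates from the paper).
# A coefficient of exactly 0.0 stands for a covariate removed by a sparse fit.
intercept = -1.0
coefs = [0.7, 0.0, -0.4]

# Two hypothetical students who differ by one unit in the first covariate only.
baseline = [0.0, 1.0, 2.0]
shifted = [1.0, 1.0, 2.0]

# The ratio of their odds equals exp(0.7), the odds ratio of that covariate.
ratio = odds(dropout_probability(intercept, coefs, shifted)) / odds(
    dropout_probability(intercept, coefs, baseline)
)
```

In this sketch a one-unit increase in the first covariate multiplies the odds of dropout by `exp(0.7)` regardless of the other covariate values, while the covariate with a zero coefficient has an odds ratio of 1 and so plays no role in the fitted model.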
Taxonomy
Topics: Online Learning and Analytics · Higher Education Research Studies · Advanced Clustering Algorithms Research
Methods: Dropout · Logistic Regression
