Deep Learning for School Dropout Detection: A Comparison of Tabular and Graph-Based Models for Predicting At-Risk Students

Pablo G. Almeida; Guilherme A. L. Silva; Valéria Santos; Gladston Moreira; Pedro Silva; Eduardo Luz

arXiv:2508.14057·cs.LG·January 16, 2026

TL;DR

This study compares traditional tabular machine learning models with graph neural networks for predicting student dropout, demonstrating that specific graph-based approaches can outperform established models when data is effectively structured as graphs.

Contribution

The paper introduces a novel approach for transforming tabular student data into graphs using clustering and evaluates how different GNN architectures and graph construction methods affect dropout prediction accuracy.
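To make the idea of turning a table into a graph concrete, here is a minimal sketch of one possible PCA-then-K-Means construction. It is not the authors' pipeline: the number of components, the value of k, the toy data, and the rule "connect every pair of students that share a cluster" are all assumptions made for illustration, and the helper names (`pca_project`, `kmeans`, `cluster_edges`) are hypothetical.

```python
import numpy as np

def pca_project(X, n_components):
    """Project rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T

def kmeans(Z, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per row of Z."""
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct data points (indexing returns a copy).
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(axis=0)
    return labels

def cluster_edges(labels):
    """Undirected edges (i, j) with i < j between rows sharing a cluster.

    Assumed edge rule for illustration; the source does not specify one.
    """
    edges = []
    for c in np.unique(labels):
        members = np.flatnonzero(labels == c)
        for a in range(len(members)):
            for b in range(a + 1, len(members)):
                edges.append((int(members[a]), int(members[b])))
    return edges

# Toy "student table": 10 rows, 4 numeric features, two well-separated groups.
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(0.0, 0.1, size=(5, 4)),
    rng.normal(5.0, 0.1, size=(5, 4)),
])
Z = pca_project(X, n_components=2)   # dimensionality reduction
labels = kmeans(Z, k=2)              # clustering in the reduced space
edges = cluster_edges(labels)        # each student becomes a node
print(len(edges), "edges over", len(X), "nodes")
```

Fully connecting each cluster grows quadratically with cluster size, so a real dataset would likely need a sparser rule (for example, nearest neighbours within a cluster); that choice is exactly the kind of graph construction decision the paper evaluates.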

Findings

01

GraphSAGE on PCA-KMeans graphs outperforms XGBoost by 7% in macro F1-score.

02

Other GNN configurations did not consistently outperform tabular models.

03

Graph construction strategy significantly influences GNN performance.
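The first finding is reported in macro F1-score rather than accuracy. As a reminder of why that matters for dropout data, where at-risk students are usually the minority class, the sketch below computes both metrics on a made-up prediction vector. The labels and predictions are invented for illustration and are not results from the paper.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 = 2*TP / (2*TP + FP + FN) for a single class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so every class counts equally."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)

def accuracy(y_true, y_pred):
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

# Invented example: 8 students who stay (0), 2 who drop out (1),
# and a model that predicts "stays" for everyone.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10

acc = accuracy(y_true, y_pred)
mf1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.3f} macro_f1={mf1:.3f}")
```

A model that never flags a dropout still scores high accuracy here, while macro F1 is pulled down by the zero F1 on the dropout class.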

Abstract

Student dropout is a significant challenge in educational systems worldwide, leading to substantial social and economic costs. Predicting students at risk of dropout allows for timely interventions. While traditional Machine Learning (ML) models operating on tabular data have shown promise, Graph Neural Networks (GNNs) offer a potential advantage by capturing complex relationships inherent in student data if structured as graphs. This paper investigates whether transforming tabular student data into graph structures, primarily using clustering techniques, enhances dropout prediction accuracy. We compare the performance of GNNs (a custom Graph Convolutional Network (GCN) and GraphSAGE) on these generated graphs against established tabular models (Random Forest (RF), XGBoost, and TabNet) using a real-world student dataset. Our experiments explore various graph construction strategies…
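For readers unfamiliar with GraphSAGE, the following is a minimal NumPy sketch of a single layer with mean aggregation: each node combines its own features with the average of its neighbours' features. It illustrates the general technique only and is not the architecture used in the paper. The weight matrices, toy features, and edge list are placeholders, and the function name `sage_mean_layer` is hypothetical.

```python
import numpy as np

def sage_mean_layer(H, edges, W_self, W_neigh):
    """One GraphSAGE-style layer with mean aggregation and ReLU.

    H       : (n_nodes, d_in) node features
    edges   : iterable of undirected (i, j) pairs
    W_self  : (d_in, d_out) weights applied to a node's own features
    W_neigh : (d_in, d_out) weights applied to the neighbour mean
    """
    n = H.shape[0]
    A = np.zeros((n, n))
    for i, j in edges:
        A[i, j] = 1.0
        A[j, i] = 1.0
    deg = A.sum(axis=1, keepdims=True)
    # Isolated nodes have degree 0; dividing by 1 leaves their mean at zero.
    neigh_mean = (A @ H) / np.maximum(deg, 1.0)
    return np.maximum(H @ W_self + neigh_mean @ W_neigh, 0.0)

# Toy graph: nodes 0 and 1 are linked, node 2 is isolated.
H = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [2.0, 2.0]])
edges = [(0, 1)]
W_self = np.eye(2)    # placeholder weights; a trained model would learn these
W_neigh = np.eye(2)

out = sage_mean_layer(H, edges, W_self, W_neigh)
print(out)
```

With identity weights, the linked nodes end up with the sum of their own and their neighbour's features, while the isolated node keeps its own features unchanged.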
