Centralized vs. Federated Learning for Educational Data Mining: A Comparative Study on Student Performance Prediction with SAEB Microdata
Rodrigo Tertulino

TL;DR
This study compares centralized and federated learning approaches for predicting student performance using Brazilian SAEB microdata, demonstrating that federated learning offers a privacy-preserving alternative with only a slight reduction in accuracy.
Contribution
It provides the first comprehensive evaluation of federated learning for large-scale educational data in Brazil, highlighting its viability under privacy legislation.
Findings
The federated model achieved 61.23% accuracy, close to the centralized model's 63.96% (a gap of 2.73 percentage points).
Federated learning offers a privacy-preserving alternative with minimal performance loss.
The study demonstrates the feasibility of federated learning for large-scale educational data in Brazil.
Abstract
The application of data mining and artificial intelligence in education offers unprecedented potential for personalizing learning and early identification of at-risk students. However, the practical use of these techniques faces a significant barrier in privacy legislation, such as Brazil's General Data Protection Law (LGPD), which restricts the centralization of sensitive student data. To resolve this challenge, privacy-preserving computational approaches are required. The present study evaluates the feasibility and effectiveness of Federated Learning, specifically the FedProx algorithm, to predict student performance using microdata from the Brazilian Basic Education Assessment System (SAEB). A Deep Neural Network (DNN) model was trained in a federated manner, simulating a scenario with 50 schools, and its performance was rigorously benchmarked against a centralized eXtreme Gradient…
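As a rough illustration of the FedProx idea named in the abstract, the sketch below runs federated rounds over 50 simulated clients. It is not the paper's implementation: it assumes a toy linear model in place of the DNN, synthetic data in place of the SAEB microdata, and hypothetical names and hyperparameters (`local_fedprox_update`, `aggregate`, `run_fedprox`, `mu`, `lr`, `epochs`, `rounds`). What it does show is the defining FedProx step: each client adds a proximal term (mu/2)·||w − w_global||² to its local loss, which keeps local weights from drifting far from the global model when client data differ.

```python
import random


def local_fedprox_update(w_global, data, mu=0.1, lr=0.05, epochs=5):
    """One client's local training: SGD on squared error plus the FedProx proximal term.

    All hyperparameter values here are illustrative assumptions.
    """
    w = list(w_global)
    for _ in range(epochs):
        for x, y in data:
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            # Gradient of 0.5 * err^2, plus gradient of (mu / 2) * ||w - w_global||^2.
            w = [
                wi - lr * (err * xi + mu * (wi - gi))
                for wi, xi, gi in zip(w, x, w_global)
            ]
    return w


def aggregate(client_weights, client_sizes):
    """Server step: average client weights, weighted by local sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]


def make_client(rng, n, true_w, shift):
    """Synthetic client data; `shift` gives each client a different feature distribution."""
    data = []
    for _ in range(n):
        x = [rng.gauss(shift, 1.0), 1.0]  # one feature plus a bias input
        y = sum(wi * xi for wi, xi in zip(true_w, x)) + rng.gauss(0.0, 0.1)
        data.append((x, y))
    return data


def run_fedprox(num_clients=50, rounds=20, seed=0):
    """Simulate federated training; raw data never leaves a client, only weights do."""
    rng = random.Random(seed)
    true_w = [2.0, -1.0]
    clients = [
        make_client(rng, rng.randint(10, 30), true_w, rng.uniform(-1.0, 1.0))
        for _ in range(num_clients)
    ]
    w_global = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_fedprox_update(w_global, c) for c in clients]
        w_global = aggregate(updates, [len(c) for c in clients])
    return w_global


if __name__ == "__main__":
    print(run_fedprox())
```

Setting `mu=0` in this sketch reduces the local step to plain federated averaging, so the proximal coefficient is the only difference between the two in this toy setup.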
Taxonomy
Topics: Online Learning and Analytics · Big Data and Digital Economy · Educational Assessment and Improvement
