Building Data Science Capabilities into University Data Warehouse to Predict Graduation
Joonas Pesonen, Anna Fomkin, Lauri Jokipii

TL;DR
This paper describes integrating data science capabilities into a university data warehouse to predict student graduation outcomes, involving infrastructure enhancements and a pilot study using student registry data.
Contribution
It describes how predictive analytics can be embedded in a university's data warehouse infrastructure, so that models are trained on warehouse data and then used directly in database queries.
Findings
A data science lab was integrated into the data warehouse infrastructure
A pilot aimed to predict students' graduation probability and time-to-degree from student registry data
Further ethical and legal considerations are needed before predictions are used in daily operations
Abstract
The discipline of data science emerged to combine statistical methods with computing. At Aalto University, Finland, we have taken the first steps to make educational data science a part of the daily operations of Management Information Services. This required changes in the IT environment: we enhanced the data warehouse infrastructure with a data science lab, where we can read training data for predictive models from the data warehouse database and use the resulting models in database queries. We then conducted a data science pilot with the objective of predicting students' graduation probability and time-to-degree from student registry data. Further ethical and legal considerations are needed before the predictions can be used in the daily operations of the university.
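The workflow in the abstract (read training data from the database, fit a model, then call the model from a database query) can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not the authors' implementation: an in-memory SQLite database stands in for the data warehouse, the table `student_registry`, its columns, and the function name `graduation_probability` are hypothetical, the rows are made up, and a hand-written one-feature logistic regression stands in for whatever model the pilot used. Time-to-degree is not covered.

```python
import math
import sqlite3

# Hypothetical stand-in for the data warehouse: an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student_registry ("
    "student_id INTEGER PRIMARY KEY, credits_first_year REAL, graduated INTEGER)"
)
# Made-up rows for illustration only; these are not data from the paper.
rows = [
    (1, 10.0, 0), (2, 20.0, 0), (3, 25.0, 0), (4, 35.0, 0),
    (5, 40.0, 1), (6, 50.0, 1), (7, 55.0, 1), (8, 60.0, 1),
]
conn.executemany("INSERT INTO student_registry VALUES (?, ?, ?)", rows)

# Step 1: read the training data from the database.
train = conn.execute(
    "SELECT credits_first_year, graduated FROM student_registry"
).fetchall()


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


# Step 2: fit a one-feature logistic regression with batch gradient descent.
SCALE = 60.0  # assumed scaling constant so the feature lies roughly in [0, 1]
w, b = 0.0, 0.0
for _ in range(5000):
    grad_w = grad_b = 0.0
    for credits, y in train:
        x = credits / SCALE
        err = sigmoid(w * x + b) - y
        grad_w += err * x
        grad_b += err
    w -= 0.5 * grad_w / len(train)
    b -= 0.5 * grad_b / len(train)


# Step 3: expose the fitted model as a scalar SQL function.
def graduation_probability(credits):
    return sigmoid(w * credits / SCALE + b)


conn.create_function("graduation_probability", 1, graduation_probability)

# The model can now be used inside an ordinary database query.
predictions = conn.execute(
    "SELECT student_id, graduation_probability(credits_first_year) "
    "FROM student_registry ORDER BY student_id"
).fetchall()
for student_id, probability in predictions:
    print(student_id, round(probability, 3))
```

Registering the model as a SQL function keeps the prediction step inside the query layer, which is one way to read "use the created predictive models in database queries"; a production warehouse would likely rely on its own database engine's extension mechanism instead.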
