Predicting Long-Term Student Outcomes from Short-Term EdTech Log Data
Ge Gao, Amelia Leon, Andrea Jetten, Jasmine Turner, Husni Almoubayyed, Stephen Fancsali, Emma Brunskill

TL;DR
This paper demonstrates that machine learning models trained on the first few hours of student interaction data can effectively predict long-term academic outcomes, aiding early intervention efforts.
Contribution
It introduces a novel approach of using short-term EdTech log data to predict end-of-year student performance across diverse datasets.
Findings
Short-term log data (2-5 hours) predicts long-term outcomes effectively.
Predictive models can identify students at various performance levels.
Early usage data provides valuable signals for educational assessment.
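The idea of restricting each student's log to an initial usage budget can be sketched as follows. This is a minimal illustration only, not the authors' pipeline: the event schema `(student_id, start_time, duration_sec, correct)`, the function name `first_hours_features`, and the three summary features are all assumptions made for the example.

```python
from collections import defaultdict


def first_hours_features(events, budget_hours=2.0):
    """Summarize each student's first `budget_hours` of active usage.

    `events` is an iterable of (student_id, start_time, duration_sec, correct)
    tuples. This schema is hypothetical and chosen only for illustration.
    """
    budget = budget_hours * 3600.0
    by_student = defaultdict(list)
    for sid, start, dur, correct in events:
        by_student[sid].append((start, dur, correct))

    features = {}
    for sid, rows in by_student.items():
        rows.sort(key=lambda r: r[0])  # chronological order
        used = 0.0
        n_events = 0
        n_correct = 0
        for _start, dur, correct in rows:
            if used + dur > budget:
                break  # stop once the usage budget would be exceeded
            used += dur
            n_events += 1
            n_correct += int(correct)
        features[sid] = {
            "n_events": n_events,
            "accuracy": n_correct / n_events if n_events else 0.0,
            "hours_used": used / 3600.0,
        }
    return features


# Toy log: student "a" has three one-hour events, student "b" one half-hour event.
toy_events = [
    ("a", 0, 3600, True),
    ("a", 4000, 3600, False),
    ("a", 8000, 3600, True),
    ("b", 0, 1800, True),
]
toy_features = first_hours_features(toy_events, budget_hours=2.0)
```

Features like these could then be passed to any standard classifier or regressor together with the end-of-year outcome as the label. The budget is measured in active usage time rather than calendar time, which is one possible reading of "first few hours of usage".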
Abstract
Educational stakeholders are often particularly interested in sparse, delayed student outcomes, like end-of-year statewide exams. The rare occurrence of such assessments makes it harder to identify students likely to fail them, and slows researchers' and educators' ability to assess the effectiveness of particular educational tools. Prior work has primarily focused on using logs from students' full usage (e.g., year-long) of an educational product to predict outcomes, or considered predictive accuracy using a few minutes of logs to predict outcomes after a short (e.g., 1 hour) session. In contrast, we investigate whether machine learning predictors using students' logs during their first few hours of usage can provide useful predictive insight into those students' end-of-school-year external assessments. We do this on three diverse datasets: from students in Uganda using a…
