Evaluating Predictive Models of Student Success: Closing the Methodological Gap
Josh Gardner, Christopher Brooks

TL;DR
This paper critically examines current practices in evaluating predictive models in learning analytics, highlighting the limitations of naive methods and demonstrating the benefits of Bayesian approaches through a case study on MOOC data.
Contribution
It provides a comprehensive comparison of naive, NHST, and Bayesian evaluation methods, advocating for more rigorous evaluation practices in learning analytics.
Findings
Naive evaluation methods often lead to misleading conclusions.
Bayesian evaluation offers more reliable insights into model performance.
Different evaluation techniques can significantly alter experimental conclusions.
Abstract
Model evaluation, the process of making inferences about the performance of predictive models, is a critical component of predictive modeling research in learning analytics. We survey the state of the practice with respect to model evaluation in learning analytics, which overwhelmingly uses only naive methods for model evaluation or statistical tests which are not appropriate for predictive model evaluation. We conduct a critical comparison of both null hypothesis significance testing (NHST) and a preferred Bayesian method for model evaluation. Finally, we apply three methods (the naive average commonly used in learning analytics, NHST, and Bayesian) to a predictive modeling experiment on a large set of MOOC data. We compare 96 different predictive models, including different feature sets, statistical modeling algorithms, and tuning hyperparameters for each, using this case…
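To make the contrast between a naive average and a Bayesian evaluation concrete, here is a minimal sketch in Python. It is not the paper's code: the summary above does not say which Bayesian method the authors use, so the sketch assumes one common choice, the Bayesian correlated t-test, in which the posterior of the mean performance difference is a Student t distribution with n-1 degrees of freedom and a variance inflated by the correlation between cross-validation folds. The function name, the AUC differences, the region of practical equivalence (ROPE) width of 0.01, and the fold correlation are all hypothetical values chosen for illustration.

```python
import math
import random
import statistics


def bayesian_correlated_ttest(diffs, rho, rope=0.01, n_samples=100_000, seed=0):
    """Monte Carlo estimate of P(A worse), P(equivalent), P(A better).

    Assumed model (not taken from the paper): the posterior of the mean
    difference is Student t with n-1 degrees of freedom, centered on the
    sample mean, with scale sqrt((1/n + rho/(1-rho)) * sample_variance).
    """
    n = len(diffs)
    mean = statistics.fmean(diffs)
    var = statistics.variance(diffs)  # unbiased sample variance
    df = n - 1
    scale = math.sqrt((1.0 / n + rho / (1.0 - rho)) * var)

    rng = random.Random(seed)
    worse = equivalent = better = 0
    for _ in range(n_samples):
        # Draw a Student t variate as Z / sqrt(chi2 / df).
        z = rng.gauss(0.0, 1.0)
        chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df d.o.f.
        mu = mean + scale * z / math.sqrt(chi2 / df)
        if mu < -rope:
            worse += 1
        elif mu > rope:
            better += 1
        else:
            equivalent += 1
    return worse / n_samples, equivalent / n_samples, better / n_samples


# Hypothetical per-fold AUC differences (model A minus model B), 10-fold CV.
auc_diffs = [0.012, -0.004, 0.020, 0.003, -0.010,
             0.015, 0.008, -0.002, 0.011, 0.006]

# Naive evaluation: compare averages and declare a winner.
naive_mean = statistics.fmean(auc_diffs)
naive_verdict = "A" if naive_mean > 0 else "B"

# Bayesian evaluation: rho = 1/k is the usual choice for k-fold CV.
p_worse, p_equiv, p_better = bayesian_correlated_ttest(
    auc_diffs, rho=1.0 / len(auc_diffs)
)

print(f"naive mean difference: {naive_mean:.4f} -> picks model {naive_verdict}")
print(f"P(A worse)={p_worse:.3f}  P(equivalent)={p_equiv:.3f}  P(A better)={p_better:.3f}")
```

On this made-up data the naive average favors model A, while most of the posterior mass falls inside the ROPE, so the two procedures support different conclusions from the same folds. The ROPE width is a judgment call that should reflect what counts as a practically meaningful difference for the application.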
