Modeling the EdNet Dataset with Logistic Regression
Philip I. Pavlik Jr, Luke G. Eglington

TL;DR
This paper explores the application of logistic regression to model the EdNet dataset, highlighting its interpretability and practical utility in educational data mining compared to complex neural networks.
Contribution
The paper presents a case for using logistic regression in educational data mining, emphasizing interpretability and practical decision-making over black-box neural models.
Findings
Logistic regression provides transparent learner predictions.
Model accuracy alone may not improve pedagogical decisions.
Simple decision rules can limit the utility of complex models.
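One way to picture the last two findings is with a toy comparison. The sketch below is illustrative only: the predictions, outcomes, and the 0.5 threshold are invented for the example and do not come from the paper or from EdNet. It shows two hypothetical models with clearly different log loss that nevertheless trigger identical actions under a simple threshold rule.

```python
import math

def log_loss(predictions, outcomes):
    """Mean negative log-likelihood of binary outcomes (lower is better)."""
    total = 0.0
    for p, y in zip(predictions, outcomes):
        total += -math.log(p) if y == 1 else -math.log(1.0 - p)
    return total / len(outcomes)

def decisions(predictions, threshold=0.5):
    """Hypothetical rule: act (e.g. move on) when predicted success >= threshold."""
    return [p >= threshold for p in predictions]

# Invented data: 1 = correct response, 0 = incorrect response.
outcomes = [1, 1, 0, 0]
model_a = [0.90, 0.70, 0.30, 0.20]  # sharper predictions
model_b = [0.60, 0.55, 0.45, 0.40]  # vaguer predictions

loss_a = log_loss(model_a, outcomes)
loss_b = log_loss(model_b, outcomes)
same_actions = decisions(model_a) == decisions(model_b)

print(f"log loss A: {loss_a:.3f}, log loss B: {loss_b:.3f}")
print(f"identical decisions: {same_actions}")
```

Under these assumptions, model A fits the outcomes better, yet a student would be treated the same way under either model, because the rule only asks which side of the threshold each prediction falls on.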
Abstract
Many of these challenges are won by neural network models created by full-time artificial intelligence scientists. Because of this origin, the models have a black-box character that makes their use and application less clear to learning scientists. We describe our experience with the competition from the perspective of educational data mining, a field founded in the learning sciences with roots in psychology and statistics. We describe our efforts from the perspective of learning scientists and the challenges to our methods, some real and some imagined. We also discuss some basic results in the Kaggle system and our thoughts on how those results might have been improved. Finally, we describe how learner model predictions are used to make pedagogical decisions for students. Their practical use entails a) model predictions and b) a decision rule (based on the predictions). We point out how…
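The two-part structure described in the abstract, a) model predictions and b) a decision rule, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' model: the features (counts of prior successes and failures), the coefficients, and the 0.8 mastery threshold are all hypothetical.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
INTERCEPT = -0.5
WEIGHTS = [0.6, -0.2]  # [per prior success, per prior failure]

def predict_correct(features):
    """a) Model prediction: logistic regression probability of a correct response."""
    logit = INTERCEPT + sum(w * x for w, x in zip(WEIGHTS, features))
    return 1.0 / (1.0 + math.exp(-logit))

def decide(p_correct, mastery_threshold=0.8):
    """b) Decision rule (assumed): advance once predicted success is high enough."""
    return "advance" if p_correct >= mastery_threshold else "practice"

# Student with 4 prior successes and 1 prior failure on a skill.
p_strong = predict_correct([4, 1])
# Student with 1 prior success and 2 prior failures on the same skill.
p_weak = predict_correct([1, 2])

print(f"{p_strong:.3f} -> {decide(p_strong)}")
print(f"{p_weak:.3f} -> {decide(p_weak)}")
```

Because each coefficient is a fixed change in log-odds per unit of its feature, a reader can see directly why one student is advanced and the other is not, which is the kind of transparency the summary attributes to logistic regression.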
Taxonomy
Topics: Online Learning and Analytics · Intelligent Tutoring Systems and Adaptive Learning · Machine Learning and Data Classification
