Modeling Student Performance in Game-Based Learning Environments
Hyunbae Jeon, Harry He, Anthony Wang, Susanna Spooner

TL;DR
This paper shows that aggregating raw gameplay data before training improves machine learning models' ability to predict student performance in game-based learning, with the MLP model reaching an F1 score of 0.83.
Contribution
The paper introduces data aggregation methods and benchmarks (KNN, MLP, and Random Forest) for predicting student performance, and reports outperforming existing models in educational game analysis.
Findings
MLP outperformed state-of-the-art models with an F1 score of 0.83
Preprocessing reduced the training data from 4.6 GB to 48 MB
Preprocessing techniques enhanced non-deep-learning model performance
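For readers unfamiliar with the metric behind the 0.83 figure, the following is a minimal sketch of how a binary F1 score is computed from predictions. The labels are made-up illustration data, not results from the paper, and the summary does not say which averaging scheme (binary, macro, or weighted) the authors used, so the binary form here is an assumption.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = question answered correctly, 0 = answered incorrectly.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1]
score = f1_score(y_true, y_pred)
print(round(score, 2))
```

Because F1 balances precision and recall, it is less flattering than plain accuracy when one class (for example, correct answers) dominates the data.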
Abstract
This study investigates game-based learning in the context of the educational game "Jo Wilder and the Capitol Case," focusing on predicting student performance using various machine learning models, including K-Nearest Neighbors (KNN), Multi-Layer Perceptron (MLP), and Random Forest. The research aims to identify the features most predictive of student performance and correct question answering. By leveraging gameplay data, we establish complete benchmarks for these models and explore the importance of applying proper data aggregation methods. By compressing all numeric data to min/max/mean/sum and categorical data to first, last, count, and nunique, we reduced the size of the original training data from 4.6 GB to 48 MB of preprocessed training data, maintaining high F1 scores and accuracy. Our findings suggest that proper preprocessing techniques can be vital in enhancing the…
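The aggregation scheme described in the abstract can be sketched as follows: event-level rows are collapsed into one feature row per group, with min/max/mean/sum for numeric columns and first/last/count/nunique for categorical columns. This is an illustrative sketch, not the authors' code. The column names (`session_id`, `elapsed_time`, `event_name`), the grouping key, and the helper `aggregate_sessions` are assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def aggregate_sessions(events, numeric_cols, categorical_cols, key="session_id"):
    """Collapse event-level rows into one feature dict per group (hypothetical helper)."""
    groups = defaultdict(list)
    for row in events:
        groups[row[key]].append(row)

    features = {}
    for group_id, rows in groups.items():
        out = {}
        for col in numeric_cols:
            values = [r[col] for r in rows if r.get(col) is not None]
            if values:
                out[f"{col}_min"] = min(values)
                out[f"{col}_max"] = max(values)
                out[f"{col}_mean"] = mean(values)
                out[f"{col}_sum"] = sum(values)
        for col in categorical_cols:
            values = [r[col] for r in rows if r.get(col) is not None]
            if values:
                out[f"{col}_first"] = values[0]
                out[f"{col}_last"] = values[-1]
                out[f"{col}_count"] = len(values)
                out[f"{col}_nunique"] = len(set(values))
        features[group_id] = out
    return features


# Made-up event log; the column names are assumptions, not the dataset's schema.
events = [
    {"session_id": "s1", "elapsed_time": 0, "event_name": "click"},
    {"session_id": "s1", "elapsed_time": 1200, "event_name": "hover"},
    {"session_id": "s1", "elapsed_time": 3000, "event_name": "click"},
    {"session_id": "s2", "elapsed_time": 500, "event_name": "click"},
]
features = aggregate_sessions(events, ["elapsed_time"], ["event_name"])
print(features["s1"])
```

Each group ends up with a fixed number of columns regardless of how many events it contains, which is the mechanism by which this kind of aggregation shrinks a large event log into a much smaller training table.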
Taxonomy
Topics: Educational Games and Gamification · Online Learning and Analytics · Intelligent Tutoring Systems and Adaptive Learning
