Towards Feature Engineering at Scale for Data from Massive Open Online Courses
Kalyan Veeramachaneni, Una-May O'Reilly, Colin Taylor

TL;DR
This paper explores scalable feature engineering for MOOC data: it engages the crowd to propose features and evaluates how relevant those features are to predicting stopout, yielding nuanced insights into learner behavior.
Contribution
It presents two approaches for engaging the crowd in feature proposal for MOOC data, and shows how the resulting features can be evaluated for their relevance to predictive accuracy.
Findings
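Crowd-sourced features were nuanced: they considered more than one mode of interaction between learner and platform, as well as the learner's relative performance.
The features most influential for stopout prediction differed across four learner cohorts defined by level of engagement.
Because feature engineering relies heavily on human insight, engaging the crowd is a promising source of relevant features for predictive models.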
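As a rough illustration of the cohort finding, the sketch below ranks features separately within each cohort by the absolute correlation between the feature and a binary stopout label. This is not the paper's method, which this summary does not describe; correlation is a simple stand-in for a relevance measure. The cohort labels (`cohort_a`, `cohort_b`), the feature names, and the data are all made up for the example.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def rank_features_by_cohort(rows, feature_names):
    """Rank features within each cohort by |correlation| with stopout.

    Each row is a dict with a hypothetical 'cohort' key, a binary
    'stopout' key, and one numeric value per feature name.
    """
    by_cohort = defaultdict(list)
    for row in rows:
        by_cohort[row["cohort"]].append(row)

    rankings = {}
    for cohort, members in by_cohort.items():
        labels = [m["stopout"] for m in members]
        scores = {
            name: abs(pearson([m[name] for m in members], labels))
            for name in feature_names
        }
        # Most relevant feature first.
        rankings[cohort] = sorted(scores, key=scores.get, reverse=True)
    return rankings


# Toy, invented data: two hypothetical cohorts and two hypothetical features.
rows = [
    {"cohort": "cohort_a", "stopout": 0, "video_minutes": 50, "forum_posts": 0},
    {"cohort": "cohort_a", "stopout": 0, "video_minutes": 40, "forum_posts": 1},
    {"cohort": "cohort_a", "stopout": 1, "video_minutes": 5, "forum_posts": 0},
    {"cohort": "cohort_a", "stopout": 1, "video_minutes": 10, "forum_posts": 1},
    {"cohort": "cohort_b", "stopout": 0, "video_minutes": 20, "forum_posts": 9},
    {"cohort": "cohort_b", "stopout": 0, "video_minutes": 30, "forum_posts": 8},
    {"cohort": "cohort_b", "stopout": 1, "video_minutes": 30, "forum_posts": 1},
    {"cohort": "cohort_b", "stopout": 1, "video_minutes": 20, "forum_posts": 2},
]
rankings = rank_features_by_cohort(rows, ["video_minutes", "forum_posts"])
```

In this toy data the top-ranked feature differs between the two cohorts, which is the kind of pattern the finding describes. A real analysis would more likely score features by their contribution to a trained model's held-out accuracy than by raw correlation.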
Abstract
We examine the process of engineering features for developing models that improve our understanding of learners' online behavior in MOOCs. Because feature engineering relies so heavily on human insight, we argue that extra effort should be made to engage the crowd for feature proposals and even their operationalization. We show two approaches where we have started to engage the crowd. We also show how features can be evaluated for their relevance in predictive accuracy. When we examined crowd-sourced features in the context of predicting stopout, not only were they nuanced, but they also considered more than one interaction mode between the learner and platform and how the learner was relatively performing. We were able to identify different influential features for stopout prediction that depended on whether a learner was in one of four cohorts defined by their level of engagement with the…
Taxonomy
Topics: Online Learning and Analytics · Advancements in Semiconductor Devices and Circuit Design
