Automatic Identification of Ineffective Online Student Questions in Computing Education
Qiang Hao, April Galyardt, Bradley Barnes, Robert Maribe Branch, Ewan Wright

TL;DR
This paper develops machine learning methods to automatically identify ineffective questions in online computer science education, enabling large-scale automated feedback and question improvement.
Contribution
It introduces a classification framework for assessing question effectiveness and evaluates multiple algorithms, demonstrating feasibility for automated educational support.
Findings
Support Vector Machines achieved high classification accuracy.
Manual classification reliability was high with Cohen's Kappa of .88.
Automated methods can effectively identify ineffective questions.
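For readers unfamiliar with the reliability statistic reported above, Cohen's Kappa compares the observed agreement between two raters with the agreement expected by chance from each rater's label frequencies. The following is a minimal sketch in plain Python, not the authors' code; the function name and the example labels are hypothetical and are not data from the paper.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who labelled the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items that received identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: computed from each rater's marginal label counts.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one and the same label throughout
    return (p_o - p_e) / (1 - p_e)


# Made-up labels for illustration only (assumption, not from the paper).
rater_1 = ["irrelevant", "effective", "ineffective",
           "effective", "ineffective", "effective"]
rater_2 = ["irrelevant", "effective", "ineffective",
           "effective", "effective", "effective"]
kappa = cohens_kappa(rater_1, rater_2)
print(round(kappa, 3))
```

In this toy example the raters agree on five of six items, but part of that agreement is expected by chance, so Kappa comes out lower than the raw agreement rate.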
Abstract
This Research Full Paper explores automatic identification of ineffective learning questions in the context of large-scale computer science classes. The immediate and accurate identification of ineffective learning questions opens the door to possible automated facilitation on a large scale, such as alerting learners to revise questions and providing adaptive question revision suggestions. To achieve this, 983 questions were collected from a question & answer platform implemented by an introductory programming course over three semesters in a large research university in the Southeastern United States. Questions were first manually classified into three hierarchical categories: 1) learning-irrelevant questions, 2) effective learning-relevant questions, 3) ineffective learning-relevant questions. The inter-rater reliability of the manual classification (Cohen's Kappa) was .88. Four…
Taxonomy
Methods: Logistic Regression
