A Knowledge-Component-Based Methodology for Evaluating AI Assistants
Laryn Qi, J.D. Zamfirescu-Pereira, Taehan Kim, Björn Hartmann, John DeNero, Narges Norouzi

TL;DR
This paper presents a methodology that uses knowledge components to evaluate how effectively GPT-4-generated hints help CS1 students improve their programming solutions, and reports that access to hints helps students fix their code more quickly.
Contribution
It introduces a knowledge-component-based framework for evaluating AI-generated hints, linking hints to specific student errors and showing their impact on learning progress.
Findings
Hints help students fix code more quickly
Hints accurately identify key errors in student solutions
Multi-issue hints lead to better student progress
Abstract
We evaluate an automatic hint generator for CS1 programming assignments powered by GPT-4, a large language model. This system provides natural language guidance about how students can improve their incorrect solutions to short programming exercises. A hint can be requested each time a student fails a test case. Our evaluation addresses three Research Questions: RQ1: Do the hints help students improve their code? RQ2: How effectively do the hints capture problems in student code? RQ3: Are the issues that students resolve the same as the issues addressed in the hints? To address these research questions quantitatively, we identified a set of fine-grained knowledge components and determined which ones apply to each exercise, incorrect solution, and generated hint. Comparing data from two large CS1 offerings, we found that access to the hints helps students to address problems with…
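One way to picture the quantitative comparison described in the abstract is to treat each incorrect solution, hint, and follow-up submission as a set of knowledge-component (KC) labels. The sketch below is only an illustration under stated assumptions: the KC names, the `HintEvent` record, and the two ratio functions are hypothetical and are not taken from the paper, which may define its measures differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HintEvent:
    """One hint request, labeled with hypothetical knowledge components (KCs)."""
    kcs_before: frozenset  # KC issues in the submission that triggered the hint
    kcs_hint: frozenset    # KC issues the generated hint addresses
    kcs_after: frozenset   # KC issues still present in the next submission


def resolved(event):
    """KC issues present before the hint and absent afterwards."""
    return event.kcs_before - event.kcs_after


def hint_recall(event):
    """Share of the submission's KC issues that the hint mentions (RQ2-style)."""
    if not event.kcs_before:
        return None  # nothing to capture
    return len(event.kcs_hint & event.kcs_before) / len(event.kcs_before)


def resolved_overlap(event):
    """Share of resolved KC issues that the hint had addressed (RQ3-style)."""
    fixed = resolved(event)
    if not fixed:
        return None  # the student resolved nothing in this step
    return len(fixed & event.kcs_hint) / len(fixed)


# Hypothetical example: three issues before the hint, one remaining after it.
event = HintEvent(
    kcs_before=frozenset({"base_case", "off_by_one", "return_value"}),
    kcs_hint=frozenset({"base_case", "off_by_one"}),
    kcs_after=frozenset({"return_value"}),
)

print(sorted(resolved(event)))
print(hint_recall(event))
print(resolved_overlap(event))
```

In this made-up event, the hint mentions two of the three issues in the submission, and both issues the student went on to resolve were ones the hint addressed. Using sets keeps the comparison independent of the order in which issues were labeled.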
Taxonomy
Topics: Explainable Artificial Intelligence (XAI)
Methods: Attention Is All You Need · Sparse Evolutionary Training · Softmax · Layer Normalization · Linear Layer · Byte Pair Encoding · Label Smoothing · Adam · Residual Connection · Multi-Head Attention
