Exploring the Effectiveness of Using LLMs for Automated Assessment of Student Self Explanations in Programming Education

Arun-Balajiee Lekshmi-Narayanan; Mohammad Hassany; Peter Brusilovsky

arXiv:2605.21614 · cs.HC · May 22, 2026

TL;DR

This paper compares the effectiveness of LLMs and semantic similarity methods for automatically scoring student self-explanations in programming education, addressing a key challenge in automated assessment.

Contribution

The paper provides a rigorous comparison between LLM-based scoring and semantic similarity approaches for binary classification of student explanations.

Findings

01. LLMs outperform semantic similarity in scoring accuracy.

02. The study highlights the importance of dataset quality for automated scoring methods.

03. Results suggest LLMs are more effective for assessing student explanations.

Abstract

Worked examples are step-by-step solutions to problems in a specific domain, offered to students to acquire domain-specific problem-solving skills. The effectiveness of worked examples could be enhanced by combining them with self-explanations, which ask students to explain rather than passively study each problem-solving step. The main challenge of this approach is assessing the correctness of the student's explanations. In the prevailing approach, student explanations are judged by their semantic similarity to an instructor's or domain expert's explanation. Given recent advances in LLM-based automated scoring, it remains unclear whether semantic similarity methods are still the most effective technique to automatically score textual student responses like essays or code explanations. Comparing these methods also requires quality datasets that offer distinctive features such as…
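To make the similarity-based baseline described in the abstract concrete, the sketch below scores a student explanation as correct or incorrect by comparing it with a reference explanation. It is an illustration only, not the authors' method: it uses a simple bag-of-words cosine similarity as a stand-in for whatever semantic similarity model the paper evaluates. The function names and the 0.5 threshold are hypothetical choices made for this example.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors (0.0 if either is empty)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def score_explanation(student: str, reference: str, threshold: float = 0.5) -> bool:
    """Binary score: True if the student text is similar enough to the reference.

    The threshold is an assumed value for illustration; in practice it would
    need to be tuned on labeled data.
    """
    similarity = cosine_similarity(bag_of_words(student), bag_of_words(reference))
    return similarity >= threshold


if __name__ == "__main__":
    reference = "The loop adds each element of the array to the running sum."
    student = "Each array element is added to the sum inside the loop."
    print(score_explanation(student, reference))
```

A lexical measure like this rewards shared wording rather than shared meaning, which is one reason embedding-based similarity or LLM-based scoring may be preferred for free-text explanations.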
