Chatbot-Based Assessment of Code Understanding in Automated Programming Assessment Systems
Eduard Frankford, Erik Cikalleshi, Ruth Breu

TL;DR
This paper reviews conversational assessment approaches in programming education and proposes a Hybrid Socratic Framework that combines deterministic analysis with conversational verification to better assess code understanding.
Contribution
It reports a saturation-based scoping review of conversational assessment architectures, identifying three families (rule-based or template-driven, LLM-based, and hybrid), and proposes the Hybrid Socratic Framework for verifying student understanding in automated programming assessment.
Findings
Conversational agents show promise for scalable feedback and probing code understanding.
Limitations include hallucinations, over-reliance, privacy and integrity concerns, and deployment constraints.
The proposed framework integrates deterministic analysis with conversational verification.
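To make the idea of pairing deterministic analysis with conversational verification concrete, here is a minimal Python sketch. It is not the authors' implementation: the names `run_tests`, `probe_questions`, `assess`, and `Verdict` are hypothetical, the question templates stand in for whatever dialogue component a real system would use, and the choice to ask questions only after the tests pass is an assumption made for illustration.

```python
import ast
from dataclasses import dataclass


@dataclass
class Verdict:
    tests_passed: bool
    questions: list[str]


def run_tests(source: str, func_name: str, cases: list[tuple[tuple, object]]) -> bool:
    """Deterministic stage: execute the submission and compare outputs."""
    namespace: dict = {}
    exec(source, namespace)  # sketch only; a real system would sandbox this
    func = namespace[func_name]
    return all(func(*args) == expected for args, expected in cases)


def probe_questions(source: str) -> list[str]:
    """Conversational stage: template-driven stand-in for a dialogue agent."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.For, ast.While)):
            found.append((node.lineno, "when does this loop stop, and why?"))
        elif isinstance(node, ast.If):
            found.append((node.lineno, "which inputs make this condition true?"))
    # Sort by line number so questions follow the order of the code.
    return [f"Line {lineno}: {text}" for lineno, text in sorted(found)]


def assess(source: str, func_name: str, cases: list[tuple[tuple, object]]) -> Verdict:
    passed = run_tests(source, func_name, cases)
    # Assumption: only probe understanding once the code is functionally correct.
    questions = probe_questions(source) if passed else []
    return Verdict(passed, questions)


# Hypothetical student submission (the string starts with a blank line).
SUBMISSION = """
def total(xs):
    acc = 0
    for x in xs:
        acc += x
    return acc
"""

verdict = assess(SUBMISSION, "total", [(([1, 2, 3],), 6), (([],), 0)])
for question in verdict.questions:
    print(question)
```

In this sketch the test run decides correctness, while the generated questions target constructs the student actually wrote, so passing the tests alone does not count as evidence of understanding.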
Abstract
Large Language Models (LLMs) challenge conventional automated programming assessment because students can now produce functionally correct code without demonstrating corresponding understanding. This paper makes two contributions. First, it reports a saturation-based scoping review of conversational assessment approaches in programming education. The review identifies three dominant architectural families: rule-based or template-driven systems, LLM-based systems, and hybrid systems. Across the literature, conversational agents appear promising for scalable feedback and deeper probing of code understanding, but important limitations remain around hallucinations, over-reliance, privacy, integrity, and deployment constraints. Second, the paper synthesizes these findings into a Hybrid Socratic Framework for integrating conversational verification into Automated Programming Assessment…
