Automated Knowledge Component Generation for Interpretable Knowledge Tracing in Coding Problems

Zhangqi Duan; Nigel Fernandez; Arun Balajiee Lekshmi Narayanan; Mohammad Hassany; Rafaella Sampaio de Alencar; Peter Brusilovsky; Bita Akram; Andrew Lan

arXiv:2502.18632·cs.AI·May 19, 2026

TL;DR

This paper introduces an automated pipeline using large language models to generate and tag knowledge components for programming problems, improving interpretability and prediction accuracy in knowledge tracing.

Contribution

It presents a novel LLM-based method for automatic KC generation and tagging, reducing manual effort and enhancing student response prediction in coding education.

Findings

01. KCGen-KT outperforms existing KT methods and human-written KCs on future student response prediction.

02. Generated KCs better fit cognitive models than human-written KCs.

03. Human evaluation confirms the accuracy of problem-KC mappings.

Abstract

Knowledge components (KCs) mapped to problems help model student learning, tracking their mastery levels on fine-grained skills, thereby facilitating personalized learning and feedback in online learning platforms. However, crafting and tagging KCs to problems, traditionally performed by human domain experts, is highly labor-intensive. We present an automated, LLM-based pipeline for KC generation and tagging for open-ended programming problems. We also develop an LLM-based knowledge tracing (KT) framework to leverage these LLM-generated KCs, which we refer to as KCGen-KT. We conduct extensive quantitative and qualitative evaluations on two real-world student code submission datasets in different programming languages. We find that KCGen-KT outperforms existing KT methods and human-written KCs on future student response prediction. We investigate the learning curves of generated KCs and…
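To make the idea of KCs mapped to problems concrete, here is a minimal sketch of how a problem-KC tagging can be turned into per-KC practice-opportunity counts, the kind of table that learning-curve analyses are typically built on. It is not the paper's pipeline: the problem names, KC labels, and the `opportunity_counts` helper are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical problem-to-KC tags; these labels are illustrative
# assumptions, not KCs taken from the paper or its datasets.
PROBLEM_KCS = {
    "sum_list": ["for loop", "accumulator variable"],
    "count_evens": ["for loop", "modulo operator", "if statement"],
    "max_of_three": ["if statement", "comparison operators"],
}


def opportunity_counts(attempts):
    """Expand (student, problem, correct) attempts into one row per KC.

    Each row is (student, kc, prior_opportunities, correct), where
    prior_opportunities counts the student's earlier attempts on
    problems tagged with that KC.
    """
    seen = defaultdict(int)  # (student, kc) -> opportunities so far
    rows = []
    for student, problem, correct in attempts:
        for kc in PROBLEM_KCS[problem]:
            rows.append((student, kc, seen[(student, kc)], correct))
            seen[(student, kc)] += 1
    return rows


# A made-up attempt sequence for a single student.
attempts = [
    ("s1", "sum_list", False),
    ("s1", "count_evens", True),
    ("s1", "max_of_three", True),
]
rows = opportunity_counts(attempts)
for row in rows:
    print(row)
```

Grouping such rows by KC and opportunity index gives an error rate per practice opportunity; under a well-chosen set of KCs, that rate is expected to fall as opportunities accumulate.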

Code & Models

Repositories

umass-ml4ed/kcgen-kt (PyTorch, official)

Taxonomy

Topics: Intelligent Tutoring Systems and Adaptive Learning · Advanced Graph Neural Networks · Innovative Teaching and Learning Methods