Programming Knowledge Tracing: A Comprehensive Dataset and A New Model
Renyu Zhu, Dongxiang Zhang, Chengcheng Han, Ming Gao, Xuesong Lu, Weining Qian, Aoying Zhou

TL;DR
This paper introduces a comprehensive programming education dataset, BePKT, and a novel model PDKT that leverages enriched context for improved knowledge tracing accuracy, demonstrating state-of-the-art results.
Contribution
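The paper publishes BePKT, described by the authors as the most comprehensive dataset to date for programming knowledge tracing, and proposes a new model, PDKT, that combines a bipartite-graph problem embedding, the PLCodeBERT code embedding, and a double-sequence RNN with exponential decay attention for feature fusion.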
Findings
PDKT achieves state-of-the-art performance on BePKT.
PLCodeBERT enhances code-related task accuracy.
Enriched context improves student behavior prediction.
Abstract
In this paper, we study knowledge tracing in the domain of programming education and make two important contributions. First, we harvest and publish so far the most comprehensive dataset, namely BePKT, which covers various online behaviors in an OJ system, including programming text problems, knowledge annotations, user-submitted code and system-logged events. Second, we propose a new model PDKT to exploit the enriched context for accurate student behavior prediction. More specifically, we construct a bipartite graph for programming problem embedding, and design an improved pre-training model PLCodeBERT for code embedding, as well as a double-sequence RNN model with exponential decay attention for effective feature fusion. Experimental results on the new dataset BePKT show that our proposed model establishes state-of-the-art performance in programming knowledge tracing. In addition, we…
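The abstract names "exponential decay attention" but this excerpt does not give its formulation. As a rough illustration of the general idea only, and not the authors' PDKT implementation, the sketch below down-weights older interactions by subtracting a time penalty from each attention score before a softmax. The function name `decay_attention`, the `decay_rate` parameter, and the toy timestamps are all assumptions made for this example.

```python
import math

def decay_attention(scores, timestamps, now, decay_rate=0.1):
    """Softmax attention over past steps, down-weighted by elapsed time.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Subtracting decay_rate * elapsed from a raw score multiplies its
    # unnormalized weight by exp(-decay_rate * elapsed).
    logits = [s - decay_rate * (now - t) for s, t in zip(scores, timestamps)]
    # Shift by the max logit for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Three past submissions with equal raw relevance, made at times 0, 5 and 9.
weights = decay_attention([1.0, 1.0, 1.0], [0, 5, 9], now=10)
print(weights)
```

With equal raw scores, the most recent interaction receives the largest weight and the oldest the smallest. In a learned model the decay rate would typically be a trained parameter rather than a fixed constant.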
Taxonomy
Topics: Online Learning and Analytics · Software Engineering Research · Intelligent Tutoring Systems and Adaptive Learning
Methods: Exponential Decay
