From Prediction to Application: Language Model-based Code Knowledge Tracing with Domain Adaptive Pre-Training and Automatic Feedback System with Pedagogical Prompting for Comprehensive Programming Education

Unggi Lee; Jiyeong Bae; Yeonji Jung; Minji Kang; Gyuri Byun; Yeonseo Lee; Dohee Kim; Sookbun Lee; Jaekwon Park; Taekyung Ahn; Gunho Lee; Hyeoncheol Kim

arXiv:2409.00323 · cs.CL · September 4, 2024

TL;DR

This paper introduces CodeLKT, a language model-based approach to programming knowledge tracing, enhanced by domain adaptive pre-training and an automatic feedback system, significantly improving interpretability and cross-domain transfer in programming education.

Contribution

It proposes CodeLKT, integrating language models with adaptive pre-training and feedback mechanisms, advancing programming knowledge tracing beyond traditional methods.

Findings

01. CodeLKT outperforms existing KT models in accuracy.

02. Domain adaptive pre-training enhances cross-domain transfer.

03. The integrated system provides personalized, in-depth feedback for learners.

Abstract

Knowledge Tracing (KT) is a critical component in online learning, but traditional approaches face limitations in interpretability and cross-domain adaptability. This paper introduces Language Model-based Code Knowledge Tracing (CodeLKT), an application of Language Model-based Knowledge Tracing (LKT) to programming education. CodeLKT leverages pre-trained language models to process learning data and demonstrates superior performance over existing KT and Code KT models. We explore Domain Adaptive Pre-Training (DAPT) and Task Adaptive Pre-Training (TAPT), showing enhanced performance in the coding domain and investigating cross-domain transfer between mathematics and coding. Additionally, we present a theoretically informed integrated system that combines CodeLKT with large language models to generate personalized, in-depth feedback that supports students' programming learning. This…
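To make the idea of a language model "processing learning data" concrete, the sketch below shows one way a learner's interaction history could be rendered as text, with the final outcome masked for the model to predict. This is only an illustration under stated assumptions: the `Interaction` fields, the `[MASK]` and `[SEP]` strings, and the serialization layout are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List

# Assumed placeholder strings; a real tokenizer defines its own special tokens.
MASK = "[MASK]"
SEP = " [SEP] "


@dataclass
class Interaction:
    """One learner attempt (all field names are hypothetical)."""
    problem_text: str  # text of the programming problem
    concept: str       # knowledge-concept label attached to the problem
    correct: bool      # whether the learner's submission was correct


def serialize_history(history: List[Interaction]) -> str:
    """Render a history as text, masking the outcome of the last attempt."""
    if not history:
        raise ValueError("history must contain at least one interaction")
    parts = []
    last_index = len(history) - 1
    for i, item in enumerate(history):
        if i == last_index:
            outcome = MASK  # the value a model would be asked to predict
        else:
            outcome = "correct" if item.correct else "incorrect"
        parts.append(
            f"concept: {item.concept} | problem: {item.problem_text} | result: {outcome}"
        )
    return SEP.join(parts)


def target_label(history: List[Interaction]) -> int:
    """Binary training target: 1 if the masked attempt was correct, else 0."""
    return int(history[-1].correct)


example_history = [
    Interaction("Print the sum of two integers.", "arithmetic", True),
    Interaction("Reverse a string.", "strings", False),
]
example_text = serialize_history(example_history)
example_target = target_label(example_history)
```

Because both problem text and concept names appear in the input as plain text, a layout like this is one plausible reason a text-based tracer could be pre-trained on one subject and evaluated on another; whether CodeLKT uses this exact layout is not stated in the abstract.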
