Learning Code-Edit Embedding to Model Student Debugging Behavior

Hasnain Heickal; Andrew Lan

arXiv:2502.19407·cs.SE·May 1, 2025

TL;DR

This paper introduces a model that learns code-edit embeddings from student submissions to better understand debugging behavior, improve personalized feedback, and suggest next steps in programming education.

Contribution

The paper presents a novel encoder-decoder model, built by fine-tuning LLMs on whether each student submission passes each test case, that captures code-editing patterns and supports personalized feedback on student programs.

Findings

01. Model achieves high accuracy in code reconstruction.

02. Enables effective personalized code suggestions.

03. Reveals common debugging behaviors through clustering.
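
The clustering finding can be illustrated with a small sketch. This page does not say which clustering method the paper uses, so plain k-means is an assumption here, and the 2-D points are made-up stand-ins for learned edit embeddings. The function name `kmeans` and its parameters are hypothetical.

```python
from math import dist


def kmeans(points, k, iters=20):
    """Plain k-means with deterministic init (the first k points as centroids).

    points: list of equal-length numeric tuples (toy edit embeddings).
    Returns (labels, centroids).
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Made-up embeddings: two edits near the origin, two near (5, 5).
toy_edits = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9)]
labels, centroids = kmeans(toy_edits, k=2)
print(labels)
```

In this toy setting, edits that land in the same cluster would be read as instances of the same kind of debugging move; with real embeddings, interpreting a cluster still requires inspecting the underlying code changes.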

Abstract

Providing effective feedback for programming assignments in computer science education can be challenging: students solve problems by iteratively submitting code, executing it, and using limited feedback from the compiler or the auto-grader to debug. Analyzing student debugging behavior in this process may reveal important insights into their knowledge and inform better personalized support tools. In this work, we propose an encoder-decoder-based model that learns meaningful code-edit embeddings between consecutive student code submissions, to capture their debugging behavior. Our model leverages information on whether a student code submission passes each test case to fine-tune large language models (LLMs) to learn code editing representations. It enables personalized next-step code suggestions that maintain the student's coding style while improving test case correctness. Our model…
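
To make the idea of an edit representation between consecutive submissions concrete, here is a minimal Python sketch. It is not the authors' model: `toy_encode` is a bag-of-tokens stand-in for the fine-tuned LLM encoder, and `Submission`, `edit_embedding`, `outcome_delta`, and `edit_pairs` are hypothetical names introduced for illustration. It only shows the shape of the data: each pair of consecutive submissions yields an edit vector plus the per-test-case change in outcomes.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Submission:
    code: str
    passed: List[bool]  # one entry per test case


def tokenize(code: str) -> List[str]:
    # Crude lexer: identifiers, integers, then any single non-space symbol.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def toy_encode(code: str) -> Counter:
    # ASSUMPTION: token counts stand in for a learned code embedding.
    return Counter(tokenize(code))


def edit_embedding(before: str, after: str) -> Dict[str, int]:
    # Difference of the two encodings, keeping only tokens whose count changed.
    b, a = toy_encode(before), toy_encode(after)
    return {tok: a[tok] - b[tok] for tok in set(a) | set(b) if a[tok] != b[tok]}


def outcome_delta(before: Submission, after: Submission) -> List[int]:
    # +1: test newly passes, -1: test newly fails, 0: unchanged.
    return [int(x) - int(y) for y, x in zip(before.passed, after.passed)]


def edit_pairs(subs: List[Submission]) -> List[Tuple[Dict[str, int], List[int]]]:
    # One (edit vector, outcome delta) pair per consecutive submission pair.
    return [
        (edit_embedding(p.code, n.code), outcome_delta(p, n))
        for p, n in zip(subs, subs[1:])
    ]


# A student fixes an off-by-one bug; the second test case starts passing.
first = Submission("for i in range(n): total += i", [True, False])
second = Submission("for i in range(n + 1): total += i", [True, True])
pairs = edit_pairs([first, second])
# Edit vector: +1 for '+' and +1 for '1'; outcome delta: [0, 1].
print(pairs)
```

The design point this sketch tries to capture is that the edit is represented relative to the previous submission rather than as a standalone program, and that test-case outcomes travel alongside it as a supervision signal.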

Code & Models

Repositories

umass-ml4ed/code-edit-representation (PyTorch)

Taxonomy

Topics: Teaching and Learning Programming · Software Engineering Research · Software Testing and Debugging Techniques