Estimating Difficulty Levels of Programming Problems with Pre-trained Model

Zhiyuan Wang; Wei Zhang; Jun Wang

arXiv:2406.08828·cs.SE·June 14, 2024

TL;DR

This paper introduces a method to automatically estimate the difficulty levels of programming problems using a combined pre-trained text and code model, reducing reliance on expert annotations and student solution data.

Contribution

It proposes a novel approach coupling pre-trained text and code models for difficulty estimation and provides two new datasets for this task.

Findings

01

The combined model effectively estimates problem difficulty.

02

Both text and code modalities contribute significantly to accuracy.

03

The approach reduces the need for extensive expert annotation.

Abstract

As the demand for programming skills grows across industries and academia, students often turn to Programming Online Judge (POJ) platforms for coding practice and competition. The difficulty level of each programming problem serves as an essential reference for guiding students' adaptive learning. However, current methods of determining difficulty levels either require extensive expert annotations or take a long time to accumulate enough student solutions for each problem. To address this issue, we formulate the problem of automatically estimating the difficulty level of each programming problem, given its textual description and an example code solution. To tackle this problem, we propose to couple two pre-trained models, one for the text modality and the other for the code modality, into a unified model. We built two POJ datasets for the task, and the results demonstrate the effectiveness of the…
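The task formulation in the abstract (problem description plus one example solution in, difficulty level out) can be illustrated with a small sketch. This is not the authors' model: the two encoders below are hypothetical hashed bag-of-tokens stand-ins for the pre-trained text and code models, fusion by simple concatenation is an assumption, the three-level label set is invented for illustration, and the classifier head is untrained, so the predicted level carries no meaning until the head is fitted on labelled problems.

```python
import hashlib
import math
import random

DIM = 16                              # hypothetical embedding width per modality
LEVELS = ["easy", "medium", "hard"]   # hypothetical label set, not from the paper


def toy_encoder(tokens, dim=DIM, salt=""):
    """Stand-in for a pre-trained encoder: hashed bag-of-tokens, L2-normalised."""
    vec = [0.0] * dim
    for tok in tokens:
        digest = hashlib.md5((salt + tok).encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0  # avoid dividing by zero
    return [v / norm for v in vec]


def encode_text(description):
    """Placeholder for the text-modality model."""
    return toy_encoder(description.lower().split(), salt="text:")


def encode_code(solution):
    """Placeholder for the code-modality model."""
    return toy_encoder(solution.split(), salt="code:")


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class DifficultyHead:
    """Linear classifier over the concatenated text and code embeddings."""

    def __init__(self, in_dim, n_levels, seed=0):
        rng = random.Random(seed)
        # Randomly initialised weights: this head is untrained.
        self.weights = [
            [rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
            for _ in range(n_levels)
        ]
        self.bias = [0.0] * n_levels

    def predict_proba(self, features):
        logits = [
            sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(self.weights, self.bias)
        ]
        return softmax(logits)


def estimate_difficulty(description, solution, head):
    """Fuse both modalities (assumed: concatenation) and pick the top level."""
    fused = encode_text(description) + encode_code(solution)
    probs = head.predict_proba(fused)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LEVELS[best], probs


head = DifficultyHead(in_dim=2 * DIM, n_levels=len(LEVELS))
level, probs = estimate_difficulty(
    "Given two integers, print their sum.",
    "a, b = map(int, input().split())\nprint(a + b)",
    head,
)
print(level, [round(p, 3) for p in probs])
```

In a realistic setup the two `encode_*` functions would be replaced by actual pre-trained text and code encoders, and the head (and possibly the encoders) would be trained on problems with known difficulty labels.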

Taxonomy

Topics: Teaching and Learning Programming · Machine Learning and Data Classification