An Approach to Detect Abnormal Submissions for CodeWorkout Dataset

Alex Hicks; Yang Shi; Arun-Balajiee Lekshmi-Narayanan; Wei Yan; Samiha; Marwan

arXiv:2407.17475·cs.CY·July 26, 2024

TL;DR

This paper proposes a preliminary approach to identify abnormal student log data in programming learning environments, aiming to improve personalized recommendation systems by detecting anomalies beyond traditional plagiarism methods.

Contribution

It introduces a novel method to detect various types of abnormal submissions in student log data, addressing limitations of existing plagiarism detection tools.

Findings

01. Initial analysis of abnormal log data patterns

02. Potential to enhance recommendation accuracy by filtering anomalies

03. Framework for future development of anomaly detection methods

Abstract

Students' interactions while solving problems in learning environments (i.e., log data) are often used to support students' learning. For example, researchers use log data to develop systems that can provide students with personalized problem recommendations based on their knowledge level. However, anomalies in the students' log data, such as cheating to solve programming problems, could introduce a hidden bias in the log data. As a result, these systems may provide inaccurate problem recommendations and therefore defeat their purpose. Classical cheating detection methods, such as MOSS, can be used to detect code plagiarism. However, these methods cannot detect other abnormal events, such as a student gaming a system with multiple attempts of similar solutions to a particular programming problem. This paper presents a preliminary study to analyze log data with anomalies. The goal of our…
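To make the "multiple attempts of similar solutions" case concrete, the sketch below flags a student who submits several near-identical attempts in a row on the same problem. It is an illustration only and not the authors' method: the function name, the `(student_id, problem_id, code)` tuple layout, and both thresholds are hypothetical, and Python's standard `difflib.SequenceMatcher` stands in for whatever similarity measure a real study would choose.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def flag_repeated_similar_attempts(submissions, min_similarity=0.9, min_run=3):
    """Return (student_id, problem_id) pairs with a streak of near-identical attempts.

    `submissions` is an iterable of (student_id, problem_id, code) tuples in
    chronological order. The layout and both thresholds are assumptions made
    for this illustration, not values taken from the paper.
    """
    # Group each student's attempts per problem, preserving submission order.
    attempts = defaultdict(list)
    for student_id, problem_id, code in submissions:
        attempts[(student_id, problem_id)].append(code)

    flagged = []
    for key, codes in attempts.items():
        run = 1  # length of the current streak of near-identical attempts
        for previous, current in zip(codes, codes[1:]):
            similarity = SequenceMatcher(None, previous, current).ratio()
            run = run + 1 if similarity >= min_similarity else 1
            if run >= min_run:
                flagged.append(key)
                break
    return flagged


# Hypothetical log: s1 resubmits almost the same code three times,
# while s2 makes a substantive change between two attempts.
log = [
    ("s1", "p1", "return a + b;"),
    ("s1", "p1", "return a + b ;"),
    ("s1", "p1", "return a + b  ;"),
    ("s2", "p1", "return a - b;"),
    ("s2", "p1", "int total = a + b;\nreturn total;"),
]
print(flag_repeated_similar_attempts(log))
```

Comparing only consecutive attempts keeps the check cheap and targets the resubmission pattern the abstract describes. A pairwise tool such as MOSS instead compares submissions across students, which is why it would not surface this behavior.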
