Clustering Introductory Computer Science Exercises Using Topic Modeling Methods

Laura O. Moraes; Carlos Eduardo Pedreira

arXiv:2104.10748·cs.LG·April 23, 2021

TL;DR

This paper explores the use of topic modeling techniques to automatically cluster introductory computer science exercises, transforming code solutions into text and validating the semantic coherence of the resulting clusters.

Contribution

It introduces a novel method combining code structure analysis with topic modeling to identify meaningful question clusters in computer science education.

Findings

1. Six semantically coherent clusters identified
2. Achieved 0.75 NPMI score indicating strong semantic coherence
3. Results correlate with human expert ratings
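
The NPMI (normalized pointwise mutual information) score mentioned in the findings measures how often a topic's top words co-occur compared with chance, on a scale from -1 to 1. The sketch below is one common way to compute it from document co-occurrence counts; it is not taken from the paper's code. The toy documents, the function names `npmi` and `topic_coherence`, and the handling of the two edge cases (no co-occurrence, co-occurrence in every document) are assumptions made for illustration.

```python
import math
from itertools import combinations


def npmi(word_a, word_b, documents):
    """NPMI of two words, with probabilities estimated from document co-occurrence.

    `documents` is a list of sets of tokens (a hypothetical toy representation).
    """
    n_docs = len(documents)
    has_a = sum(1 for doc in documents if word_a in doc)
    has_b = sum(1 for doc in documents if word_b in doc)
    has_both = sum(1 for doc in documents if word_a in doc and word_b in doc)
    if has_both == 0:
        return -1.0  # convention: the words never co-occur
    p_a, p_b, p_ab = has_a / n_docs, has_b / n_docs, has_both / n_docs
    if p_ab == 1.0:
        return 1.0  # assumption: limit case where both words appear in every document
    pmi = math.log(p_ab / (p_a * p_b))
    return pmi / -math.log(p_ab)


def topic_coherence(top_words, documents):
    """Average NPMI over all pairs of a topic's top words."""
    pairs = list(combinations(top_words, 2))
    return sum(npmi(a, b, documents) for a, b in pairs) / len(pairs)


# Toy corpus: each document is the set of tokens from one exercise solution.
documents = [
    {"for", "range", "sum"},
    {"for", "range", "list"},
    {"if", "else", "return"},
    {"if", "else", "print"},
]

loop_topic = ["for", "range", "sum"]
print(round(topic_coherence(loop_topic, documents), 3))
```

A topic whose top words always appear together scores close to 1, while a topic mixing words from unrelated exercises is pulled toward -1.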

Abstract

Manually determining the concepts present in a group of questions is a challenging and time-consuming process. However, it is an essential step when modeling a virtual learning environment, since mastery-level assessment and recommendation engines require a mapping between concepts and questions. We investigated unsupervised semantic models (known as topic modeling techniques) to assist computer science teachers in this task and propose a method to transform Computer Science 1 teacher-provided code solutions into representative text documents that include code structure information. By applying non-negative matrix factorization and latent Dirichlet allocation techniques, we extract the underlying relationship between questions and validate the results using an external dataset. We consider the interpretability of the learned concepts using 14 university professors'…
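
To make the idea of turning a code solution into a text document concrete, here is a minimal sketch using Python's standard `ast` module. It is only one plausible approach and is not claimed to be the authors' pipeline: the choice of tokens (statement types, operators, and called function names) and the function name `code_to_document` are assumptions for illustration.

```python
import ast
from collections import Counter


def code_to_document(source):
    """Turn Python source into a space-separated 'document' of tokens.

    Hypothetical tokenization: structural tokens are syntax node type names
    (e.g. "For", "If", "AugAssign"); lexical tokens are called function names.
    """
    tree = ast.parse(source)
    tokens = []
    for node in ast.walk(tree):
        # Structural information: statements, arithmetic and comparison operators.
        if isinstance(node, (ast.stmt, ast.operator, ast.cmpop)):
            tokens.append(type(node).__name__)
        # Lexical information: names of directly called functions, e.g. "print".
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            tokens.append(node.func.id)
    return " ".join(tokens)


solution = (
    "def total(values):\n"
    "    result = 0\n"
    "    for v in values:\n"
    "        result += v\n"
    "    return result\n"
)

document = code_to_document(solution)
# Bag-of-words counts, the usual input to topic models such as NMF or LDA.
bag_of_words = Counter(document.split())
print(bag_of_words)
```

Each solution then becomes one row of a term-document matrix, which a topic model can factorize to group exercises that share structure such as loops or conditionals.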

Code & Models

Repositories

laura-moraes/machine-teaching
Official