A CEFR-Inspired Classification Framework with Fuzzy C-Means To Automate Assessment of Programming Skills in Scratch

Ricardo Hidalgo-Aragón; Jesús M. González-Barahona; Gregorio Robles

arXiv:2604.00730·cs.CY·April 2, 2026

TL;DR

This paper presents a CEFR-aligned framework using Fuzzy C-Means clustering to automate Scratch programming assessment, providing transparent, scalable, and actionable insights for educators and learners.

Contribution

The paper introduces a clustering-based assessment method for Scratch projects, aligned with CEFR levels, together with metrics for identifying transitional learners and tracking progress.

Findings

01

Identified a 'B2 bottleneck' with only 13.3% of learners reaching that level.

02

Developed metrics to quantify classification certainty and support human-in-the-loop review.

03

Enabled systemic curriculum gap diagnosis through automated assessment.

Abstract

Context: Schools, training platforms, and technology firms increasingly need to assess programming proficiency at scale with transparent, reproducible methods that support personalized learning pathways. Objective: This study introduces a pedagogical framework for Scratch project assessment, aligned with the Common European Framework of Reference (CEFR), providing universal competency levels for students and teachers alongside actionable insights for curriculum design. Method: We apply Fuzzy C-Means clustering to 2,008,246 Scratch projects evaluated via Dr.Scratch, implementing an ordinal criterion to map clusters to CEFR levels (A1-C2), and introducing enhanced classification metrics that identify transitional learners, enable continuous progress tracking, and quantify classification certainty to balance automated feedback with instructor review. Impact: The framework enables diagnosis…
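The Method step can be illustrated with a minimal sketch of textbook Fuzzy C-Means followed by an ordinal mapping and a certainty margin. This is not the authors' implementation. The following are assumptions made for illustration only: the synthetic 7-dimensional score vectors, the fuzzifier `m=2.0`, ordering clusters by the sum of their center coordinates, the "top two memberships" margin as the certainty measure, and the `review_margin=0.2` threshold. The function names `fuzzy_c_means` and `assign_levels` are hypothetical.

```python
import numpy as np

CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def fuzzy_c_means(X, c, m=2.0, max_iter=200, tol=1e-6, seed=0):
    """Textbook Fuzzy C-Means. Returns (centers, U), with U of shape (n, c)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        W = U ** m
        # Centers are membership-weighted means of the data points.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every point to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # guard against division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def assign_levels(centers, U, review_margin=0.2):
    """Order clusters (assumed criterion: sum of center coordinates) and
    derive a hard level, a continuous level, and a certainty margin."""
    order = np.argsort(centers.sum(axis=1))   # lowest-scoring cluster first
    U_ord = U[:, order]                       # columns now run from lowest to highest level
    level_idx = U_ord.argmax(axis=1)          # hard assignment
    # Membership-weighted level index in [0, c-1], usable for progress tracking.
    continuous = U_ord @ np.arange(U_ord.shape[1])
    # Certainty margin: gap between the two largest memberships.
    top2 = np.sort(U_ord, axis=1)[:, -2:]
    margin = top2[:, 1] - top2[:, 0]
    needs_review = margin < review_margin     # candidates for instructor review
    return level_idx, continuous, margin, needs_review


if __name__ == "__main__":
    # Synthetic stand-in for per-project score vectors (NOT real Dr.Scratch data).
    rng = np.random.default_rng(42)
    X = np.vstack([rng.normal(mu, 0.15, size=(50, 7))
                   for mu in np.linspace(0.25, 2.75, 6)])
    centers, U = fuzzy_c_means(X, c=len(CEFR_LEVELS))
    level_idx, continuous, margin, needs_review = assign_levels(centers, U)
    for i in (0, 120, 299):
        print(CEFR_LEVELS[level_idx[i]], round(float(continuous[i]), 2),
              round(float(margin[i]), 2), bool(needs_review[i]))
```

A project with a small margin sits between two adjacent clusters, which is one plausible way to flag a "transitional" learner for human review; the paper's own metrics may be defined differently.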
