A CEFR-Inspired Classification Framework with Fuzzy C-Means To Automate Assessment of Programming Skills in Scratch
Ricardo Hidalgo-Aragón, Jesús M. González-Barahona, Gregorio Robles

TL;DR
This paper presents a CEFR-aligned framework using Fuzzy C-Means clustering to automate Scratch programming assessment, providing transparent, scalable, and actionable insights for educators and learners.
Contribution
It introduces a novel clustering-based assessment method for Scratch aligned with CEFR, including metrics for transitional learners and progress tracking.
Findings
Identified a 'B2 bottleneck' with only 13.3% of learners reaching that level.
Developed metrics to quantify classification certainty and support human-in-the-loop review.
Enabled systemic curriculum gap diagnosis through automated assessment.
Abstract
Context: Schools, training platforms, and technology firms increasingly need to assess programming proficiency at scale with transparent, reproducible methods that support personalized learning pathways. Objective: This study introduces a pedagogical framework for Scratch project assessment, aligned with the Common European Framework of Reference (CEFR), providing universal competency levels for students and teachers alongside actionable insights for curriculum design. Method: We apply Fuzzy C-Means clustering to 2,008,246 Scratch projects evaluated via Dr.Scratch, implementing an ordinal criterion to map clusters to CEFR levels (A1-C2), and introducing enhanced classification metrics that identify transitional learners, enable continuous progress tracking, and quantify classification certainty to balance automated feedback with instructor review. Impact: The framework enables diagnosis…
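The Method step can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic stand-ins for Dr.Scratch scores, the three feature dimensions are hypothetical, the ordinal criterion is assumed to be "rank cluster centers by total score", and the certainty metric is assumed to be the gap between the two largest memberships, with a hypothetical threshold of 0.2 for flagging transitional learners.

```python
import numpy as np

CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain-NumPy Fuzzy C-Means; returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the data points.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every point to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def map_clusters_to_levels(centers, levels=CEFR_LEVELS):
    """Assumed ordinal criterion: lowest total center score -> lowest level."""
    order = np.argsort(centers.sum(axis=1))
    return {int(cluster): levels[rank] for rank, cluster in enumerate(order)}


def classify(U, cluster_to_level, margin_threshold=0.2):
    """Label each project; flag small top-two margins as transitional."""
    top2 = np.sort(U, axis=1)[:, -2:]
    margin = top2[:, 1] - top2[:, 0]  # assumed certainty metric
    labels = [cluster_to_level[int(k)] for k in U.argmax(axis=1)]
    transitional = margin < margin_threshold  # hypothetical threshold
    return labels, margin, transitional


# Synthetic data only (NOT real Dr.Scratch output): six groups of 50
# projects whose mean score rises across three hypothetical dimensions.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(loc=mu, scale=0.3, size=(50, 3)) for mu in range(6)])

centers, U = fuzzy_c_means(X, n_clusters=6)
cluster_to_level = map_clusters_to_levels(centers)
labels, margin, transitional = classify(U, cluster_to_level)

print(cluster_to_level)
print("flagged for instructor review:", int(transitional.sum()))
```

Because the memberships are soft, a project sitting between two clusters gets a small margin and can be routed to an instructor instead of receiving a hard label, which is one plausible reading of the human-in-the-loop balance the abstract describes.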
