Interpretability from the Ground Up: Stakeholder-Centric Design of Automated Scoring in Educational Assessments

Yunsung Kim; Mike Hardy; Joseph Tey; Candace Thille; Chris Piech

arXiv:2511.17069 · cs.CL · April 23, 2026


TL;DR

This paper proposes a stakeholder-centric approach to interpretability in automated educational scoring. It introduces four interpretability principles and a reference framework, AnalyticScore, that makes scoring transparent while staying within 0.06 QWK of the state of the art.

Contribution

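The paper develops four interpretability principles (Faithfulness, Groundedness, Traceability, and Interchangeability, abbreviated FGTI) and the AnalyticScore reference framework, advancing transparent automated scoring aligned with stakeholder needs.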

Findings

01

AnalyticScore outperforms many uninterpretable methods in accuracy.

02

AnalyticScore is within 0.06 QWK (quadratic weighted kappa) of the state of the art on 10 items from ASAP-SAS.

03

AnalyticScore's featurization behavior aligns well with that of human annotators.
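The findings above are reported in QWK (quadratic weighted kappa), an agreement statistic between two sets of ordinal scores that penalizes disagreements by the squared distance between the scores. The following is a minimal sketch of the standard QWK formula, not the paper's evaluation code; the function name `quadratic_weighted_kappa` and its parameters are hypothetical names chosen for illustration.

```python
def quadratic_weighted_kappa(scores_a, scores_b, min_score, max_score):
    """Standard QWK between two equal-length lists of integer scores.

    Illustrative sketch only; names and signature are assumptions,
    not taken from the paper.
    """
    if len(scores_a) != len(scores_b) or not scores_a:
        raise ValueError("score lists must be non-empty and of equal length")

    n = max_score - min_score + 1  # number of score categories
    if n < 2:
        return 1.0  # a single category leaves no room for disagreement

    # Observed confusion matrix: observed[i][j] counts responses that
    # received score i from rater A and score j from rater B.
    observed = [[0] * n for _ in range(n)]
    for a, b in zip(scores_a, scores_b):
        observed[a - min_score][b - min_score] += 1

    total = len(scores_a)
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n):
        for j in range(n):
            # Quadratic weight: 0 on the diagonal, 1 at maximal distance.
            weight = (i - j) ** 2 / (n - 1) ** 2
            # Expected count if the two raters scored independently.
            expected = hist_a[i] * hist_b[j] / total
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0:
        return 1.0  # both raters gave one identical score to every response
    return 1.0 - weighted_observed / weighted_expected


human = [0, 1, 2]
model = [0, 1, 1]
print(round(quadratic_weighted_kappa(human, model, 0, 2), 4))  # 0.6667
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement. On this scale, a gap of 0.06 QWK is the difference between, say, 0.80 and 0.74 agreement with human scores.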

Abstract

AI-driven automated scoring systems offer scalable and efficient means of evaluating complex student-generated responses. Yet, despite increasing demand for transparency and interpretability, the field has yet to develop a widely accepted solution for interpretable automated scoring to be used in large-scale real-world assessments. This work takes a principled approach to address this challenge. We analyze the needs and potential benefits of interpretable automated scoring for various assessment stakeholder groups and develop four principles of interpretability -- (F)aithfulness, (G)roundedness, (T)raceability, and (I)nterchangeability (FGTI) -- targeted at those needs. To illustrate the feasibility of implementing these principles, we develop the AnalyticScore framework as a reference framework. When applied to the domain of text-based constructed-response scoring, AnalyticScore…
