Content vs. Form: What Drives the Writing Score Gap Across Socioeconomic Backgrounds? A Generated Panel Approach
Nadav Kunievsky, Pedro Pertusi

TL;DR
This study uses large language models to disentangle content and style in writing assessments, finding that most of the SES gap in writing scores stems from content differences, with style playing a smaller role.
Contribution
Introduces a novel method leveraging language models to generate stylistic variants of essays, enabling decomposition of SES gaps into content and style contributions.
Findings
69% of the SES gap is due to differences in content quality
26% of the gap is attributable to style differences
5% of the gap is due to differences in evaluation standards
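The three findings above read as an additive decomposition: the components sum to the whole gap. The following is a minimal sketch of that arithmetic only, not the authors' estimation procedure. The function name `gap_shares` and the component labels are hypothetical, and the gap is normalized to 1.0 so that the inputs are simply the reported percentages.

```python
from typing import Dict


def gap_shares(components: Dict[str, float]) -> Dict[str, float]:
    """Return each component's share of the total gap (shares sum to 1)."""
    total = sum(components.values())
    if total == 0:
        raise ValueError("total gap is zero; shares are undefined")
    return {name: value / total for name, value in components.items()}


# Assumption: the gap is normalized to 1.0, so each value below is the
# share reported in the findings (69%, 26%, 5%), not a raw score difference.
reported = {"content": 0.69, "style": 0.26, "evaluation_standards": 0.05}
shares = gap_shares(reported)

for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

With raw score differences in place of the normalized values, the same function would return each component's fraction of the total gap.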
Abstract
Students from different socioeconomic backgrounds exhibit persistent gaps in test scores, gaps that can translate into unequal educational and labor-market outcomes later in life. In many assessments, performance reflects not only what students know, but also how effectively they can communicate that knowledge. This distinction is especially salient in writing assessments, where scores jointly reward the substance of students' ideas and the way those ideas are expressed. As a result, observed score gaps may conflate differences in underlying content with differences in expressive skill. A central question, therefore, is how much of the socioeconomic-status (SES) gap in scores is driven by differences in what students say versus how they say it. We study this question using a large corpus of persuasive essays written by U.S. middle- and high-school students. We introduce a new…
Taxonomy
Topics: Writing and Handwriting Education · Educational Strategies and Epistemologies · Mental Health via Writing
