Understanding Teacher Revisions of Large Language Model-Generated Feedback
Conrad Borchers, Luiz Rodrigues, Newarney Torrezão da Costa, Cleon Xavier, Rafael Ferreira Mello

TL;DR
This study investigates how teachers revise AI-generated feedback, revealing patterns in editing behavior, predictability of revisions, and shifts in pedagogical focus, to improve AI classroom tools.
Contribution
It provides empirical insights into teacher revision practices of AI feedback, models to predict revisions, and analysis of pedagogical shifts, informing better feedback system design.
Findings
Teachers accept AI feedback without modification in about 80% of cases.
Machine learning models predict revisions with AUC=0.75.
Revisions often simplify feedback, shifting to concise, corrective forms.
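To make the AUC figure above concrete: AUC can be read as the probability that a model assigns a higher score to a randomly chosen revised feedback item than to a randomly chosen accepted one. The sketch below is an illustration only, not the authors' pipeline; the function name `roc_auc` and the toy labels and scores are hypothetical and were chosen so that the result happens to equal 0.75.

```python
def roc_auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of ROC AUC; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the paper):
# label 1 = teacher revised the AI feedback, 0 = accepted it unchanged.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.2]  # assumed model scores for "will be revised"

auc = roc_auc(labels, scores)
print(auc)  # 3 of the 4 revised/accepted pairs are ranked correctly -> 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, so 0.75 indicates that revisions are partly, but far from fully, predictable from the signal the model uses.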
Abstract
Large language models (LLMs) increasingly generate formative feedback for students, yet little is known about how teachers revise this feedback before it reaches learners. Teachers' revisions shape what students receive, making revision practices central to evaluating AI classroom tools. We analyze a dataset of 1,349 instances of AI-generated feedback and corresponding teacher-edited explanations from 117 teachers. We examine (i) textual characteristics associated with teacher revisions, (ii) whether revision decisions can be predicted from the AI feedback text, and (iii) how revisions change the pedagogical type of feedback delivered. First, we find that teachers accept AI feedback without modification in about 80% of cases, while the AI feedback that teachers do edit tends to be significantly longer and is subsequently shortened. Editing behavior varies substantially across teachers: about 50%…
