Annotation and Classification of Sentence-level Revision Improvement

Tazin Afrin; Diane Litman

arXiv:1909.05309·cs.CL·September 13, 2019

TL;DR

This paper introduces an annotated corpus of between-draft revisions of student argumentative essays, shows that a machine learning model can be trained to predict whether a revision improves the essay, and finds that blending expert and non-expert revision data improves model performance.

Contribution

The paper presents a corpus of student essay revisions annotated for whether each revision improves essay quality, and a machine learning approach that trains on both expert and non-expert revisions to predict revision improvement.

Findings

01. Blended expert and non-expert revisions improve model performance

02. Expert data is crucial for predicting low-quality revisions

03. The corpus enables future research on revision quality assessment

Abstract

Studies of writing revisions rarely focus on revision quality. To address this issue, we introduce a corpus of between-draft revisions of student argumentative essays, annotated as to whether each revision improves essay quality. We demonstrate a potential usage of our annotations by developing a machine learning model to predict revision improvement. With the goal of expanding training data, we also extract revisions from a dataset edited by expert proofreaders. Our results indicate that blending expert and non-expert revisions increases model performance, with expert data particularly important for predicting low-quality revisions.
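The blending idea in the abstract can be illustrated with a small sketch. This is not the authors' model or feature set, neither of which is described on this page. It assumes a toy nearest-centroid classifier over two hand-picked features (change in word count and word overlap), and every name here (`Revision`, `features`, `blend`, `train_centroids`, `predict`) as well as the example sentences are hypothetical and invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Revision:
    original: str   # sentence before revision
    revised: str    # sentence after revision
    improved: bool  # annotation: does the revision improve the essay?
    source: str     # hypothetical tag: "student" or "expert"


def features(original: str, revised: str) -> Tuple[float, float]:
    """Toy features (assumed, not from the paper): word-count change and Jaccard overlap."""
    a, b = original.lower().split(), revised.lower().split()
    union = set(a) | set(b)
    overlap = len(set(a) & set(b)) / len(union) if union else 1.0
    return (float(len(b) - len(a)), overlap)


def blend(student: List[Revision], expert: List[Revision]) -> List[Revision]:
    """Combine non-expert (student) and expert revisions into one training set."""
    return list(student) + list(expert)


def train_centroids(train: List[Revision]) -> Dict[bool, Tuple[float, float]]:
    """Average the feature vectors of each label to get one centroid per class."""
    sums = {True: [0.0, 0.0, 0], False: [0.0, 0.0, 0]}
    for rev in train:
        f = features(rev.original, rev.revised)
        s = sums[rev.improved]
        s[0] += f[0]
        s[1] += f[1]
        s[2] += 1
    return {label: (s[0] / s[2], s[1] / s[2]) for label, s in sums.items() if s[2] > 0}


def predict(centroids: Dict[bool, Tuple[float, float]], original: str, revised: str) -> bool:
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    f = features(original, revised)
    return min(
        centroids,
        key=lambda label: (f[0] - centroids[label][0]) ** 2 + (f[1] - centroids[label][1]) ** 2,
    )


# Invented toy data, for illustration only.
student = [
    Revision("The author is right.",
             "The author is right because the data support the claim.", True, "student"),
    Revision("This shows poverty can be reduced through education.",
             "This shows poverty.", False, "student"),
]
expert = [
    Revision("He go to school every day.", "He goes to school every day.", True, "expert"),
    Revision("The results are clear and convincing.", "The results is clear.", False, "expert"),
]

blended = blend(student, expert)
centroids = train_centroids(blended)
```

With this toy data, `predict(centroids, "Good point.", "Good point, and the evidence supports it well.")` returns `True`, because the large increase in word count sits closer to the "improved" centroid. A real system would replace the two toy features with a richer representation and compare models trained with and without the expert portion.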
