TL;DR
This study compares human and large language model (LLM) proofreading of the same second language (L2) writing, analyzing the lexical and syntactic changes each makes. Both improve bigram lexical features, which may aid coherence, while LLMs rephrase more extensively and use more diverse vocabulary.
Contribution
The paper compares the effects of human and LLM proofreading on identical L2 texts, highlighting the generative nature of LLM interventions and their consistency across models.
Findings
Both human and LLM proofreading improve bigram lexical features.
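LLM proofreading uses more diverse and sophisticated vocabulary and adds more adjective modifiers to noun phrases.
Proofreading outcomes are highly consistent in major lexical and syntactic features across the three LLMs tested (ChatGPT-4o, Llama3.1-8b, Deepseek-r1-8b).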
Abstract
This study examines the lexical and syntactic interventions of human and LLM proofreading aimed at improving overall intelligibility in identical second language writings, and evaluates the consistency of outcomes across three LLMs (ChatGPT-4o, Llama3.1-8b, Deepseek-r1-8b). Findings show that both human and LLM proofreading enhance bigram lexical features, which may contribute to better coherence and contextual connectedness between adjacent words. However, LLM proofreading exhibits a more generative approach, extensively reworking vocabulary and sentence structures, such as employing more diverse and sophisticated vocabulary and incorporating a greater number of adjective modifiers in noun phrases. The proofreading outcomes are highly consistent in major lexical and syntactic features across the three models.
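To make "vocabulary diversity" and "bigram lexical features" concrete, here is a minimal Python sketch of two common proxies: type-token ratio and the mean pointwise mutual information (PMI) of adjacent word pairs, estimated from a reference corpus. The abstract does not say which measures or tools the authors used, so the function names, the crude tokenizer, and the choice of these two metrics are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into word tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(tokens):
    """Unique tokens / total tokens: a rough proxy for vocabulary diversity."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def bigrams(tokens):
    """Adjacent word pairs, in order of appearance."""
    return list(zip(tokens, tokens[1:]))


def mean_bigram_pmi(tokens, ref_tokens):
    """Mean PMI of the text's bigrams, with probabilities estimated from a
    reference corpus. Bigrams unseen in the reference are skipped.
    Hypothetical helper, not the paper's measure."""
    uni = Counter(ref_tokens)
    bi = Counter(bigrams(ref_tokens))
    n_uni = sum(uni.values())
    n_bi = sum(bi.values())
    scores = []
    for w1, w2 in bigrams(tokens):
        if bi[(w1, w2)] == 0:
            continue  # no evidence for this pair in the reference corpus
        p_pair = bi[(w1, w2)] / n_bi
        p1 = uni[w1] / n_uni
        p2 = uni[w2] / n_uni
        scores.append(math.log2(p_pair / (p1 * p2)))
    return sum(scores) / len(scores) if scores else 0.0
```

Under these assumptions, one would tokenize the original essay and each proofread version, then compare type_token_ratio and mean_bigram_pmi across versions. Type-token ratio is sensitive to text length, so the comparison is only meaningful for texts of similar size.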