DECOR: Improving Coherence in L2 English Writing with a Novel Benchmark for Incoherence Detection, Reasoning, and Rewriting
Xuanming Zhang, Anthony Diaz, Zixun Chen, Qingyang Wu, Kun Qian, Erik Voss, Zhou Yu

TL;DR
DECOR introduces a new benchmark dataset for detecting, explaining, and rewriting incoherent sentences in L2 English writing, aiming to enhance automated coherence assessment and correction for language learners.
Contribution
The paper presents DECOR, the first coherence assessment dataset tailored for L2 English, and demonstrates fine-tuned models that improve incoherence detection and rewriting.
Findings
Incorporating reasons for incoherence improves rewrite quality.
Fine-tuned models outperform the baseline in automatic and human evaluations.
The DECOR dataset enables targeted coherence correction in L2 writing.
Abstract
Coherence in writing, an aspect that second-language (L2) English learners often struggle with, is crucial in assessing L2 English writing. Existing automated writing evaluation systems primarily use basic surface linguistic features to detect coherence in writing. However, little effort has been made to correct the detected incoherence, which could significantly benefit L2 language learners seeking to improve their writing. To bridge this gap, we introduce DECOR, a novel benchmark that includes expert annotations for detecting incoherence in L2 English writing, identifying the underlying reasons, and rewriting the incoherent sentences. To our knowledge, DECOR is the first coherence assessment dataset specifically designed for improving L2 English writing, featuring pairs of original incoherent sentences alongside their expert-rewritten counterparts. Additionally, we fine-tuned models…
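To make the three tasks in the abstract concrete (detection, reasoning, rewriting), here is a minimal sketch of how one annotated example might be represented. The class name, field names, and sample sentences are all hypothetical and chosen only for illustration; they are not taken from the DECOR release, whose actual schema may differ.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class CoherenceExample:
    """Hypothetical record layout; not the actual DECOR schema."""
    context: str                   # text preceding the sentence being judged
    sentence: str                  # the sentence under assessment
    is_coherent: bool              # detection label
    reason: Optional[str] = None   # why the sentence is incoherent, if it is
    rewrite: Optional[str] = None  # expert rewrite of an incoherent sentence


def rewrite_pairs(examples: List[CoherenceExample]) -> List[Tuple[str, str]]:
    """Collect (original, rewrite) pairs for incoherent sentences only."""
    return [
        (ex.sentence, ex.rewrite)
        for ex in examples
        if not ex.is_coherent and ex.rewrite is not None
    ]


# Toy data invented for this sketch, not drawn from the dataset.
examples = [
    CoherenceExample(
        context="I moved to the city last year.",
        sentence="It is much louder than my hometown.",
        is_coherent=True,
    ),
    CoherenceExample(
        context="I moved to the city last year.",
        sentence="However, bananas are yellow.",
        is_coherent=False,
        reason="The sentence is unrelated to the preceding context.",
        rewrite="However, I still miss the quiet of my hometown.",
    ),
]

pairs = rewrite_pairs(examples)
```

Under these assumptions, `pairs` holds only the incoherent sentence together with its rewrite, which mirrors the original/rewritten sentence pairs the abstract describes.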
Taxonomy
Topics: Text Readability and Simplification
