CSynGEC: Incorporating Constituent-based Syntax for Grammatical Error Correction with a Tailored GEC-Oriented Parser
Yue Zhang, Zhenghua Li

TL;DR
CSynGEC introduces a novel constituent-based syntax approach for grammatical error correction, leveraging a tailored parser and combining syntax formalism types to enhance correction performance.
Contribution
This work extends constituent-based syntax schemes for GEC, develops a GEC-oriented constituency parser, and explores combining syntax formalism types for improved correction accuracy.
Findings
Significant improvements over strong baselines in GEC tasks.
Intra-model syntax combination enhances recall.
Inter-model syntax combination improves precision.
Abstract
Recently, Zhang et al. (2022) propose a syntax-aware grammatical error correction (GEC) approach, named SynGEC, showing that incorporating tailored dependency-based syntax of the input sentence is quite beneficial to GEC. This work considers another mainstream syntax formalism, i.e., constituent-based syntax. By drawing on the successful experience of SynGEC, we first propose an extended constituent-based syntax scheme to accommodate errors in ungrammatical sentences. Then, we automatically obtain constituency trees of ungrammatical sentences to train a GEC-oriented constituency parser by using parallel GEC data as a pivot. For syntax encoding, we employ the graph convolutional network (GCN). Experimental results show that our method, named CSynGEC, yields substantial improvements over strong baselines. Moreover, we investigate the integration of constituent-based and dependency-based…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsNatural Language Processing Techniques · Text Readability and Simplification · Topic Modeling
