SynGEC: Syntax-Enhanced Grammatical Error Correction with a Tailored GEC-Oriented Parser
Yue Zhang, Bo Zhang, Zhenghua Li, Zuyi Bao, Chen Li and, Min Zhang

TL;DR
SynGEC introduces a syntax-enhanced GEC model that uses a tailored parser trained on parallel data to incorporate syntactic information, significantly improving grammatical error correction performance.
Contribution
The paper develops a GEC-oriented parser trained on parallel data and integrates dependency syntax into GEC models, addressing parser unreliability on ungrammatical sentences.
Findings
Outperforms strong baselines on English and Chinese GEC datasets
Effectively incorporates syntactic information into GEC models
Achieves competitive results in grammatical error correction
Abstract
This work proposes a syntax-enhanced grammatical error correction (GEC) approach named SynGEC that effectively incorporates dependency syntactic information into the encoder part of GEC models. The key challenge for this idea is that off-the-shelf parsers are unreliable when processing ungrammatical sentences. To confront this challenge, we propose to build a tailored GEC-oriented parser (GOPar) using parallel GEC training data as a pivot. First, we design an extended syntax representation scheme that allows us to represent both grammatical errors and syntax in a unified tree structure. Then, we obtain parse trees of the source incorrect sentences by projecting trees of the target correct sentences. Finally, we train GOPar with such projected trees. For GEC, we employ the graph convolution network to encode source-side syntactic information produced by GOPar, and fuse them with the…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsNatural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
MethodsMulti-Head Attention · Attention Is All You Need · Linear Layer · Dense Connections · Softmax · Position-Wise Feed-Forward Layer · Adam · Label Smoothing · Absolute Position Encodings · Layer Normalization
