Comparison of Grammatical Error Correction Using Back-Translation Models
Aomi Koyama, Kengo Hotate, Masahiro Kaneko, Mamoru Komachi

TL;DR
This paper compares how different back-translation models affect grammatical error correction, showing that combining pseudo data from various models enhances correction performance across error types.
Contribution
It systematically analyzes the correction tendencies of GEC models trained on pseudo data from Transformer, CNN, and LSTM back-translation models, and demonstrates the benefits of combining these data sources.
Findings
Correction tendencies for each error type vary with the BT model architecture (Transformer, CNN, or LSTM).
Combining pseudo data generated by different BT models improves F0.5 scores.
Different BT models complement each other effectively.
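For readers unfamiliar with the metric named above: F0.5 is the F-measure with beta = 0.5, which weights precision more heavily than recall. The sketch below is a minimal illustration of the standard F-beta formula computed from edit-level counts. The function name and the counts in the usage lines are hypothetical and are not taken from the paper.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """F-beta from true-positive, false-positive, and false-negative counts.

    With beta < 1 (here 0.5), precision counts more than recall.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts, for illustration only.
precise_system = f_beta(tp=40, fp=10, fn=60)   # high precision, low recall
greedy_system = f_beta(tp=60, fp=90, fn=40)    # low precision, higher recall
print(precise_system > greedy_system)
```

Because beta is below 1, a system that proposes fewer but more accurate edits scores higher than one that proposes many edits with low accuracy.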
Abstract
Grammatical error correction (GEC) suffers from a lack of sufficient parallel data. Therefore, GEC studies have developed various methods to generate pseudo data, which comprise pairs of grammatical and artificially produced ungrammatical sentences. Currently, a mainstream approach to generate pseudo data is back-translation (BT). Most previous GEC studies using BT have employed the same architecture for both GEC and BT models. However, GEC models have different correction tendencies depending on their architectures. Thus, in this study, we compare the correction tendencies of the GEC models trained on pseudo data generated by different BT models, namely, Transformer, CNN, and LSTM. The results confirm that the correction tendencies for each error type are different for every BT model. Additionally, we examine the correction tendencies when using a combination of pseudo data generated…
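The idea of combining pseudo data from several BT models can be sketched as merging the (ungrammatical, grammatical) sentence pairs that each model produced into one training set. This is a minimal sketch under stated assumptions, not the authors' pipeline: the function name, the toy sentences, and the choice to drop exact duplicate pairs are all hypothetical.

```python
from typing import Dict, List, Tuple

# (pseudo ungrammatical source, original grammatical target)
Pair = Tuple[str, str]


def combine_pseudo_data(per_model: Dict[str, List[Pair]]) -> List[Pair]:
    """Concatenate pairs from every BT model into one list.

    Assumption (not from the paper): exact duplicate pairs are kept once.
    Models are visited in sorted name order so the output is deterministic.
    """
    seen = set()
    combined: List[Pair] = []
    for model_name in sorted(per_model):
        for pair in per_model[model_name]:
            if pair not in seen:
                seen.add(pair)
                combined.append(pair)
    return combined


# Toy, hand-written pairs standing in for BT model output (hypothetical).
toy = {
    "transformer": [("He go to school .", "He goes to school .")],
    "cnn": [("He goes to the school .", "He goes to school .")],
    "lstm": [("He go to school .", "He goes to school .")],
}
combined = combine_pseudo_data(toy)
print(len(combined))
```

In this toy input the LSTM and Transformer entries are identical, so the merged set keeps one copy and still contains the differently corrupted pair from the CNN entry.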
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Softmax · Label Smoothing · Layer Normalization · Residual Connection · Byte Pair Encoding
