Is ChatGPT a Highly Fluent Grammatical Error Correction System? A Comprehensive Evaluation

Tao Fang; Shu Yang; Kaixin Lan; Derek F. Wong; Jinpeng Hu; Lidia S. Chao; Yue Zhang

arXiv:2304.01746·cs.CL·April 5, 2023·56 cites

Open Access

TL;DR

This paper evaluates ChatGPT's capabilities in grammatical error correction across multiple languages and settings, revealing strengths in error detection and fluency, but also limitations in certain complex error types.

Contribution

It provides a comprehensive evaluation of ChatGPT's GEC performance using zero-shot and few-shot prompting across multilingual datasets, highlighting its strengths and weaknesses.
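The difference between zero-shot and few-shot prompting can be sketched in a few lines. This is a minimal illustration only: the function name `build_gec_prompt`, the instruction wording, and the `Input:`/`Output:` layout are assumptions made for this example and are not the prompts used in the paper.

```python
from typing import List, Sequence, Tuple


def build_gec_prompt(sentence: str,
                     examples: Sequence[Tuple[str, str]] = ()) -> str:
    """Build a prompt for grammatical error correction.

    With no examples the prompt is zero-shot; with (source, corrected)
    pairs it becomes few-shot. The wording below is hypothetical.
    """
    lines: List[str] = [
        "Correct the grammatical errors in the sentence. "
        "Think step by step, then give the corrected sentence."
    ]
    # Each demonstration pair is shown in the same layout as the query.
    for source, corrected in examples:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {corrected}")
    # The sentence to correct comes last, with the output left open.
    lines.append(f"Input: {sentence}")
    lines.append("Output:")
    return "\n".join(lines)


zero_shot = build_gec_prompt("She go to school yesterday .")
few_shot = build_gec_prompt(
    "She go to school yesterday .",
    examples=[("He have two cat .", "He has two cats .")],
)
```

The resulting string would be sent to a chat model as the user message; the only difference between the two settings is whether demonstration pairs precede the query.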

Findings

01 ChatGPT demonstrates excellent error detection and fluency correction.

02 It performs well in multilingual and low-resource GEC tasks.

03 It struggles with agreement, coreference, tense, and cross-sentence errors.

Abstract

ChatGPT, a large-scale language model based on the advanced GPT-3.5 architecture, has shown remarkable potential in various Natural Language Processing (NLP) tasks. However, there is currently a dearth of comprehensive study exploring its potential in the area of Grammatical Error Correction (GEC). To showcase its capabilities in GEC, we design zero-shot chain-of-thought (CoT) and few-shot CoT settings using in-context learning for ChatGPT. Our evaluation involves assessing ChatGPT's performance on five official test sets in three different languages, along with three document-level GEC test sets in English. Our experimental results and human evaluations demonstrate that ChatGPT has excellent error detection capabilities and can freely correct errors to make the corrected sentences very fluent, possibly due to its over-correction tendencies and not adhering to the principle of minimal…
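One rough way to see the over-correction tendency mentioned in the abstract is to count how many token-level edits a correction makes relative to its source. The sketch below uses Python's standard `difflib`; the function name `count_token_edits` and the example sentences are hypothetical, and this is not the evaluation metric used in the paper.

```python
from difflib import SequenceMatcher


def count_token_edits(source: str, hypothesis: str) -> int:
    """Count contiguous edit spans between two whitespace-tokenized sentences.

    Each replace, insert, or delete span reported by SequenceMatcher
    counts as one edit; unchanged spans are ignored.
    """
    src_tokens = source.split()
    hyp_tokens = hypothesis.split()
    matcher = SequenceMatcher(a=src_tokens, b=hyp_tokens, autojunk=False)
    return sum(1 for tag, *_ in matcher.get_opcodes() if tag != "equal")


source = "She go to school yesterday ."
minimal = "She went to school yesterday ."    # changes only the verb
fluent = "Yesterday , she went to school ."   # also reorders the sentence

minimal_edits = count_token_edits(source, minimal)
fluent_edits = count_token_edits(source, fluent)
```

Both outputs are grammatical, but the reordered version touches more of the source, which is the kind of behavior that reference-based GEC scoring tends to penalize.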

Taxonomy

Topics: Topic Modeling · Text Readability and Simplification · Artificial Intelligence in Healthcare and Education

Methods: Multi-Head Attention · Attention Is All You Need · Test · Cosine Annealing · Dropout · Dense Connections · Weight Decay · Adam · Linear Layer · Layer Normalization