Exploring the Feasibility of Multilingual Grammatical Error Correction with a Single LLM up to 9B parameters: A Comparative Study of 17 Models

Dawid Wisniewski; Antoni Solarski; Artur Nowakowski

arXiv:2505.06004·cs.CL·May 12, 2025

TL;DR

This study evaluates 17 multilingual language models, including Gemma 9B, for grammatical error correction across English, German, Italian, and Swedish, highlighting the best performers and common issues.

Contribution

It provides a comparative analysis of 17 models for multilingual grammatical error correction, identifies Gemma 9B as the top-performing model, and analyzes common challenges.

Findings

01. Gemma 9B outperforms the other models in all four languages.

02. Six models improve grammatical correctness in all four languages.

03. Models tend to make small, targeted corrections.

Abstract

Recent language models can successfully solve various language-related tasks, and many understand inputs stated in different languages. In this paper, we explore the performance of 17 popular models used to correct grammatical issues in texts stated in English, German, Italian, and Swedish when using a single model to correct texts in all those languages. We analyze the outputs generated by these models, focusing on decreasing the number of grammatical errors while keeping the changes small. The conclusions drawn help us understand what problems occur among those models and which models can be recommended for multilingual grammatical error correction tasks. We list six models that improve grammatical correctness in all four languages and show that Gemma 9B is currently the best performing one for the languages considered.
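
The abstract's goal of "keeping the changes small" can be made concrete with a simple word-level measure of how much a correction altered its input. The sketch below is only an illustrative proxy built on Python's standard difflib; it is not the metric used in the paper, and the function name word_edit_ratio and the example sentences are hypothetical.

```python
import difflib


def word_edit_ratio(source: str, corrected: str) -> float:
    """Return a rough share of changed words (0.0 means identical).

    Illustrative proxy only, not the paper's evaluation metric.
    """
    src_words = source.split()
    cor_words = corrected.split()
    # Compare whitespace-separated tokens instead of characters.
    matcher = difflib.SequenceMatcher(a=src_words, b=cor_words, autojunk=False)
    # ratio() is 2 * matches / total tokens; invert it to get a change score.
    return 1.0 - matcher.ratio()


# Hypothetical example: two of seven tokens are changed.
source = "She go to school every days ."
corrected = "She goes to school every day ."
ratio = word_edit_ratio(source, corrected)
print(f"edit ratio: {ratio:.3f}")
```

A low score means the model made small, targeted corrections; a high score suggests it rewrote the sentence rather than correcting it.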

Code & Models

Repositories

laniqo-public/grammar-data-mtsummit25 (official)

Taxonomy

Topics: Natural Language Processing Techniques · Text Readability and Simplification · Topic Modeling