Bangla Grammatical Error Detection Leveraging Transformer-based Token Classification

Shayekh Bin Islam; Ridwanul Hasan Tanvir; Sihat Afnan

arXiv:2411.08344·cs.CL·November 14, 2024

TL;DR

This paper presents a transformer-based token classification approach for Bangla grammatical error detection, combining model outputs with rule-based post-processing to improve accuracy in a low-resource language.

Contribution

The paper applies transformer-based token classification models to Bangla grammatical error detection and combines their outputs with rule-based post-processing to improve performance.

Findings

01 Achieved a Levenshtein distance score of 1.04

02 Evaluated on over 25,000 texts from various sources

03 Provided detailed analysis of system components
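
The Levenshtein distance score in the findings is an edit-distance measure, so lower is better. The page does not spell out how the score is computed. The sketch below assumes it is the character-level edit distance between the predicted annotated text and the reference annotated text, averaged over all texts. The function names `levenshtein` and `mean_levenshtein` are illustrative and do not come from the paper.

```python
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance; insert, delete, and substitute each cost 1."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ch_a
                    current[j - 1] + 1,      # insert ch_b
                    previous[j - 1] + cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


def mean_levenshtein(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Average edit distance over paired texts (assumed aggregation, not confirmed by the source)."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not predictions:
        raise ValueError("at least one text pair is required")
    total = sum(levenshtein(p, r) for p, r in zip(predictions, references))
    return total / len(predictions)
```

Under this reading, a score of 1.04 would mean that a predicted annotation differs from its reference by about one character edit on average.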

Abstract

Bangla is the seventh most spoken language in the world by total number of speakers, yet the development of an automated grammar checker for it remains an understudied problem. Bangla grammatical error detection is the task of detecting substrings of a Bangla text that contain grammatical, punctuation, or spelling errors, which is crucial for developing an automated Bangla typing assistant. Our approach frames the task as a token classification problem and uses state-of-the-art transformer-based models. We then combine the outputs of these models and apply rule-based post-processing to generate a more reliable and comprehensive result. Our system is evaluated on a dataset of over 25,000 texts from various sources. Our best model achieves a Levenshtein distance score of 1.04. Finally, we provide a detailed analysis of different components of…
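
To make the token classification framing concrete, the sketch below turns per-token error labels into marked substrings. It is a minimal illustration under assumptions that the abstract does not confirm: tokens are whitespace-separated words, labels are binary (1 for erroneous, 0 for correct), and error spans are wrapped in a hypothetical `$` marker. The function name `mark_errors` is also illustrative. A real pipeline would additionally have to align the subword predictions of a transformer back to words, which is omitted here.

```python
from typing import List, Sequence


def mark_errors(tokens: Sequence[str], labels: Sequence[int], marker: str = "$") -> str:
    """Wrap each maximal run of error-labelled tokens in `marker` (hypothetical output format)."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")

    pieces: List[str] = []
    run: List[str] = []  # consecutive tokens currently labelled as erroneous

    def flush() -> None:
        # Emit the pending error run as one marked substring.
        if run:
            pieces.append(marker + " ".join(run) + marker)
            run.clear()

    for token, label in zip(tokens, labels):
        if label == 1:
            run.append(token)
        else:
            flush()
            pieces.append(token)
    flush()  # close a run that reaches the end of the text
    return " ".join(pieces)
```

For example, `mark_errors(["w1", "w2", "w3", "w4"], [0, 1, 1, 0])` returns `"w1 $w2 w3$ w4"`, merging adjacent error tokens into a single marked substring.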

Taxonomy

Topics: Educational Technology and Assessment · Handwritten Text Recognition Techniques · Hand Gesture Recognition Systems