Towards Lithuanian grammatical error correction

Lukas Stankevičius; Mantas Lukoševičius

arXiv:2203.09963·cs.CL·March 21, 2022

TL;DR

This paper develops a Lithuanian grammatical error correction model using transformer architectures, reaching F₀.₅ = 0.92, and releases the best trained model and its code in an open-source repository.

Contribution

It introduces the first transformer-based grammatical error correction model for Lithuanian, comparing subword and byte-level approaches.
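
To make the subword versus byte-level distinction concrete, the following minimal sketch shows what a byte-level input looks like for a Lithuanian word. It assumes plain UTF-8 encoding with no special-token offsets; the helper names `to_byte_ids` and `from_byte_ids` are hypothetical and are not taken from the paper's repository.

```python
# Illustrative sketch only: raw UTF-8 bytes as model input IDs.
# Assumption: no special tokens or ID offsets, which real byte-level
# models typically add on top of the raw byte values.

def to_byte_ids(text: str) -> list[int]:
    """Map text to a sequence of integers in the range 0..255."""
    return list(text.encode("utf-8"))


def from_byte_ids(ids: list[int]) -> str:
    """Invert to_byte_ids by decoding the bytes back to text."""
    return bytes(ids).decode("utf-8")


word = "Ačiū"  # Lithuanian for "thank you"
ids = to_byte_ids(word)

# Letters with diacritics (č, ū) take two bytes each in UTF-8,
# so the byte sequence is longer than the character sequence.
print(len(word), len(ids))
print(from_byte_ids(ids) == word)
```

A byte-level model never meets an out-of-vocabulary symbol, since every string reduces to at most 256 distinct values, at the cost of longer input sequences. A subword model instead relies on a learned vocabulary of word pieces, which gives shorter sequences but may split misspelled words unpredictably.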

Findings

01 Achieved F₀.₅ = 0.92 on Lithuanian error correction

02 Compared subword and byte-level transformer models

03 Provided open-source code for the model

Abstract

Everyone wants to write beautiful and correct text, yet a lack of language skills or experience, or hasty typing, can result in errors. Employing recent advances in transformer architectures, we construct a grammatical error correction model for Lithuanian, a language rich in archaic features. We compare subword and byte-level approaches and share our best trained model, which achieves $F_{0.5} = 0.92$, together with the accompanying code in an online open-source repository.
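
For readers unfamiliar with the reported metric, F₀.₅ is the F-beta score with beta = 0.5, which weights precision more heavily than recall: F_beta = (1 + beta²) · P · R / (beta² · P + R). The sketch below computes it from counts of true positives, false positives, and false negatives. The function name `f_beta` and the example counts are hypothetical, and how the paper counts correct and incorrect edits is not stated in this listing.

```python
# Illustrative sketch of the F-beta score; beta = 0.5 favours precision.
# The counts passed in below are made up for demonstration and are
# not results from the paper.

def f_beta(tp: int, fp: int, fn: int, beta: float = 0.5) -> float:
    """Compute F-beta from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical system: every proposed correction is right (precision 1.0),
# but only half of the errors are found (recall 0.5).
score_f05 = f_beta(tp=1, fp=0, fn=1, beta=0.5)
score_f1 = f_beta(tp=1, fp=0, fn=1, beta=1.0)
print(score_f05, score_f1)
```

With precision higher than recall, F₀.₅ comes out above F₁. Favouring precision is a common choice in grammatical error correction, where a wrong "correction" is usually considered worse than a missed error.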

Code & Models

Repositories

lukasstankevicius/towards-lithuanian-grammatical-error-correction (official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Network Packet Processing and Optimization