A Proposal of Automatic Error Correction in Text

Wulfrano A. Luna-Ramírez; Carlos R. Jaimez-González

arXiv:2112.01846 · cs.CL · December 6, 2021

TL;DR

This paper presents an automatic error correction system for electronic texts, combining error detection, candidate correction generation, and selection, using linguistic and statistical methods for Spanish to improve text accuracy for various applications.
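The three stages named above (detection, candidate generation, selection) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy `LEXICON`, the function names, and the choice of Levenshtein distance as the similarity measure are all assumptions made for illustration.

```python
from typing import List

# Hypothetical toy lexicon; a real system would load a full Spanish dictionary.
LEXICON = {"la", "casa", "es", "grande", "cosa", "caso", "el", "gato"}


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def detect_errors(tokens: List[str]) -> List[int]:
    """Stage 1: flag positions whose token is not in the lexicon."""
    return [i for i, t in enumerate(tokens) if t.lower() not in LEXICON]


def generate_candidates(word: str, max_distance: int = 1) -> List[str]:
    """Stage 2: lexicon words within max_distance edits of the word."""
    return sorted(w for w in LEXICON
                  if edit_distance(word.lower(), w) <= max_distance)


def correct(tokens: List[str]) -> List[str]:
    """Stage 3 (simplified): replace each flagged token with its nearest candidate."""
    out = list(tokens)
    for i in detect_errors(tokens):
        candidates = generate_candidates(tokens[i])
        if candidates:
            # Ties are broken alphabetically; this ignores sentence context.
            out[i] = min(candidates,
                         key=lambda w: (edit_distance(tokens[i].lower(), w), w))
    return out


print(correct(["la", "cassa", "es", "grande"]))
```

Selecting by edit distance alone cannot resolve a misspelling such as "csa", which is equally close to "casa" and "cosa"; that ambiguity is why a selection stage usually needs contextual evidence.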

Contribution

It introduces a novel approach integrating part-of-speech tagging, word similarity, dictionaries, and n-gram language models for Spanish text correction.
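As a rough illustration of how an n-gram language model can choose among correction candidates, the sketch below scores each candidate in context with a bigram model and add-one smoothing. The toy `CORPUS`, the smoothing scheme, and the function names are assumptions rather than details from the paper, and the part-of-speech component mentioned above is not modeled here.

```python
import math
from collections import Counter
from typing import List, Tuple

# Hypothetical toy training corpus; a real model would use a large Spanish corpus.
CORPUS = [
    "la casa es grande",
    "la casa es bonita",
    "la cosa es rara",
    "el caso es raro",
]


def train_bigrams(sentences: List[str]) -> Tuple[Counter, Counter]:
    """Count unigrams and bigrams, padding each sentence with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(tokens: List[str], unigrams: Counter, bigrams: Counter) -> float:
    """Sentence log-probability under a bigram model with add-one smoothing."""
    vocab_size = len(unigrams)
    padded = ["<s>"] + tokens + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))
        for a, b in zip(padded, padded[1:])
    )


def select(tokens: List[str], position: int, candidates: List[str],
           unigrams: Counter, bigrams: Counter) -> str:
    """Return the candidate that makes the whole sentence most probable."""
    def score(candidate: str) -> float:
        trial = tokens[:position] + [candidate] + tokens[position + 1:]
        return log_prob(trial, unigrams, bigrams)
    return max(candidates, key=score)


unigrams, bigrams = train_bigrams(CORPUS)
print(select(["la", "csa", "es", "grande"], 1, ["casa", "cosa"], unigrams, bigrams))
print(select(["el", "cso", "es", "raro"], 1, ["casa", "caso"], unigrams, bigrams))
```

In this toy setting the same pair of near-identical candidates is resolved differently depending on the neighboring words, which is the kind of evidence a dictionary lookup alone cannot provide.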

Findings

01

Effective error detection and correction in Spanish texts

02

Improved accuracy over baseline methods

03

Potential applications in OCR and text processing

Abstract

The amount of information stored in electronic media grows daily. Much of it is produced by typing, such as the huge volume of information gathered from Web 2.0 sites, or by scanning and processing with Optical Character Recognition (OCR) software, as with the texts of libraries and government offices. Both processes introduce errors into texts, which makes it difficult to use the data for purposes other than reading, that is, for processing by other applications such as e-learning, language learning, electronic tutorials, data mining, information retrieval, and even more specialized systems such as typhlological software (applications oriented to blind people, like automatic reading), where the text should be as error-free as possible to ease the text-to-speech task, among others. This paper presents an application of automatic recognition and…

Taxonomy

Topics: Natural Language Processing Techniques · Text Readability and Simplification