Optimizing Readability Using Genetic Algorithms

Jorge Martinez-Gil

arXiv:2301.00374 · cs.CL · April 15, 2024 · 1 citation

TL;DR

This paper introduces ORUGA, a genetic algorithm-based method to automatically enhance English text readability by optimizing factors like word choice while preserving content and structure.

Contribution

The study presents a novel multi-objective genetic algorithm approach for text readability optimization that maintains original content and form.

Findings

01 Successfully optimized readability across diverse texts
02 Preserved semantic content and syntactic structure
03 Demonstrated effectiveness through extensive testing

Abstract

This research presents ORUGA, a method that tries to automatically optimize the readability of any text in English. The core idea behind the method is that certain factors affect the readability of a text, some of which are quantifiable (number of words, syllables, presence or absence of adverbs, and so on). The nature of these factors allows us to implement a genetic learning strategy to replace some existing words with their most suitable synonyms to facilitate optimization. In addition, this research seeks to preserve both the original text's content and form through multi-objective optimization techniques. In this way, neither the text's syntactic structure nor the semantic content of the original message is significantly distorted. An exhaustive study on a substantial number and diversity of texts confirms that our method was able to optimize the degree of readability in all cases…
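The idea in the abstract can be illustrated with a minimal sketch. It is not the ORUGA implementation: the synonym table, the vowel-group syllable heuristic, the choice of Flesch Reading Ease as the readability score, and every function name below are assumptions made for illustration. The multi-objective aspect is approximated by a single scalarized fitness (readability minus a penalty per replaced word), a simplification that stands in for the paper's multi-objective techniques. Each word keeps its position, so sentence structure is untouched.

```python
import random
import re

# Hypothetical toy synonym table; a real system would draw candidates from a
# lexical resource. Words not listed here are never replaced.
SYNONYMS = {
    "utilize": ["use"],
    "commence": ["begin", "start"],
    "purchase": ["buy"],
    "assistance": ["help", "aid"],
}

def count_syllables(word):
    """Rough heuristic: number of vowel groups, with a minimum of one."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_reading_ease(words, n_sentences=1):
    """Flesch Reading Ease; higher scores mean easier text."""
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (len(words) / n_sentences) - 84.6 * (syllables / len(words))

def decode(words, genome):
    """Gene 0 keeps the original word; gene k > 0 selects its (k-1)-th synonym."""
    return [w if g == 0 else SYNONYMS[w][g - 1] for w, g in zip(words, genome)]

def fitness(words, genome, change_penalty=1.0):
    """Scalarized objective: readability minus a penalty per replaced word."""
    changed = sum(1 for g in genome if g != 0)
    return flesch_reading_ease(decode(words, genome)) - change_penalty * changed

def random_gene(word, rng):
    """Pick 'keep' (0) or one of the word's synonyms uniformly at random."""
    return rng.randint(0, len(SYNONYMS.get(word, [])))

def optimize(words, generations=40, pop_size=20, mutation_rate=0.2, seed=0):
    """Evolve synonym choices; expects a text of at least two words."""
    rng = random.Random(seed)
    population = [[random_gene(w, rng) for w in words] for _ in range(pop_size)]
    population[0] = [0] * len(words)  # the unchanged text is always a candidate
    for _ in range(generations):
        population.sort(key=lambda g: fitness(words, g), reverse=True)
        parents = population[: pop_size // 2]  # truncation selection keeps the best
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(words))  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [random_gene(w, rng) if rng.random() < mutation_rate else g
                     for w, g in zip(words, child)]
            children.append(child)
        population = parents + children
    best = max(population, key=lambda g: fitness(words, g))
    return decode(words, best)

if __name__ == "__main__":
    text = "we will commence to utilize your assistance".split()
    print(" ".join(optimize(text)))
```

Because the unchanged text is seeded into the population and the best half survives every generation, the returned text never scores below the original under this fitness. A closer analogue of a multi-objective method would keep the two objectives separate and select on Pareto dominance rather than on a weighted sum.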

Code & Models

Repositories

jorge-martinez-gil/oruga (Official)

Taxonomy

Topics: Text Readability and Simplification · Natural Language Processing Techniques · Topic Modeling