A Machine Learning Approach to Persian Text Readability Assessment Using a Crowdsourced Dataset

Hamid Mohammadi; Seyed Hossein Khasteh

arXiv:1810.06639·cs.CL·April 23, 2020

TL;DR

This paper introduces the first Persian dataset for text readability assessment and the first machine learning model for the task. The authors report that the model is accurate and point to applications in education and healthcare.

Contribution

The paper presents the first Persian dataset and the first machine learning model for text readability assessment, addressing a gap left by the very limited prior research on Persian.

Findings

01. The model achieved high accuracy in Persian text readability assessment.

02. The first dataset for Persian text readability assessment was created.

03. The results have potential applications in medical and educational texts.

Abstract

An automated approach to text readability assessment is essential for any language and can be a powerful tool for improving the understandability of texts written and published in that language. However, Persian, which is spoken by over 110 million people, lacks such a system. Unlike languages such as English, French, and Chinese, Persian has been the subject of very few research studies aimed at building an accurate and reliable text readability assessment system. In the present research, the first Persian dataset for text readability assessment was gathered, and the first machine learning model for Persian text readability assessment was introduced. The experiments showed that this model was accurate and could assess the readability of Persian texts with a high degree of confidence. The results of this study can be used in a number of applications such…
