# Automatic Classification of the Complexity of Nonfiction Texts in Portuguese for Early School Years

**Authors:** Nathan Siegle Hartmann, Livia Cucatto, Danielle Brants, and Sandra Aluísio

arXiv: 1704.03013 · 2017-04-12

## TL;DR

This paper develops a classification scheme for Portuguese nonfiction texts aimed at early school readers. It reaches 52% accuracy on 5 grade levels and 74% on 3 levels, with the goal of supporting reading instruction tailored to each student's level.

## Contribution

It introduces a novel manual annotation scheme for classifying Portuguese texts into five grade levels, addressing a gap in non-English text complexity classification.

## Key findings

- 52% accuracy in 5-level classification
- 74% accuracy in 3-level classification
- Described by the authors as promising relative to the English state of the art (74% accuracy on 4 levels)
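
To make the two accuracy figures concrete, the following minimal sketch shows how the same predictions can be scored on a 5-level scale and again after merging levels into 3 groups. The label lists and the `FIVE_TO_THREE` mapping are hypothetical illustrations; this summary does not state how the paper groups its five levels, so the numbers below are not the paper's results.

```python
from typing import Dict, List, Sequence


def accuracy(gold: Sequence[int], predicted: Sequence[int]) -> float:
    """Fraction of items whose predicted level equals the gold level."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty sample")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def collapse(labels: Sequence[int], mapping: Dict[int, int]) -> List[int]:
    """Map fine-grained levels onto coarser groups."""
    return [mapping[label] for label in labels]


# Hypothetical grouping of five levels into three (an assumption,
# not the grouping used in the paper).
FIVE_TO_THREE = {1: 1, 2: 1, 3: 2, 4: 3, 5: 3}

# Toy gold and predicted levels, invented for illustration only.
gold = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
pred = [2, 2, 3, 5, 5, 1, 1, 2, 4, 4]

acc_five = accuracy(gold, pred)
acc_three = accuracy(collapse(gold, FIVE_TO_THREE), collapse(pred, FIVE_TO_THREE))

print(f"5-level accuracy: {acc_five:.2f}")
print(f"3-level accuracy: {acc_three:.2f}")
```

In this toy data most errors fall between neighboring levels, so merging levels turns them into correct predictions. That is one plausible reason a coarser scale scores higher, though the summary does not confirm it as the explanation for the paper's numbers.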

## Abstract

Recent research shows that most Brazilian students have serious problems with their reading skills. The full development of this skill is key to the academic and professional future of every citizen. Tools for classifying the complexity of reading materials for children aim to improve the quality of the teaching of reading and text comprehension. For English, Feng's work [11] is considered the state of the art in grade level prediction and achieved 74% accuracy in automatically classifying 4 levels of textual complexity for close school grades. There are no classifiers for nonfiction texts for close grades in Portuguese. In this article, we propose a scheme for the manual annotation of texts in 5 grade levels, which will be used for customized reading, to avoid a lack of interest among students who are more advanced in reading and to keep those who still need to make further progress from getting stuck. We obtained 52% accuracy in classifying texts into 5 levels and 74% in 3 levels. The results prove to be promising when compared to the state-of-the-art work.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1704.03013/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1704.03013/full.md

## References

41 references — full list in the complete paper: https://tomesphere.com/paper/1704.03013/full.md

---
Source: https://tomesphere.com/paper/1704.03013