On the Contribution of Discourse Structure on Text Complexity Assessment
Elnaz Davoodi, Leila Kosseim

TL;DR
This study examines how discourse features, especially coherence features, contribute to text complexity assessment. On two data sets, coherence features correlate more strongly with text complexity than the other feature types, and the most discriminating feature found by feature selection is a coherence feature.
Contribution
The paper compares coherence, cohesion, surface, lexical, and syntactic features for text complexity assessment on two data sets, built from the Penn Discourse Treebank and the Simple English Wikipedia corpora, and finds that coherence features play the leading role.
Findings
Coherence features are more correlated with text complexity than other features.
Under feature selection, the single most discriminating feature is a coherence feature.
Both findings hold on both data sets.
Abstract
This paper investigates the influence of discourse features on text complexity assessment. To do so, we created two data sets based on the Penn Discourse Treebank and the Simple English Wikipedia corpora and compared the influence of coherence, cohesion, surface, lexical and syntactic features to assess text complexity. Results show that with both data sets coherence features are more correlated with text complexity than the other types of features. In addition, feature selection revealed that with both data sets the most discriminating feature is a coherence feature.
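To make the kind of comparison described in the abstract concrete, here is a minimal sketch of ranking features by how strongly they correlate with a binary complexity label. It is an illustration only, not the authors' pipeline: the choice of Pearson correlation, the feature names, and every number in the toy data are assumptions invented for this example and say nothing about the paper's actual results.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column has no defined correlation; treat it as 0 here.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(features, labels):
    """Return (name, r) pairs sorted by absolute correlation, strongest first."""
    scores = [(name, pearson(values, labels)) for name, values in features.items()]
    return sorted(scores, key=lambda item: abs(item[1]), reverse=True)


# Made-up toy data: 1 = complex text, 0 = simple text.
labels = [1, 1, 1, 0, 0, 0]

# Hypothetical feature names and values, one value per text.
features = {
    "explicit_relation_ratio": [0.8, 0.7, 0.9, 0.2, 0.3, 0.1],  # stand-in coherence feature
    "avg_word_length": [5.1, 4.8, 5.0, 4.9, 4.7, 5.2],          # stand-in surface feature
    "parse_tree_depth": [9, 7, 8, 6, 7, 5],                     # stand-in syntactic feature
}

ranking = rank_features(features, labels)
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

Sorting by absolute value keeps strongly negative correlations near the top, since a feature that falls as complexity rises is as informative as one that rises with it.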
