Word frequency-rank relationship in tagged texts
A. Chacoma, D. H. Zanette

TL;DR
This study examines how the frequency-rank relationship differs among three grammatical classes (nouns, verbs, and others) in English literary texts, and finds statistically significant differences that may reflect grammatical function.
Contribution
It introduces an analysis of frequency-rank distributions for grammatical classes, highlighting their distinct patterns and linguistic implications.
Findings
Significant differences in frequency-rank relationships among grammatical classes
Frequency distributions reflect linguistic features of grammatical roles
Comparison with a null hypothesis of uniform distribution across the frequency-ranked vocabulary reveals statistically significant differences between classes
Abstract
We analyze the frequency-rank relationship in sub-vocabularies corresponding to three different grammatical classes (nouns, verbs, and others) in a collection of literary works in English, whose words have been automatically tagged according to their grammatical role. Comparing with a null hypothesis that assumes that words belonging to each class are uniformly distributed across the frequency-ranked vocabulary of the whole work, we disclose statistically significant differences between the three classes. These results point to the fact that frequency-rank relationships may reflect linguistic features associated with grammatical function.
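The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure. It assumes the text is already available as (word, tag) pairs, it uses the mean overall rank of a class as the test statistic (an illustrative choice, not taken from the paper), and it approximates the uniform null hypothesis by drawing random rank positions from the whole-work ranking. The function names `ranked_vocabulary`, `class_ranks`, and `null_mean_ranks` are hypothetical.

```python
import random
from collections import Counter


def ranked_vocabulary(tagged_tokens):
    """Return [(word, tag, count)] sorted by decreasing count.

    The position in the returned list, plus one, is the overall rank.
    Ties are broken alphabetically so the ranking is deterministic.
    """
    counts = Counter(tagged_tokens)  # keys are (word, tag) pairs
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(word, tag, n) for (word, tag), n in ordered]


def class_ranks(vocab, tag):
    """Overall ranks occupied by the words of one grammatical class."""
    return [i + 1 for i, (_, t, _) in enumerate(vocab) if t == tag]


def null_mean_ranks(vocab, tag, trials=1000, seed=0):
    """Mean rank of the class under the uniform null (assumed statistic).

    Each trial places the class's words at rank positions drawn
    uniformly without replacement from the whole ranked vocabulary.
    """
    rng = random.Random(seed)
    k = len(class_ranks(vocab, tag))
    all_ranks = range(1, len(vocab) + 1)
    return [sum(rng.sample(all_ranks, k)) / k for _ in range(trials)]


# Toy input, invented for illustration only.
tokens = (
    [("the", "other")] * 5
    + [("cat", "noun")] * 3
    + [("runs", "verb")] * 2
    + [("dog", "noun")]
)
vocab = ranked_vocabulary(tokens)
noun_ranks = class_ranks(vocab, "noun")
observed = sum(noun_ranks) / len(noun_ranks)
null = null_mean_ranks(vocab, "noun", trials=200)
# Fraction of null trials with a mean rank at least as large as observed.
p_upper = sum(m >= observed for m in null) / len(null)
```

On a real corpus, an observed statistic that falls far in the tail of the null sample would suggest that the class is not spread uniformly over the ranking. Other statistics, such as the full distribution of class ranks, could be substituted for the mean.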
