Complex network analysis of literary and scientific texts
Iwona Grabska-Gradzinska, Andrzej Kulig, Jaroslaw Kwapien, Stanislaw Drozdz

TL;DR
This study analyzes the statistical and network properties of literary and scientific texts in English and Polish, revealing language-specific differences in Zipf law scaling, network structure, and hierarchy, with implications for understanding language and genre distinctions.
Contribution
It provides a comparative network analysis of literary and scientific texts across two languages, highlighting differences in Zipf scaling, network topology, and hierarchical structure.
Findings
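Polish texts follow the Zipf law with a smaller scaling exponent than English texts.
Scientific texts typically show a shorter power-law range in their rank-frequency plots than literary texts.
Most literary texts in both languages yield scale-free word-adjacency networks, which is not always the case for scientific texts; all the networks are hierarchical, with English texts showing more clustering.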
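The Zipf-law findings rest on fitting a power law f(r) ~ r^(-alpha) to a rank-frequency plot. The following is a minimal sketch of one way to estimate alpha, assuming a simple regex tokenizer and an ordinary least-squares fit in log-log coordinates; the function names `rank_frequency` and `zipf_exponent` and the optional `max_rank` cutoff are illustrative choices, not the authors' procedure.

```python
import math
import re
from collections import Counter


def rank_frequency(text):
    """Return word frequencies sorted in descending order (rank 1 first).

    Assumption: a word is any run of Unicode letters, lowercased, so Polish
    diacritics are kept and punctuation and digits are dropped.
    """
    words = re.findall(r"[^\W\d_]+", text.lower())
    return sorted(Counter(words).values(), reverse=True)


def zipf_exponent(freqs, max_rank=None):
    """Estimate alpha in f(r) ~ r^(-alpha) by least squares on log f vs log r.

    max_rank (hypothetical parameter) restricts the fit to the first ranks,
    which matters when the power-law range is short.
    """
    if max_rank is not None:
        freqs = freqs[:max_rank]
    if len(freqs) < 2:
        raise ValueError("need at least two ranks to fit a slope")
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The fitted slope is negative for a decaying plot; alpha is its negation.
    return -sxy / sxx
```

Comparing the value returned for different `max_rank` cutoffs gives a rough sense of how far the power-law regime extends in a given text. A plain log-log regression is sensitive to the sparse tail of the distribution, so it serves here as an illustration rather than a rigorous estimator.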
Abstract
We present results from our quantitative study of the statistical and network properties of literary and scientific texts written in two languages: English and Polish. We show that Polish texts are described by the Zipf law with a scaling exponent smaller than the one for the English language. We also show that the scientific texts are typically characterized by rank-frequency plots with a relatively short range of power-law behavior compared to the literary texts. We then transform the texts into their word-adjacency network representations and find another difference between the languages. For the majority of the literary texts in both languages, the corresponding networks revealed a scale-free structure, while this was not always the case for the scientific texts. However, all the network representations of texts were hierarchical. We do not observe any qualitative and…
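The word-adjacency representation mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions: words are nodes, an undirected edge joins two words that appear next to each other at least once, and punctuation and sentence boundaries are ignored. The abstract does not specify these details, and the function names are hypothetical. The sketch also computes the local clustering coefficient and its average per degree, since a clustering coefficient that decays with node degree is a common diagnostic for hierarchical organization.

```python
import re
from collections import defaultdict


def adjacency_network(text):
    """Build an undirected word-adjacency network as {word: set of neighbors}.

    Assumption: tokens are runs of Unicode letters, lowercased; immediate
    repetitions of the same word do not create self-loops.
    """
    words = re.findall(r"[^\W\d_]+", text.lower())
    neighbors = defaultdict(set)
    for a, b in zip(words, words[1:]):
        if a != b:
            neighbors[a].add(b)
            neighbors[b].add(a)
    return dict(neighbors)


def local_clustering(neighbors, node):
    """Fraction of pairs of a node's neighbors that are themselves linked."""
    nbrs = neighbors[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    # Each link between two neighbors is seen from both ends, hence the / 2.
    links = sum(1 for u in nbrs for v in neighbors[u] if v in nbrs) / 2
    return 2 * links / (k * (k - 1))


def clustering_by_degree(neighbors):
    """Average local clustering coefficient for each node degree k."""
    buckets = defaultdict(list)
    for node, nbrs in neighbors.items():
        buckets[len(nbrs)].append(local_clustering(neighbors, node))
    return {k: sum(v) / len(v) for k, v in sorted(buckets.items())}
```

The degree of a word is `len(network[word])`, so the degree distribution needed to check for scale-free behavior follows directly from the same dictionary.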
