Topology Properties of Written Human Language
B. R. Gadjiev, T. B. Progulova

TL;DR
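This study analyzes the topology of word networks built from written text. It shows that their degree distributions follow a Tsallis distribution and that, for W. Faulkner's novel "The Sound and the Fury", the ordering of the fitted nonextensivity parameters is the same in the English original and in its Russian translation.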
Contribution
It shows that the degree distributions of word networks follow a Tsallis distribution, and that the ordering of the nonextensivity parameters, though not their values, is preserved when the novel is translated from English into Russian.
Findings
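The degree distributions of the word networks follow a Tsallis distribution.
The fitted nonextensivity parameters lie between 1.47 and 1.57 for the English text and between 1.40 and 1.50 for the Russian translation, so the values themselves change under translation.
Translation preserves the ordering of the nonextensivity parameters.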
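The ordering claim can be checked directly against the values quoted in the abstract. The following is a minimal sketch, not the authors' procedure. It assumes that qJ in the English list and qD in the Russian list label the same part of the text, which the stated ordering implies but the abstract does not say explicitly; the `relabel` mapping and the helper name `ordering` are introduced here for illustration only.

```python
# Nonextensivity parameters as quoted in the abstract.
q_english = {"B": 1.57, "K": 1.49, "J": 1.53, "A": 1.47, "T": 1.54}
q_russian = {"B": 1.50, "K": 1.42, "D": 1.46, "A": 1.40, "T": 1.47}

# Assumption: the English label J corresponds to the Russian label D.
relabel = {"J": "D"}


def ordering(values):
    """Return the labels sorted from largest to smallest parameter value."""
    return sorted(values, key=values.get, reverse=True)


# Express the English ordering in the Russian labels so the two can be compared.
en_order = [relabel.get(label, label) for label in ordering(q_english)]
ru_order = ordering(q_russian)

# Difference between English and Russian values, label by label.
shifts = {
    relabel.get(label, label): round(value - q_russian[relabel.get(label, label)], 2)
    for label, value in q_english.items()
}

print(en_order)
print(ru_order)
print(shifts)
```

Under that labeling assumption, both lists sort to B, T, D, K, A, and every English value exceeds its Russian counterpart by 0.07 at the two-decimal precision quoted.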
Abstract
We use the extended Barabási model without the rewiring process and show that the degree distribution of the corresponding networks is a Tsallis distribution. We analyze the novel "The Sound and the Fury" by W. Faulkner in English and in Russian, and show that the degree distributions of the relevant word networks are described by the Tsallis distribution. We have constructed the degree distribution for each of the relevant word networks and estimated the value of the nonextensivity parameter by the maximum likelihood method. For the novel text in English, qB = 1.57; qK = 1.49; qJ = 1.53; qA = 1.47; qT = 1.54, and for the translation into Russian, qB = 1.50; qK = 1.42; qD = 1.46; qA = 1.40; qT = 1.47. Therefore, if the translation of the novel is regarded as a mapping, the ordering of the nonextensivity parameters qB > qT > qD > qK > qA (with qJ in place of qD for the English text) is an invariant of this mapping.
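The maximum likelihood step can be illustrated with a small sketch. It is not the authors' implementation, and the abstract does not give their parametrization. The sketch assumes a continuous q-exponential (Tsallis) density p(x) = (2 - q) λ [1 + (q - 1) λ x]^(-1/(q - 1)) for x ≥ 0 and 1 < q < 2, treats degrees as continuous values, and replaces a proper optimizer with a coarse grid search. The names `log_likelihood`, `fit_q`, `sample`, and the scale parameter `lam` are hypothetical. The data are synthetic draws with a known q, not word-network degrees.

```python
import math
import random


def log_likelihood(xs, q, lam):
    """Log-likelihood under the assumed density
    p(x) = (2 - q) * lam * [1 + (q - 1) * lam * x] ** (-1 / (q - 1))."""
    a = 1.0 / (q - 1.0)
    c = (q - 1.0) * lam
    return len(xs) * math.log((2.0 - q) * lam) - a * sum(math.log1p(c * x) for x in xs)


def fit_q(xs, q_grid, lam_grid):
    """Grid-search maximum likelihood estimate of (q, lam)."""
    candidates = ((q, lam) for q in q_grid for lam in lam_grid)
    return max(candidates, key=lambda p: log_likelihood(xs, p[0], p[1]))


def sample(n, q, lam, rng):
    """Inverse-CDF sampling; the survival function of the assumed density is
    [1 + (q - 1) * lam * x] ** (-(2 - q) / (q - 1))."""
    out = []
    for _ in range(n):
        u = 1.0 - rng.random()  # uniform on (0, 1]
        out.append((u ** (-(q - 1.0) / (2.0 - q)) - 1.0) / ((q - 1.0) * lam))
    return out


# Synthetic data with known parameters (illustrative only).
xs = sample(1000, 1.5, 1.0, random.Random(0))

q_grid = [1.05 + 0.02 * i for i in range(46)]    # 1.05 ... 1.95
lam_grid = [0.1 * 1.1 ** i for i in range(49)]   # roughly 0.1 ... 9.7

q_hat, lam_hat = fit_q(xs, q_grid, lam_grid)
print(round(q_hat, 2), round(lam_hat, 2))
```

For real degree data, a discrete form of the distribution and a numerical optimizer in place of the grid would likely be more appropriate; the grid is used here only to keep the sketch free of external dependencies.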
Taxonomy
Topics: Fractal and DNA sequence analysis · Image Retrieval and Classification Techniques · Topological and Geometric Data Analysis
