Equilibrium (Zipf) and Dynamic (Grassberger-Procaccia) method based analyses of human texts. A comparison of natural (English) and artificial (Esperanto) languages
M. Ausloos

TL;DR
This study compares natural English and artificial Esperanto texts using Zipf and Grassberger-Procaccia methods, revealing differences in stylistic features and author creativity through power-law exponents and phase space analysis.
Contribution
It introduces a novel combined approach using Zipf and Grassberger-Procaccia analyses to quantify stylistic and creative differences in human texts and their translations.
Findings
Zipf exponents vary with sentence definition, reflecting author style.
The attractor dimension grows as a power law of the phase space dimension, with an exponent of 0.79.
Qualitative content differences are minimal between original and translated texts.
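To make the first finding concrete, here is a minimal sketch of how a Zipf exponent can be estimated from a text, assuming the plain rank-frequency form f(r) ~ r^(-zeta) and an ordinary least-squares fit on log-log axes. The function name `zipf_exponent` and the tokenization rule are illustrative assumptions; the paper's own procedure, whose exponent depends on how a sentence is defined, is not reproduced here.

```python
import math
import re
from collections import Counter


def zipf_exponent(text):
    """Estimate zeta in f(r) ~ r**(-zeta) from word rank-frequency data.

    Assumption: words are runs of letters/apostrophes, case-insensitive.
    The fit is a simple least-squares line through (log rank, log frequency).
    """
    words = re.findall(r"[a-z']+", text.lower())
    freqs = sorted(Counter(words).values(), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two distinct words to fit a slope")

    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    # The fitted slope is negative for a decaying law; zeta is its magnitude.
    return -sxy / sxx
```

On a real text the fit is usually restricted to an intermediate range of ranks, since the highest and lowest ranks tend to deviate from a pure power law; that choice is left out of this sketch.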
Abstract
Two English texts by Lewis Carroll, one (Alice in Wonderland) also translated into Esperanto, the other (Through the Looking-Glass), are discussed in order to observe whether natural and artificial languages differ significantly from each other. One-dimensional, time-series-like signals are constructed using only word frequencies (FTS) or word lengths (LTS). The data are studied through (i) a Zipf method for sorting out correlations in the FTS and (ii) a Grassberger-Procaccia (GP) technique-based method for finding correlations in the LTS. Features are compared: different power laws are observed, with characteristic exponents for the ranking properties and for the phase space attractor dimensionality. The Zipf exponent can take values much less than unity (0.50 or 0.30) depending on how a sentence is defined. This non-universality is conjectured to be a measure of the…
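The word-length series (LTS) and the Grassberger-Procaccia step can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the helper names (`word_length_series`, `embed`, `correlation_sum`, `correlation_dimension`), the unit delay, the maximum norm, and the two-radius slope estimate are all choices made here for brevity.

```python
import math
import re


def word_length_series(text):
    """LTS: the sequence of word lengths in reading order (assumed tokenization)."""
    return [len(w) for w in re.findall(r"[A-Za-z']+", text)]


def embed(series, n, delay=1):
    """Delay-embed a scalar series into n-dimensional phase space vectors."""
    count = len(series) - (n - 1) * delay
    return [tuple(series[i + k * delay] for k in range(n)) for i in range(count)]


def correlation_sum(points, r):
    """Fraction of distinct point pairs closer than r (maximum norm)."""
    m = len(points)
    if m < 2:
        raise ValueError("need at least two embedded points")
    close = 0
    for i in range(m):
        for j in range(i + 1, m):
            dist = max(abs(a - b) for a, b in zip(points[i], points[j]))
            if dist < r:
                close += 1
    return 2.0 * close / (m * (m - 1))


def correlation_dimension(points, r1, r2):
    """Crude slope of log C(r) versus log r between two radii r1 < r2."""
    c1 = correlation_sum(points, r1)
    c2 = correlation_sum(points, r2)
    if c1 <= 0 or c2 <= 0:
        raise ValueError("correlation sum is zero; choose larger radii")
    return (math.log(c2) - math.log(c1)) / (math.log(r2) - math.log(r1))
```

Repeating the dimension estimate for increasing embedding dimension n is the usual way to see how the attractor dimension depends on the phase space dimension. The pairwise loop is quadratic in the number of points, so a full-length book would call for subsampling or a vectorized implementation.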
