Generalized Hurst exponent and multifractal function of original and translated texts mapped into frequency and length time series
Marcel Ausloos

TL;DR
This study applies nonlinear dynamics and multifractal analysis to written texts and their translations, revealing differences in complexity and multiscale features between original and translated texts using frequency and length time series.
Contribution
It introduces a novel approach using generalized Hurst exponents and multifractal functions to analyze and compare original and translated texts mapped into frequency and length time series.
Findings
Original texts show non-parabolic multifractal spectra.
The Esperanto translation exhibits more extreme multifractal values.
Shuffled texts serve as a baseline for complexity comparison.
Abstract
A nonlinear dynamics approach can be used to quantify complexity in written texts. As a first step, a one-dimensional system is examined: two written texts by one author (Lewis Carroll), together with one translation into an artificial language, i.e. Esperanto, are mapped into time series. Their corresponding shuffled versions are used to obtain a "baseline". Two different one-dimensional time series are used here: (i) one based on word lengths (LTS), (ii) the other on word frequencies (FTS). It is shown that the generalized Hurst exponent and the derived curves of the original and translated texts show marked differences. The original "texts" are far from giving a parabolic function, in contrast to the shuffled texts. Moreover, the Esperanto text has more extreme values. This suggests cascade model-like, with multiscale…
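The mapping described in the abstract can be illustrated with a minimal Python sketch. It is not the author's code, and several choices are assumptions made only for illustration: the tokenizer (runs of letters, lower-cased), the FTS definition (each word replaced by its number of occurrences in the whole text), and the estimator `generalized_hurst`, which uses the common structure-function definition K_q(τ) = ⟨|x(t+τ) − x(t)|^q⟩ / ⟨|x(t)|^q⟩ ∝ τ^{qH(q)}. All function names are hypothetical.

```python
import math
import random
import re
from collections import Counter


def tokenize(text):
    # Assumption: a "word" is a run of Unicode letters, optionally joined by
    # apostrophes, lower-cased. The paper may count words differently.
    return re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)*", text.lower())


def length_series(words):
    # LTS: each word is replaced by its number of characters.
    return [len(w) for w in words]


def frequency_series(words):
    # FTS (assumed definition): each word is replaced by how many times
    # it occurs in the whole text.
    counts = Counter(words)
    return [counts[w] for w in words]


def shuffled(series, seed=0):
    # Baseline: same values, random order, so any temporal structure is lost.
    rng = random.Random(seed)
    out = list(series)
    rng.shuffle(out)
    return out


def generalized_hurst(series, q, taus=tuple(range(1, 9))):
    # Estimate H(q) as (slope of log K_q(tau) versus log tau) / q.
    # Requires q != 0 and a series whose increments are not all zero.
    norm = sum(abs(x) ** q for x in series) / len(series)
    log_tau, log_k = [], []
    for tau in taus:
        diffs = [abs(series[t + tau] - series[t]) ** q
                 for t in range(len(series) - tau)]
        k = (sum(diffs) / len(diffs)) / norm
        log_tau.append(math.log(tau))
        log_k.append(math.log(k))
    # Ordinary least-squares slope of log_k against log_tau.
    n = len(log_tau)
    mean_x = sum(log_tau) / n
    mean_y = sum(log_k) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_tau, log_k))
    sxx = sum((x - mean_x) ** 2 for x in log_tau)
    return (sxy / sxx) / q


words = tokenize("a rabbit saw a white rabbit")
lts = length_series(words)     # [1, 6, 3, 1, 5, 6]
fts = frequency_series(words)  # [2, 2, 1, 2, 1, 2]
```

On a real text one would evaluate `generalized_hurst` over a range of q values for the LTS, the FTS, and their shuffled versions, then compare the resulting curves. The range of τ and any detrending are further choices this sketch does not try to settle.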
