Fractal Power Law in Literary English
L.L. Goncalves, L.B. Goncalves

TL;DR
This study analyzes literary texts by English writers from the first half of the 20th century and finds that lexical wealth (the ratio of distinct words to total words) follows a fractal power law whose parameters distinguish authors, genres, and text types.
Contribution
It introduces a corpus-based method that uses the exponent and amplitude of a fractal power law, together with the standard deviation of lexical wealth, as a signature for identifying authors and genres.
Findings
Lexical wealth follows a fractal power law in literary texts.
Authors and genres have distinct power-law signatures.
For a given author, the signature discriminates between short stories and novels.
Abstract
We present a numerical investigation, based on corpus analysis, of literary texts by well-known English writers from the first half of the twentieth century. A fractal power law is obtained for the lexical wealth, defined as the ratio between the number of different words and the total number of words in a given text. Taking the exponent and amplitude of the power law, together with the standard deviation of the lexical wealth, as the signature of each author, it is possible to discriminate between works of different genres and writers. Each writer has a very distinct signature, whether compared with other literary writers or with writers of non-literary texts. For a given author, the signature also discriminates between short stories and novels.
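The signature described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure. The abstract does not say which variable the power law is taken against, so the sketch assumes lexical wealth W is measured on growing prefixes of length N and fitted as W(N) ≈ A · N^b by least squares in log-log space. The tokenizer, the prefix sampling, and the names `tokenize`, `lexical_wealth`, and `signature` are all hypothetical.

```python
import math
import re
import statistics


def tokenize(text):
    # Assumption: lowercase alphabetic tokens; the paper's tokenization is not stated.
    return re.findall(r"[a-z']+", text.lower())


def lexical_wealth(words):
    # The abstract's definition: number of different words / total number of words.
    return len(set(words)) / len(words)


def signature(words, n_points=8):
    """Return (exponent, amplitude, stdev) for an assumed fit W(N) ~ A * N**b.

    W is evaluated on prefixes of the word list whose lengths are spaced
    geometrically, then fitted by ordinary least squares on (log N, log W).
    """
    total = len(words)
    # Geometrically spaced prefix lengths, deduplicated, ending at the full text.
    sizes = sorted({max(2, round(total ** (k / n_points)))
                    for k in range(1, n_points + 1)})
    wealth = [lexical_wealth(words[:n]) for n in sizes]

    xs = [math.log(n) for n in sizes]
    ys = [math.log(w) for w in wealth]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    # Slope of the log-log line is the exponent; exp(intercept) is the amplitude.
    exponent = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
                / sum((x - mx) ** 2 for x in xs))
    amplitude = math.exp(my - exponent * mx)
    return exponent, amplitude, statistics.stdev(wealth)
```

Under these assumptions, calling `signature(tokenize(text))` on each work gives a triple that could be compared across authors or genres. Whether such triples separate writers as the abstract reports depends on the authors' actual fitting procedure, which this sketch does not reproduce.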
