Graphical law beneath each written natural language
Anindya Kumar Biswas

TL;DR
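This paper analyzes twenty-four written natural languages by plotting, on a log scale, the normalized number of words starting with each letter against the normalized rank of that letter. The resulting graphs resemble curves of reduced magnetization versus reduced temperature for magnetic materials, which leads the author to a weak conjecture that a magnetization curve underlies each written language.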
Contribution
It compares the rank distribution of word-initial letters in each language with the magnetization curves of magnetic materials, and proposes a weak conjecture that a curve of magnetization underlies a written natural language.
Findings
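All twenty-four languages yield graphs of a similar type when the normalized number of words starting with a letter is plotted against the normalized rank of the letter on a log scale.
The graphs resemble curves of reduced magnetization vs reduced temperature for magnetic materials.
The author makes a weak conjecture that a curve of magnetization underlies a written natural language.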
Abstract
We study twenty-four written natural languages. We plot, on a log scale, the number of words starting with a letter against the rank of the letter, both normalised. We find that all the graphs are of a similar type. The graphs are tantalisingly close to the curves of reduced magnetisation vs reduced temperature for magnetic materials. We make a weak conjecture that a curve of magnetisation underlies a written natural language.
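The plotting procedure described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not say how the two axes are normalised, so the sketch divides each count by the largest count and each rank by the number of distinct initial letters before taking logarithms. The function name `initial_letter_curve` and the sample word list are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def initial_letter_curve(words):
    """Return (log normalised rank, log normalised count) pairs.

    Assumption: counts are normalised by the largest count and ranks by
    the number of distinct initial letters. The paper may normalise
    differently.
    """
    # Count how many words start with each letter (case-insensitive).
    counts = Counter(w[0].lower() for w in words if w and w[0].isalpha())
    # Rank 1 is the letter that starts the most words.
    ranked = sorted(counts.values(), reverse=True)
    if not ranked:
        return []
    f_max = ranked[0]
    k_max = len(ranked)
    return [
        (math.log(k / k_max), math.log(f / f_max))
        for k, f in enumerate(ranked, start=1)
    ]


# Hypothetical toy word list, for illustration only.
sample = ["apple", "avocado", "apricot", "banana", "blueberry", "cherry"]
curve = initial_letter_curve(sample)
```

Each pair in `curve` is one point of the graph for that word list. With this normalisation the top-ranked letter sits at a log count of zero and the lowest-ranked letter at a log rank of zero.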
Taxonomy
Topics: Natural Language Processing Techniques · Constraint Satisfaction and Optimization · Language and cultural evolution
