Graphical law beneath each written natural language

Anindya Kumar Biswas

arXiv:1307.6235·physics.gen-ph·January 22, 2020

TL;DR

This paper studies twenty-four written natural languages by plotting, on a log scale, the normalized number of words starting with each letter against the normalized rank of that letter. The resulting graphs are all of a similar type and resemble curves of reduced magnetization versus reduced temperature for magnetic materials.

Contribution

It compares the initial-letter word-count curves of written languages with magnetization curves of magnetic materials, and proposes a weak conjecture that a curve of magnetization underlies a written natural language.

Findings

01

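All twenty-four languages studied give graphs of a similar type when the normalized number of words starting with a letter is plotted against the normalized rank of the letter.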

02

The graphs resemble curves of reduced magnetization versus reduced temperature for magnetic materials.

03

The author makes a weak conjecture that a curve of magnetization underlies a written natural language.

Abstract

We study twenty-four written natural languages. We plot, on a log scale, the number of words starting with a letter against the rank of the letter, both normalised. We find that all the graphs are of a similar type. The graphs are tantalisingly close to the curves of reduced magnetisation vs reduced temperature for magnetic materials. We make a weak conjecture that a curve of magnetisation underlies a written natural language.
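
The plotted quantity in the abstract can be illustrated with a minimal sketch. It counts the words that start with each letter, ranks the letters by that count, and normalises both axes. The abstract does not spell out the normalisation, so dividing the count by the largest count and the rank by the largest rank is an assumption made here, and the function name `initial_letter_curve` is hypothetical. The paper's word lists are not reproduced; the input is any list of words.

```python
from collections import Counter
import math


def initial_letter_curve(words):
    """Return (letter, normalised rank, normalised count, ln of normalised count).

    Assumption: counts are divided by the largest count and ranks by the
    largest rank. The source only says both axes are "normalised".
    """
    # Count how many words start with each letter (case-insensitive).
    counts = Counter(w[0].lower() for w in words if w and w[0].isalpha())
    if not counts:
        raise ValueError("no alphabetic words supplied")

    # Rank letters by descending count; ties are broken alphabetically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    f_max = ranked[0][1]   # count for the rank-1 letter
    k_max = len(ranked)    # largest rank

    curve = []
    for rank, (letter, f) in enumerate(ranked, start=1):
        ratio = f / f_max
        curve.append((letter, rank / k_max, ratio, math.log(ratio)))
    return curve


if __name__ == "__main__":
    sample = ["apple", "ant", "axe", "bat", "bee", "cat"]
    for letter, k, f, ln_f in initial_letter_curve(sample):
        print(f"{letter}  rank={k:.3f}  count={f:.3f}  ln(count)={ln_f:.3f}")
```

Applied to a dictionary word list for one language, the `(normalised rank, normalised count)` pairs give one curve of the kind the abstract compares across languages. Comparing it with a magnetisation curve would need a separate model, which this sketch does not attempt.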

Taxonomy

Topics: Natural Language Processing Techniques · Constraint Satisfaction and Optimization · Language and cultural evolution