On the class of coding optimality of human languages and the origins of Zipf's law

Ramon Ferrer-i-Cancho

arXiv:2505.20015·cs.CL·October 31, 2025

Open Access

TL;DR

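This paper introduces a new class of coding optimality whose members are displaced linearly from optimal coding and therefore exhibit Zipf's law. Human languages that agree sufficiently with Zipf's law are potential members; animal communication systems with exponential distributions are not, although dolphins and humpback whales might be.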

Contribution

It defines a novel class of optimality for coding systems, within which Zipf's law, the size-rank law, and the size-probability law form a group-like structure, and it examines which human and animal communication systems can belong to that class.

Findings

01 Human languages fit the new optimality class.

02 Some animal communication systems exhibit exponential distributions.

03 Straight lines in log-log plots indicate near-optimal coding.

Abstract

Here we present a new class of optimality for coding systems. Members of that class are displaced linearly from optimal coding and thus exhibit Zipf's law, namely a power-law distribution of frequency ranks. Within that class, Zipf's law, the size-rank law and the size-probability law form a group-like structure. We identify human languages that are members of the class. All languages showing sufficient agreement with Zipf's law are potential members of the class. In contrast, there are communication systems in other species that cannot be members of that class because they exhibit an exponential distribution instead, although dolphins and humpback whales might be. We provide a new insight into plots of frequency versus rank in double logarithmic scale. For any system, a straight line in that scale indicates that the lengths of optimal codes under non-singular coding and under uniquely decodable…
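To make the "straight line in double logarithmic scale" concrete, the sketch below estimates the slope of log frequency against log rank by ordinary least squares. It is only an illustration under stated assumptions: the function name `zipf_exponent` and the toy data are hypothetical, and least squares on a log-log plot is a simple stand-in, not the estimation procedure used in the paper.

```python
import math


def zipf_exponent(frequencies):
    """Estimate a Zipf exponent as minus the least-squares slope of
    log(frequency) versus log(rank). Hypothetical helper for illustration."""
    # Rank types by decreasing frequency; zero counts cannot be log-transformed.
    freqs = sorted((f for f in frequencies if f > 0), reverse=True)
    if len(freqs) < 2:
        raise ValueError("need at least two positive frequencies")

    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    # A power law f(r) = c * r**(-alpha) is a line of slope -alpha in log-log scale.
    return -(sxy / sxx)


# Toy data (an assumption, not from the paper): frequencies proportional to 1/rank.
toy_frequencies = [1000.0 / rank for rank in range(1, 51)]
print(round(zipf_exponent(toy_frequencies), 3))
```

On real corpora a least-squares fit of this kind is known to be a rough estimator, so the result should be read as a quick diagnostic of how straight the rank-frequency plot is rather than as a definitive exponent.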

Taxonomy

Topics: semigroups and automata theory