The origins of Zipf's meaning-frequency law

Ramon Ferrer-i-Cancho; Michael S. Vitevitch

arXiv:1801.00168·cs.CL·September 24, 2020

TL;DR

This paper derives Zipf's meaning-frequency law from a single assumption on the joint probability of a word and a meaning, and justifies that assumption as the outcome of a biased random walk in the process of mental exploration.

Contribution

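It replaces the two power-law assumptions of Zipf's original derivation (the law for word frequencies and the law of meaning distribution) with a single assumption on the joint probability of a word and a meaning, from which the meaning-frequency law or relaxed versions of it follow.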

Findings

01

The law, or relaxed versions of it, can be derived from a single assumption on the joint probability of a word and a meaning.

02

The joint probability assumption can be justified as the outcome of a biased random walk in the process of mental exploration.

03

A single assumption suffices where Zipf's original derivation needed two power laws: frequency versus rank, and number of meanings versus rank.

Abstract

In his pioneering research, G. K. Zipf observed that more frequent words tend to have more meanings, and showed that the number of meanings of a word grows as the square root of its frequency. He derived this relationship from two assumptions: that words follow Zipf's law for word frequencies (a power law dependency between frequency and rank) and Zipf's law of meaning distribution (a power law dependency between number of meanings and rank). Here we show that a single assumption on the joint probability of a word and a meaning suffices to infer Zipf's meaning-frequency law or relaxed versions. Interestingly, this assumption can be justified as the outcome of a biased random walk in the process of mental exploration.
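
Zipf's original two-assumption argument, as summarized in the abstract, can be sketched in a few lines. The symbols below are illustrative assumptions rather than notation taken from this page: $i$ is the frequency rank of a word, $f$ its frequency, $\mu$ its number of meanings, and $\alpha$, $\gamma$, $\delta$ are positive exponents.

```latex
\begin{align*}
  f   &\propto i^{-\alpha}
      && \text{(Zipf's law for word frequencies)} \\
  \mu &\propto i^{-\gamma}
      && \text{(Zipf's law of meaning distribution)} \\
  i   &\propto f^{-1/\alpha}
      && \text{(first law solved for the rank)} \\
  \mu &\propto \left(f^{-1/\alpha}\right)^{-\gamma} = f^{\gamma/\alpha} = f^{\delta},
      \qquad \delta = \frac{\gamma}{\alpha}
      && \text{(meaning-frequency law)}
\end{align*}
```

The square-root growth mentioned in the abstract corresponds to $\delta = 1/2$, which is obtained for example with $\alpha = 1$ and $\gamma = 1/2$. This sketch covers only the classical two-law route; the paper's single-assumption derivation from the joint probability of a word and a meaning is not reproduced here.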
