There Are Fewer Facts Than Words: Communication With A Growing Complexity
Łukasz Dębowski

TL;DR
This paper proves an impossibility theorem for general communication systems: the number of distinct words used in a finite text is roughly greater than the number of independent elementary persistent facts the text describes. The result links linguistic complexity to power-law phenomena.
Contribution
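It introduces a theorem connecting the number of distinct words in a finite text to the number of independent facts the text describes, and relates this result to Zipf's law, power-law scaling of mutual information, and power-law-tailed learning curves.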
Findings
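The number of distinct words in a finite text is roughly greater than the number of independent elementary persistent facts it describes
The theorem can be related to Zipf's law, power-law scaling of mutual information, and power-law-tailed learning curves
The theorem assumes a finite alphabet, a linear sequence of symbols, non-decreasing complexity, an estimable entropy rate, and a finite inverse complexity rate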
Abstract
We present an impossibility result, called a theorem about facts and words, which pertains to a general communication system. The theorem states that the number of distinct words used in a finite text is roughly greater than the number of independent elementary persistent facts described in the same text. In particular, this theorem can be related to Zipf's law, power-law scaling of mutual information, and power-law-tailed learning curves. The assumptions of the theorem are: a finite alphabet, linear sequence of symbols, complexity that does not decrease in time, entropy rate that can be estimated, and finiteness of the inverse complexity rate.
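To make the quantities mentioned in the abstract more concrete, the following is a minimal sketch of how one might count distinct words in growing prefixes of a text and list word frequencies by rank, the kind of data usually inspected when discussing Zipf's law. It assumes that "words" are whitespace-separated tokens, which is a simplification for illustration and not the paper's formal definition. The function names `vocabulary_growth` and `rank_frequency` and the toy sentence are hypothetical.

```python
from collections import Counter


def vocabulary_growth(tokens):
    """Return a list whose i-th entry is the number of distinct
    tokens among the first i + 1 tokens of the sequence."""
    seen = set()
    growth = []
    for tok in tokens:
        seen.add(tok)
        growth.append(len(seen))
    return growth


def rank_frequency(tokens):
    """Return token counts sorted in decreasing order (rank 1 first)."""
    return sorted(Counter(tokens).values(), reverse=True)


# Toy input (assumption): whitespace tokenization of a short sentence.
text = "the cat saw the dog and the dog saw the cat"
tokens = text.split()

print(vocabulary_growth(tokens))  # distinct-word count per prefix length
print(rank_frequency(tokens))     # frequencies ordered by rank
```

On a real corpus, plotting these two sequences on log-log axes is a common informal check for power-law behavior. This sketch does not estimate the number of facts, which the theorem treats separately.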
Taxonomy
Topics: Fractal and DNA sequence analysis · Computability, Logic, AI Algorithms · Evolutionary Algorithms and Applications
