Source codes in human communication
Michael Ramscar

TL;DR
This paper explores the fundamental differences between human natural language communication and digital information systems, emphasizing how shared codes and distributional properties shape linguistic communication.
Contribution
It offers a new perspective on natural language as a probabilistic system, describing how the distributional properties of languages meet the challenges raised by the differences between natural languages and digital communication codes.
Findings
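The distributions of words in natural languages pose challenges for accounts that treat languages as probabilistic systems in the same sense as digital codes.
Digital communication systems rely on predefined codes shared by every sender-receiver, whereas no speaker-hearer ever has access to an entire linguistic code.
These differences suggest that human communication should not be viewed simply by analogy with digital systems.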
Abstract
Although information theoretic characterizations of human communication have become increasingly popular in linguistics, to date they have largely involved grafting probabilistic constructs onto older ideas about grammar. Similarities between human and digital communication have been strongly emphasized, and differences largely ignored. However, some of these differences matter: communication systems are based on predefined codes shared by every sender-receiver, whereas the distributions of words in natural languages guarantee that no speaker-hearer ever has access to an entire linguistic code, which seemingly undermines the idea that natural languages are probabilistic systems in any meaningful sense. This paper describes how the distributional properties of languages meet the various challenges arising from the differences between information systems and natural languages, along with…
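The claim that word distributions keep any one speaker-hearer from encountering an entire code can be illustrated with a small simulation. The sketch below is not taken from the paper. It assumes, purely for illustration, that word frequencies follow a Zipf-like rank-frequency distribution (an assumption the abstract does not state), and the function names `zipf_weights` and `observed_types` are hypothetical. Each simulated speaker samples a finite number of word tokens, and the code reports how much of the vocabulary that speaker has actually seen.

```python
import random


def zipf_weights(vocab_size, exponent=1.0):
    """Unnormalized Zipf-like weights: the word of rank r gets 1 / r**exponent.

    The Zipf-like shape is an illustrative assumption, not a claim from the paper.
    """
    return [1.0 / (rank ** exponent) for rank in range(1, vocab_size + 1)]


def observed_types(vocab_size, n_tokens, seed, exponent=1.0):
    """Sample n_tokens word tokens for one speaker; return the distinct word ids seen.

    Word ids run from 0 (most frequent) to vocab_size - 1 (least frequent).
    """
    rng = random.Random(seed)
    weights = zipf_weights(vocab_size, exponent)
    tokens = rng.choices(range(vocab_size), weights=weights, k=n_tokens)
    return set(tokens)


if __name__ == "__main__":
    vocab_size, n_tokens = 5_000, 10_000
    speaker_a = observed_types(vocab_size, n_tokens, seed=1)
    speaker_b = observed_types(vocab_size, n_tokens, seed=2)

    print("coverage A:", len(speaker_a) / vocab_size)
    print("coverage B:", len(speaker_b) / vocab_size)
    print("shared types:", len(speaker_a & speaker_b))
    print("types seen by only one speaker:", len(speaker_a ^ speaker_b))
```

Under these assumptions each speaker covers only part of the vocabulary, and the two speakers overlap on frequent words while differing on rare ones. This contrasts with a digital channel, where sender and receiver hold the same complete codebook by construction.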
Taxonomy
Topics: Speech and dialogue systems
