CoHSI I; Detailed properties of the Canonical Distribution for Discrete Systems such as the Proteome
Les Hatton, Gregory Warr

TL;DR
This paper investigates the CoHSI distribution's properties in discrete systems like proteomes, revealing global constraints on component lengths that are not explainable by local factors such as natural selection or human choice.
Contribution
It analyzes the solution of the CoHSI distribution and its implications, establishing that certain properties of component lengths are universal and inevitable in discrete systems.
Findings
Long components occur frequently across systems.
Average component length is highly conserved.
These properties are global and cannot be explained by local arguments such as natural selection or human volition.
Abstract
The CoHSI (Conservation of Hartley-Shannon Information) distribution is at the heart of a wide class of discrete systems, defining the length distribution of their components amongst other global properties. Discrete systems such as the known proteome, where components are proteins; computer software, where components are functions; and texts, where components are books, are all known to fit this distribution accurately. In this short paper, we explore its solution and its resulting properties and lay the foundation for a series of papers which will demonstrate, amongst other things, why the average length of components is so highly conserved and why long components occur so frequently in these systems. These properties are not amenable to local arguments such as natural selection in the case of the proteome or human volition in the case of computer software, and indeed turn out to be…
Taxonomy
Topics: Statistical Mechanics and Entropy · Complex Systems and Time Series Analysis · Complex Network Analysis Techniques
