Universal distribution of component frequencies in biological and technological systems
Tin Yau Pang, Sergei Maslov

TL;DR
This study reveals that both biological genomes and technological software systems exhibit a universal scale-free power law distribution of component frequencies, linked to their dependency network structure.
Contribution
It demonstrates that component frequency distributions in biological and technological systems follow a universal pattern explained by dependency networks.
Findings
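Both genomes and software systems follow the same scale-free power law distribution, with an additional peak near the tail that corresponds to nearly universal components.
Component frequency is positively correlated with dependency degree.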
A simple model reproduces the observed distributions.
Abstract
Bacterial genomes and large-scale computer software projects both consist of a large number of components (genes or software packages) connected via a network of mutual dependencies. Components can be easily added to or removed from individual systems, and their usage frequencies vary over many orders of magnitude. We study this frequency distribution in genomes of ~500 bacterial species and in over 2 million Linux computers and find that in both cases it is described by the same scale-free power law distribution with an additional peak near the tail of the distribution corresponding to nearly universal components. We argue that this is a general property of any modular system with a multi-layered dependency network. We demonstrate that the frequency of a component is positively correlated with its dependency degree given by the total number of upstream components whose operation…
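The two quantities named in the abstract, component frequency and dependency degree, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy `systems` and `requires` data are hypothetical, and because the abstract is truncated, the sketch assumes that dependency degree counts the components that directly or indirectly require a given component. It is not the authors' code or data.

```python
from collections import Counter

# Hypothetical toy data: each system (a genome or a Linux install)
# is represented as the set of components it contains.
systems = [
    {"libc", "bash", "python", "numpy"},
    {"libc", "bash"},
    {"libc", "python"},
    {"libc", "bash", "python"},
]

# Hypothetical direct dependencies: component -> components it requires.
requires = {
    "libc": set(),
    "bash": {"libc"},
    "python": {"libc"},
    "numpy": {"python"},
}


def component_frequencies(systems):
    """Fraction of systems in which each component appears."""
    counts = Counter(c for system in systems for c in system)
    return {c: n / len(systems) for c, n in counts.items()}


def dependency_degrees(requires):
    """Assumed definition: for each component, the number of other
    components that require it directly or indirectly."""

    def all_requirements(component, seen):
        # Depth-first walk over everything `component` needs.
        for dep in requires.get(component, ()):
            if dep not in seen:
                seen.add(dep)
                all_requirements(dep, seen)
        return seen

    degree = {c: 0 for c in requires}
    for component in requires:
        for dep in all_requirements(component, set()):
            degree[dep] = degree.get(dep, 0) + 1
    return degree


freq = component_frequencies(systems)
degree = dependency_degrees(requires)

for name in sorted(freq, key=freq.get, reverse=True):
    print(f"{name:8s} frequency={freq[name]:.2f} dependency_degree={degree[name]}")
```

In this toy example the component that the most others rely on (`libc`) is also present in every system, which matches the direction of the correlation described in the abstract. A four-system example cannot, of course, show a power law.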
