Universal and idiosyncratic characteristic lengths in bacterial genomes
Ivan Junier, Paul Frémont, Olivier Rivoire

TL;DR
This study identifies a characteristic length of about 10-20 kb that is shared across bacterial genomes and relates it to structures that coordinate gene expression, by analyzing correlations and conservation across a large number of genomes.
Contribution
It introduces methods to detect universal genomic lengths despite diverse and idiosyncratic features in bacterial genomes.
Findings
A characteristic length of 10-20 kb is common across bacteria.
This length relates to gene expression coordination structures.
Methods are developed to identify universal features amid genomic diversity.
Abstract
In condensed matter physics, simplified descriptions are obtained by coarse-graining the features of a system at a certain characteristic length, defined as the typical length beyond which some properties are no longer correlated. From a physics standpoint, in vitro DNA thus has a characteristic length of 300 base pairs (bp), the Kuhn length of the molecule beyond which correlations in its orientations are typically lost. From a biology standpoint, in vivo DNA has a characteristic length of 1000 bp, the typical length of genes. Since bacteria live in very different physico-chemical conditions and since their genomes lack translational invariance, whether larger, universal characteristic lengths exist is a non-trivial question. Here, we examine this problem by leveraging the large number of fully sequenced genomes available in public databases. By analyzing GC content correlations and…
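To make the notion of a characteristic length concrete, the following is a minimal sketch, not the authors' pipeline, of how a correlation length could be read off a GC content profile. The window size, the 1/e threshold, and the synthetic genome built by the hypothetical helper `toy_genome` are all assumptions chosen for illustration; the abstract only states that GC content correlations are analyzed.

```python
import math
import random


def gc_profile(seq, window):
    """GC fraction in consecutive non-overlapping windows of `window` bp."""
    seq = seq.upper()
    return [
        sum(base in "GC" for base in seq[i:i + window]) / window
        for i in range(0, len(seq) - window + 1, window)
    ]


def autocorrelation(values, max_lag):
    """Normalized autocorrelation C(lag) for lag = 0..max_lag (in windows)."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    var = sum(c * c for c in centered) / n
    if var == 0:
        raise ValueError("profile is constant; correlation is undefined")
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag))
        / ((n - lag) * var)
        for lag in range(max_lag + 1)
    ]


def characteristic_length(corr, window, threshold=1 / math.e):
    """Smallest lag (in bp) where the correlation falls below `threshold`.

    The 1/e cutoff is an illustrative convention, not taken from the paper.
    Returns None if the correlation never drops below the threshold.
    """
    for lag, c in enumerate(corr):
        if c < threshold:
            return lag * window
    return None


def toy_genome(n_domains=20, domain_bp=2000, seed=0):
    """Hypothetical test sequence: alternating GC-rich and AT-rich domains."""
    rng = random.Random(seed)
    parts = []
    for d in range(n_domains):
        p_gc = 0.7 if d % 2 == 0 else 0.3
        parts.append("".join(
            rng.choice("GC") if rng.random() < p_gc else rng.choice("AT")
            for _ in range(domain_bp)
        ))
    return "".join(parts)


window = 200  # bp per window (assumed value)
profile = gc_profile(toy_genome(), window)
corr = autocorrelation(profile, max_lag=20)
length = characteristic_length(corr, window)
print("estimated characteristic length (bp):", length)
```

On the toy sequence the estimate is bounded above by the 2000 bp domain size built into it. Real genomes lack translational invariance, as the abstract notes, so a single genome-wide autocorrelation of this kind is only a first approximation.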
