A Gibbs approach to Chargaff's second parity rule
Andrew Hart, Servet Martínez, Felipe Olmos

TL;DR
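This paper proposes a probabilistic explanation for Chargaff's second parity rule (CSPR): the rule follows from the reverse complementarity of paired DNA strands, because the associated Gibbs distribution satisfies it. The paper also develops a statistical test of CSPR under this Gibbsian assumption and applies it to bacterial genomes.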
Contribution
It introduces a Gibbsian framework to explain CSPR and provides a statistical test for its validation in genomic data, offering a theoretical basis for the rule.
Findings
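The Gibbs distribution associated with the chemical energy between the bonds satisfies CSPR.
A statistical test of CSPR under the Gibbsian assumption is applied to a large set of bacterial genomes from GenBank.
CSPR is explained as a probabilistic consequence of the reverse complementarity between paired strands.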
Abstract
Chargaff's second parity rule (CSPR) asserts that the frequencies of short polynucleotide chains are the same as those of the complementary reversed chains. Up to now, this hypothesis has only been observed empirically and there is currently no explanation for its presence in DNA strands. Here we argue that CSPR is a probabilistic consequence of the reverse complementarity between paired strands, because the Gibbs distribution associated with the chemical energy between the bonds satisfies CSPR. We develop a statistical test to study the validity of CSPR under the Gibbsian assumption and we apply it to a large set of bacterial genomes taken from the GenBank repository.
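To make the statement of CSPR concrete, the following minimal Python sketch counts the k-mers on a single strand and compares each count with that of its reverse complement. This is a descriptive check only, not the statistical test developed in the paper; the function names and the deviation measure are assumptions introduced here for illustration.

```python
from collections import Counter

# Watson-Crick complement table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    """Return the complementary reversed chain of a DNA word."""
    return word.translate(COMPLEMENT)[::-1]


def kmer_counts(seq, k):
    """Count all overlapping words of length k along one strand."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cspr_deviation(seq, k):
    """Largest gap between the relative frequency of a k-mer and that of
    its reverse complement (hypothetical measure, 0.0 means exact parity)."""
    counts = kmer_counts(seq, k)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Include reverse complements that never occur, so they count as zero.
    words = set(counts) | {reverse_complement(w) for w in counts}
    return max(abs(counts[w] - counts[reverse_complement(w)]) for w in words) / total
```

On a real genome one would call, for example, `cspr_deviation(genome, 3)` on the sequence of a single strand; under CSPR the value should be close to zero for short words, although deciding how close is close enough requires a proper statistical test such as the one the paper develops.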
