Comparing reverse complementary genomic words based on their distance distributions and frequencies
Ana Helena Tavares, Jakob Raymaekers, Peter Rousseeuw, Raquel M. Silva, Carlos A.C. Bastos, Armando Pinho, Paula Brito, Vera Afreixo

TL;DR
This study analyzes reverse complementary genomic word pairs in human DNA by comparing their distance distributions and frequencies, revealing asymmetries that extend beyond Chargaff's rules.
Contribution
It introduces a novel comparison of reverse complementary words based on distance distribution dissimilarity measures, especially peak dissimilarity, and explores their frequency relationships.
Findings
Some reverse complementary pairs have highly dissimilar distance distributions.
Other pairs show similar distributions despite irregularities and peaks.
Distribution dissimilarity correlates with frequency discrepancies.
Abstract
In this work we study reverse complementary genomic word pairs in the human DNA, by comparing both the distance distribution and the frequency of a word to those of its reverse complement. Several measures of dissimilarity between distance distributions are considered, and it is found that the peak dissimilarity works best in this setting. We report the existence of reverse complementary word pairs with very dissimilar distance distributions, as well as word pairs with very similar distance distributions even when both distributions are irregular and contain strong peaks. The association between distribution dissimilarity and frequency discrepancy is explored also, and it is speculated that symmetric pairs combining low and high values of each measure may uncover features of interest. Taken together, our results suggest that some asymmetries in the human genome go far beyond Chargaff's…
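The comparison described in the abstract can be illustrated with a minimal sketch. It assumes that the distance distribution of a word is the relative frequency of the gaps between the start positions of consecutive occurrences, counting overlapping occurrences. The peak dissimilarity is not defined in this summary, so the sketch substitutes the total variation distance as a stand-in; the frequency discrepancy used here (a normalized absolute difference of counts) is likewise an assumption for illustration, and all function names are hypothetical.

```python
from collections import Counter

# Translation table for the DNA complement (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    """Complement each base, then reverse the word."""
    return word.translate(COMPLEMENT)[::-1]


def occurrences(seq, word):
    """Start positions of all occurrences of word in seq, overlaps included."""
    positions = []
    start = seq.find(word)
    while start != -1:
        positions.append(start)
        start = seq.find(word, start + 1)
    return positions


def distance_distribution(seq, word):
    """Relative frequency of gaps between consecutive occurrences (assumed definition)."""
    pos = occurrences(seq, word)
    gaps = Counter(b - a for a, b in zip(pos, pos[1:]))
    total = sum(gaps.values())
    return {d: c / total for d, c in gaps.items()} if total else {}


def total_variation(p, q):
    """Total variation distance; a stand-in, NOT the paper's peak dissimilarity."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(d, 0.0) - q.get(d, 0.0)) for d in support)


def compare_pair(seq, word):
    """Compare a word with its reverse complement on counts and gap distributions."""
    rc = reverse_complement(word)
    n_w = len(occurrences(seq, word))
    n_rc = len(occurrences(seq, rc))
    # Hypothetical frequency discrepancy: normalized absolute count difference.
    freq_disc = abs(n_w - n_rc) / (n_w + n_rc) if (n_w + n_rc) else 0.0
    dissim = total_variation(
        distance_distribution(seq, word), distance_distribution(seq, rc)
    )
    return {
        "word": word,
        "reverse_complement": rc,
        "frequency_discrepancy": freq_disc,
        "distribution_dissimilarity": dissim,
    }


if __name__ == "__main__":
    toy_sequence = "ACGACGTTACGTCGTACG"  # toy input, not real genomic data
    print(compare_pair(toy_sequence, "ACG"))
```

On a real chromosome one would run `compare_pair` over every word of a fixed length and plot the two measures against each other, which is the kind of joint view of dissimilarity and frequency discrepancy the abstract refers to.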
