On a chain of fragmentation equations for duplication-mutation dynamics in DNA sequences
M.V. Koroteev

TL;DR
This paper develops a hierarchy of equations to model duplication-mutation processes in DNA, explaining the power-law length distributions of duplicated sequences observed in natural DNA and validating these models with simulations.
Contribution
It introduces a hierarchy of equations for the distributions of exact matches in DNA, reduces this hierarchy to a single equation for pairs of exact repeats, and shows that solutions of that equation agree with simulations.
Findings
Power-law tail in DNA duplication length distributions
Hierarchical equations accurately model duplication-mutation dynamics
Reduced equations effectively describe pairwise exact matches
Abstract
Recent studies have revealed that, for the majority of species, the length distributions of duplicated sequences in natural DNA have a power-law tail. We study duplication-mutation models of processes in natural DNA sequences and the length distributions of exact matches computed from both synthetic and natural sequences. We present a hierarchy of equations for various numbers of exact matches in these models and reduce it to a single equation for pairs of exact repeats. Solutions of this equation agree quantitatively with simulations.
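To make the fragmentation picture concrete, the following is a minimal Python sketch of how point mutations break a duplicated segment into shorter exact matches. It is an illustrative toy, not the paper's model or equations: the function names (`match_lengths`, `simulate`), the uniform choice of duplication length, and the independent per-site mutation probability are all assumptions made for the example.

```python
import random
from collections import Counter


def match_lengths(length, mutated_positions):
    """Return lengths of maximal exact-match runs that remain in a
    duplicated segment of `length` sites after substitutions at
    `mutated_positions` (0-based indices). Hypothetical helper."""
    mutated = set(mutated_positions)
    runs, current = [], 0
    for i in range(length):
        if i in mutated:
            # A mutation ends the current run of matching sites.
            if current:
                runs.append(current)
            current = 0
        else:
            current += 1
    if current:
        runs.append(current)
    return runs


def simulate(n_duplications=1000, max_len=200, mutation_prob=0.02, seed=0):
    """Toy simulation (assumed setup): draw duplication lengths uniformly
    from 1..max_len, mutate each site independently with `mutation_prob`,
    and tally the lengths of the surviving exact matches."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_duplications):
        length = rng.randint(1, max_len)
        hits = [i for i in range(length) if rng.random() < mutation_prob]
        counts.update(match_lengths(length, hits))
    return counts
```

For instance, `match_lengths(10, [3, 7])` splits a 10-site match into runs of 3, 3, and 2 sites. The `Counter` returned by `simulate` is an empirical match-length histogram for this toy setup only; it is not expected to reproduce the power-law tail reported for natural DNA.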
Taxonomy
Topics: Bacteriophages and microbial interactions · Evolution and Genetic Dynamics · RNA and protein synthesis mechanisms
