Identification of repeats in DNA sequences using nucleotide distribution uniformity
Changchuan Yin

TL;DR
This paper introduces an ab initio method that uses nucleotide distribution uniformity to accurately identify and analyze repetitive elements and their periodicities in DNA sequences, enhancing understanding of genomic structures.
Contribution
The paper presents a novel linear-complexity approach based on nucleotide distribution uniformity for detecting and characterizing repetitive DNA elements without prior knowledge.
Findings
Effective identification of repeat patterns in various genomes
High accuracy in detecting periodicities and consensus patterns
Method complexity is linear with sequence length
Abstract
Repetitive elements are important in genomic structure, function and regulation, yet effective methods for precisely identifying repetitive elements in DNA sequences are not fully accessible, and the relationship between repetitive elements and the periodicities of genomes is not clearly understood. We present an ab initio method to quantitatively detect repetitive elements and infer their consensus repeat pattern. The method measures the distribution uniformity of nucleotides at periodic positions in DNA sequences or genomes. It can identify periodicities, consensus repeat patterns, copy numbers and perfect levels of repetitive elements. Applying the method to different DNA sequences and genomes demonstrates its efficacy and accuracy in identifying repeat patterns and periodicities. The complexity of the method is linear with respect to…
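The idea of measuring how unevenly nucleotides fall on periodic positions can be illustrated with a minimal sketch. The abstract does not give the paper's formula, so the variance-based score below is an assumption chosen for illustration, and the function names `phase_counts`, `nonuniformity` and `consensus` are hypothetical. Each call makes one pass over the sequence, so the cost per candidate period grows linearly with sequence length.

```python
def phase_counts(seq, period):
    """Count each nucleotide at each position modulo `period` (single pass)."""
    counts = {base: [0] * period for base in "ACGT"}
    for i, base in enumerate(seq.upper()):
        if base in counts:  # skip N and other ambiguity codes
            counts[base][i % period] += 1
    return counts


def nonuniformity(seq, period):
    """Illustrative score (an assumption, not the paper's measure):
    the sum over nucleotides of the variance of their counts across phases.
    A score of 0 means every nucleotide is spread evenly over the phases."""
    total = 0.0
    for per_phase in phase_counts(seq, period).values():
        mean = sum(per_phase) / period
        total += sum((c - mean) ** 2 for c in per_phase) / period
    return total


def consensus(seq, period):
    """Most frequent nucleotide at each phase, as a rough consensus repeat unit."""
    counts = phase_counts(seq, period)
    return "".join(
        max("ACGT", key=lambda b: counts[b][k]) for k in range(period)
    )


if __name__ == "__main__":
    seq = "ACG" * 20  # toy sequence: a perfect period-3 repeat
    for p in range(2, 7):
        print(p, round(nonuniformity(seq, p), 2))
    print(consensus(seq, 3))
```

On the toy sequence the score is zero at periods 2, 4 and 5 and largest at period 3. Period 6 also scores above zero, because multiples of the true period share the same uneven distribution, so a real detector would need a way to separate a period from its multiples.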
Taxonomy
Topics: RNA and protein synthesis mechanisms · Fractal and DNA sequence analysis · Genomics and Phylogenetic Studies
