Computational prediction of replication sites in DNA sequences using complex number representation
Shubham Kundal, Raunak Lohiya, Hritik Bansal, Shreya Johri, Varuni Sarwal, Kushal Shah

TL;DR
This paper introduces a novel complex correlation method (iCorr) that enhances the prediction of DNA replication origins by converting sequences into complex numbers, providing higher resolution and eliminating the need for manual graph inspection.
Contribution
The paper extends auto-correlation to a complex number-based approach, improving resolution and automation in predicting replication sites in DNA sequences.
Findings
iCorr outperforms traditional auto-correlation in resolution.
The method accurately identifies replication origins without manual graph analysis.
Applicable to both prokaryotic and eukaryotic genomes.
Abstract
Computational prediction of the origin of replication (ORI) has been of great interest in bioinformatics, and several methods, including GC-skew and auto-correlation, have been explored in the past. In this paper, we have extended the auto-correlation method to predict ORI location with much higher resolution for prokaryotes and eukaryotes, which can be very helpful in experimental validation of the computational predictions. The proposed complex correlation method (iCorr) converts the genome sequence into a sequence of complex numbers by mapping the nucleotides to {+1,-1,+i,-i} instead of the {+1,-1} used in the auto-correlation method (here, i is the square root of -1). Thus, the iCorr method exploits the complete spatial information about the positions of all four nucleotides, unlike the earlier auto-correlation method, which uses the positional information of only one nucleotide. Also, the…
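To make the mapping in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation. The abstract gives only the value set {+1,-1,+i,-i}, so the particular nucleotide-to-value assignment in `ASSUMED_MAP` is an assumption, and the names `to_complex` and `lag_correlation` are hypothetical. The correlation is a generic lag-k complex correlation (the mean of z[j] times the conjugate of z[j+k]), used as a stand-in because the abstract does not state the exact iCorr formula.

```python
# Assumed assignment: the abstract specifies only the set {+1, -1, +i, -i},
# not which nucleotide receives which value.
ASSUMED_MAP = {"A": 1 + 0j, "T": -1 + 0j, "G": 1j, "C": -1j}


def to_complex(seq, mapping=ASSUMED_MAP):
    """Convert a DNA string into a list of complex numbers (hypothetical helper)."""
    return [mapping[base] for base in seq.upper()]


def lag_correlation(values, lag):
    """Generic lag-k complex correlation: mean of z[j] * conj(z[j + lag]).

    This is a textbook definition used for illustration only; it is not
    claimed to be the iCorr measure defined in the paper.
    """
    n = len(values) - lag
    if lag < 0 or n <= 0:
        raise ValueError("lag must be non-negative and shorter than the sequence")
    total = sum(values[j] * values[j + lag].conjugate() for j in range(n))
    return total / n


if __name__ == "__main__":
    z = to_complex("ATGCATGC")
    for k in range(1, 4):
        print(k, lag_correlation(z, k))
```

Because every mapped value has unit modulus, the lag-0 correlation is always 1, and all four nucleotides contribute positional information to every lag, which is the property the abstract contrasts with the {+1,-1} mapping.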
Taxonomy
Topics: RNA and protein synthesis mechanisms · Fractal and DNA sequence analysis · Genomics and Phylogenetic Studies
