Eight-cluster structure of chloroplast genomes differs from similar one observed for bacteria
Michael Sadovsky, Maria Senashova, Andrew Malyshev

TL;DR
This study finds an eight-cluster pattern in chloroplast genomes that differs from the seven-cluster pattern previously reported as universal for bacterial genomes; the eighth cluster is associated with tRNA genes and high GC content.
Contribution
It demonstrates that chloroplast genomes exhibit a distinct eight-cluster structure, contrasting with bacterial patterns, using triplet frequency clustering and elastic map techniques.
Findings
Eight-cluster structure identified in chloroplast genomes.
Eighth cluster associated with tRNA genes and high GC content.
Chloroplast pattern differs from bacterial and cyanobacterial patterns.
Abstract
A seven-cluster pattern, claimed to be universal in bacterial genomes, has been reported previously. Keeping in mind the most popular theory of chloroplast origin, we checked whether a similar pattern is observed in chloroplast genomes. Surprisingly, an eight-cluster structure has been found for chloroplasts. The pattern observed for chloroplasts differs rather significantly from the bacterial one, and from that observed for cyanobacteria. The structure is revealed by clustering fragments of equal length isolated within a genome, where each fragment is converted into a triplet frequency dictionary of non-overlapping triplets that tile the reading frame with no gaps. The points in 63-dimensional space were clustered by the elastic map technique. The eighth cluster found in chloroplasts comprises the fragments of a genome bearing tRNA genes and exhibiting excessively high…
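The fragment-to-vector step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the `length` and `step` parameters, and the choice to drop the last triplet (TTT) are assumptions made for illustration. Because the 64 triplet frequencies of a fragment sum to one, one coordinate is redundant, which is consistent with the 63-dimensional space mentioned above; the summary does not say which coordinate the authors remove. The elastic map clustering itself is not shown.

```python
from collections import Counter
from itertools import product

# All 64 triplets over the alphabet A, C, G, T, in lexicographic order.
TRIPLETS = ["".join(p) for p in product("ACGT", repeat=3)]


def triplet_frequencies(fragment):
    """Frequencies of the 64 triplets, read from position 0 of the
    fragment as non-overlapping triplets with no gaps between them."""
    counts = Counter(
        fragment[i:i + 3] for i in range(0, len(fragment) - 2, 3)
    )
    # Triplets containing other symbols (e.g. N) are ignored.
    total = sum(counts[t] for t in TRIPLETS)
    if total == 0:
        return [0.0] * len(TRIPLETS)
    return [counts[t] / total for t in TRIPLETS]


def fragment_vectors(genome, length, step):
    """Cut a genome into equal-length fragments and convert each into a
    63-dimensional point.

    `length` and `step` are hypothetical parameters; dropping the last
    coordinate (TTT) is an assumption of this sketch.
    """
    genome = genome.upper()
    vectors = []
    for start in range(0, len(genome) - length + 1, step):
        freqs = triplet_frequencies(genome[start:start + length])
        vectors.append(freqs[:-1])  # frequencies sum to 1, so one is redundant
    return vectors
```

The resulting list of points could then be passed to any clustering or manifold-fitting routine; the study itself uses the elastic map technique for that step.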
Taxonomy
Topics: Machine Learning in Bioinformatics · Genomics and Phylogenetic Studies · RNA and protein synthesis mechanisms
