Gene Similarity-based Approaches for Determining Core-Genes of Chloroplasts
Bassam AlKindy, Christophe Guyeux, Jean-François Couchot, Michel Salomon, Jacques M. Bahi

TL;DR
This paper improves methods for identifying core chloroplast genes by combining sequence and gene feature similarities from NCBI and DOGMA tools, enhancing accuracy over previous approaches.
Contribution
It introduces an improved similarity comparison method that relaxes gene name constraints and adds sequence validation, leading to higher quality core gene identification.
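To make the idea of relaxed name matching followed by sequence validation concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function names (`normalize`, `similar`, `core_genes`), the 0.8 threshold, the toy sequences, and the use of `difflib.SequenceMatcher` as a stand-in for a proper alignment score are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalize(name):
    """Relax name matching: ignore case and any non-alphanumeric characters."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def similar(seq_a, seq_b, threshold=0.8):
    """Stand-in similarity test (difflib ratio, NOT a real alignment score)."""
    return SequenceMatcher(None, seq_a, seq_b).ratio() >= threshold


def core_genes(genomes, threshold=0.8):
    """Return normalized gene names found in every genome whose sequences
    all pass the similarity test against the first genome's copy.

    genomes: dict mapping genome id -> dict mapping gene name -> sequence.
    """
    normalized = {
        gid: {normalize(name): seq for name, seq in genes.items()}
        for gid, genes in genomes.items()
    }
    if not normalized:
        return set()
    # Candidate core: names shared by all genomes once spelling is relaxed.
    shared = set.intersection(*(set(g) for g in normalized.values()))
    reference = next(iter(normalized.values()))
    # Sequence validation: drop candidates whose sequences disagree.
    return {
        name
        for name in shared
        if all(similar(reference[name], g[name], threshold)
               for g in normalized.values())
    }


# Hypothetical toy data: three genomes with inconsistent gene spellings.
genomes = {
    "A": {"rbcL": "ATGTCACCACAAACAGAGACT", "psbA": "ATGACTGCAATTTTAGAGAGA"},
    "B": {"RbcL": "ATGTCACCACAAACAGAGACC", "psb-A": "ATGACTGCAATTTTAGAGAGA"},
    "C": {"rbcl": "ATGTCACCACAAACAGAGACT", "PSBA": "GGGGGGGGGGCCCCCCCCCCC"},
}
print(core_genes(genomes))
```

In this toy input, both names survive the relaxed name comparison, but only `rbcl` passes the sequence check, since the `PSBA` entry in genome C carries an unrelated sequence. A real pipeline would presumably replace `similar` with an alignment-based score.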
Findings
Enhanced core gene identification accuracy
Better clustering of genes from NCBI and DOGMA
Improved biological relevance of core gene sets
Abstract
In computational biology and bioinformatics, understanding the evolutionary processes at work within related organisms has received a lot of attention over the last decades. However, accurate methodologies are still needed to uncover the evolution of gene content. In a previous work, two novel approaches based on sequence similarities and gene features were proposed. More precisely, we proposed to use gene names, sequence similarities, or both, obtained either from NCBI or from the DOGMA annotation tool. DOGMA has the advantage of being an up-to-date, accurate automatic tool specifically designed for chloroplasts, whereas NCBI offers high-quality human-curated genes (together with wrongly annotated ones). The key idea of the former proposal was to take the best from these two tools. However, the first proposal was limited by name variations and spelling errors on the NCBI side, leading to core…
