GRAMEP: an alignment-free method based on the Maximum Entropy Principle for identifying SNPs
Matheus Henrique Pimenta-Zanon, André Yoshiaki Kashiwabara, André Luís Laforga Vanzela, Fabricio Martins Lopes

TL;DR
GRAMEP is an alignment-free, maximum entropy-based method for accurate SNP detection and classification in genomic sequences, offering high efficiency and lower computational costs compared to traditional alignment-based approaches.
Contribution
This paper introduces GRAMEP, a novel alignment-free approach utilizing maximum entropy to identify SNPs and classify sequences without organism-specific data.
Findings
High accuracy in viral genome analysis
Effective SNP detection without sequence alignment
Lower computational cost compared to traditional methods
Abstract
Background: Advances in high-throughput sequencing technologies provide a huge number of genomes to be analyzed. Computational methods therefore play a crucial role in analyzing and extracting knowledge from the data generated. Investigating genomic mutations is critical because of their impact on chromosomal evolution, genetic disorders, and diseases. Genomic variations are commonly analyzed by aligning sequences. However, this approach can be computationally expensive and restrictive in scenarios with large datasets. Results: We present a novel method for identifying single nucleotide polymorphisms (SNPs) in DNA sequences from assembled genomes. This study proposes GRAMEP, an alignment-free approach that adopts the principle of maximum entropy to discover the most informative k-mers specific to a genome or set of sequences under investigation. The informative k-mers enable the…
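The abstract does not say how the maximum entropy principle is applied to k-mers, so the following is only an illustrative sketch under stated assumptions, not the GRAMEP implementation. It counts overlapping k-mers without any alignment and then picks a frequency cutoff with a Kapur-style maximum-entropy threshold over the histogram of k-mer frequencies. The function names (`count_kmers`, `max_entropy_threshold`, `informative_kmers`) and the choice to keep k-mers above the cutoff are hypothetical.

```python
from collections import Counter
import math


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping k-mer in a DNA sequence (no alignment needed)."""
    sequence = sequence.upper()
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def max_entropy_threshold(counts: Counter) -> int:
    """Kapur-style threshold over the histogram of k-mer frequencies.

    Assumption: returns the frequency t that maximizes the sum of the
    Shannon entropies of the histogram parts with frequency <= t and > t.
    """
    # hist[f] = number of distinct k-mers that occur exactly f times
    hist = Counter(counts.values())
    freqs = sorted(hist)

    def entropy(values):
        mass = sum(hist[f] for f in values)
        if mass == 0:
            return 0.0
        return -sum((hist[f] / mass) * math.log(hist[f] / mass) for f in values)

    best_t, best_h = freqs[0], -1.0
    # Try each split point that leaves both sides of the histogram non-empty.
    for idx in range(len(freqs) - 1):
        h = entropy(freqs[:idx + 1]) + entropy(freqs[idx + 1:])
        if h > best_h:
            best_t, best_h = freqs[idx], h
    return best_t


def informative_kmers(sequence: str, k: int) -> set:
    """Hypothetical selection rule: keep k-mers more frequent than the threshold."""
    counts = count_kmers(sequence, k)
    threshold = max_entropy_threshold(counts)
    return {kmer for kmer, c in counts.items() if c > threshold}


if __name__ == "__main__":
    print(informative_kmers("ACGTACGT", 4))
```

In practice a set of sequences would be counted together, and the selected k-mers from a variant would be compared against those of a reference; that comparison step is not shown because the truncated abstract does not describe it.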
Taxonomy
Topics: Machine Learning in Bioinformatics · RNA and protein synthesis mechanisms · Genomics and Phylogenetic Studies
Methods: Sparse Evolutionary Training
