Integrating Fréchet distance and AI reveals the evolutionary trajectory and origin of SARS-CoV-2
Anyou Wang

TL;DR
This study introduces a novel alignment-free method combining Fréchet distance and AI to trace SARS-CoV-2's evolution and origin, revealing key mutational features and pathways from various animals to humans.
Contribution
It develops a new algorithm for assessing genome evolution without alignment, integrating it with neural networks to elucidate SARS-CoV-2's evolutionary trajectory and origin.
Findings
SARS-CoV-2 evolution shortens its genome to increase infectivity.
Mutating specific features like TTA and GCT boosts infectious potential.
Origin traced from mink to humans through multiple intermediate hosts.
Abstract
A genome, composed of a precisely ordered sequence of four nucleotides (A, T, C, G), encompasses a multitude of specific genome features, such as the AAA motif. Mutations occurring within a genome disrupt the sequential order and composition of these features, thereby influencing evolutionary trajectories and yielding variants. The evolutionary relatedness between a variant and its ancestor can be estimated by assessing evolutionary distances across a spectrum of genome features. This study develops a novel, alignment-free algorithm that considers both the sequential order and composition of genome features, enabling computation of the Fréchet distance (Fr) across multiple genome features to quantify the evolutionary status of a variant. Integrating this algorithm with an artificial recurrent neural network (RNN) reveals the quantitative evolutionary trajectory and origin of SARS-CoV-2, a…
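To make the idea of an alignment-free Fréchet comparison concrete, here is a minimal sketch in Python. It is not the paper's algorithm, whose exact feature construction is not given in this abstract. The sketch assumes that a genome feature (for example the AAA motif) can be represented as a cumulative-count curve along the sequence, and it compares two such curves with the standard discrete Fréchet distance. The function names `motif_curve` and `discrete_frechet` and the toy sequences are hypothetical, chosen only for illustration. The RNN step is not covered.

```python
def motif_curve(seq, motif):
    """Cumulative count of `motif` at each window start.

    Assumption: this curve is one plausible way to encode both the order
    and the composition of a feature; the paper may encode it differently.
    """
    k = len(motif)
    curve, count = [], 0
    for i in range(len(seq) - k + 1):
        if seq[i:i + k] == motif:
            count += 1
        curve.append(count)
    return curve


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two 1-D curves (dynamic programming)."""
    n, m = len(p), len(q)
    if n == 0 or m == 0:
        raise ValueError("both curves must be non-empty")
    # ca[i][j] = coupling distance for the prefixes p[:i+1] and q[:j+1]
    ca = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = abs(p[i] - q[j])
            if i == 0 and j == 0:
                ca[i][j] = d
            elif i == 0:
                ca[i][j] = max(ca[0][j - 1], d)
            elif j == 0:
                ca[i][j] = max(ca[i - 1][0], d)
            else:
                best_prev = min(ca[i - 1][j], ca[i - 1][j - 1], ca[i][j - 1])
                ca[i][j] = max(best_prev, d)
    return ca[n - 1][m - 1]


# Toy sequences (not real genomes), compared on a single feature.
ancestor = "ATAAAGCTAAATTA"
variant = "ATAAGGCTAAATA"
fr = discrete_frechet(motif_curve(ancestor, "AAA"), motif_curve(variant, "AAA"))
print(fr)
```

Because the two curves may have different lengths, no alignment of the sequences is needed. Repeating the comparison over many motifs would give one distance per feature, which is the kind of multi-feature profile the abstract describes.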
Taxonomy
Topics: Genomics and Phylogenetic Studies · Machine Learning in Bioinformatics · Vaccines and Immunoinformatics Approaches
Methods: Attention Is All You Need · Linear Layer · Gated Channel Transformation · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Softmax · Residual Connection · Convolution · Adam · Multi-Head Attention
