The arithmetic topology of genetic alignments
Christopher Barrett, Andrei Bura, Qijun He, Fenix Huang, Christian Reidys

TL;DR
This paper introduces a new mathematical framework using weighted simplicial complexes to analyze genetic variation in sequence alignments, revealing novel insights into viral dynamics.
Contribution
It extends pairwise genetic relations to k-ary dissimilarities and develops a topological approach to capture complex biological interactions.
Findings
Captures new layers of viral genetic dynamics.
Applies to SARS-CoV-2 and H1N1 genomic data.
Provides a mathematical foundation for biological interpretation.
Abstract
We propose a novel mathematical paradigm for the study of genetic variation in sequence alignments. This framework originates from extending the notion of pairwise relations, upon which current analysis is based, to k-ary dissimilarity. This dissimilarity naturally leads to a generalization of simplicial complexes by endowing simplices with weights, compatible with the boundary operator. We introduce the notions of k-stances and the dissimilarity complex, the former encapsulating arithmetic as well as topological structure expressing these k-ary relations. We study basic mathematical properties of dissimilarity complexes and show how this approach captures an entirely new layer of biologically relevant viral dynamics in the context of SARS-CoV-2 and H1N1 flu genomic data.
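To make the step from pairwise to k-ary relations concrete, here is a minimal sketch in Python. It is not the paper's definition: it assumes, purely for illustration, that the dissimilarity of a set of aligned sequences is the number of alignment columns in which they do not all agree, and it treats "compatible with the boundary operator" as the weaker condition that a face never outweighs a simplex containing it. The names `dissimilarity`, `weighted_simplices`, and `is_monotone` and the toy alignment are hypothetical.

```python
from itertools import combinations


def dissimilarity(rows):
    """Count alignment columns in which the given rows do not all agree.

    Illustrative assumption only; the paper's k-ary dissimilarity may differ.
    """
    return sum(1 for column in zip(*rows) if len(set(column)) > 1)


def weighted_simplices(alignment, max_size):
    """Map each index subset of size 1..max_size to its dissimilarity."""
    weights = {}
    for size in range(1, max_size + 1):
        for subset in combinations(range(len(alignment)), size):
            weights[subset] = dissimilarity([alignment[i] for i in subset])
    return weights


def is_monotone(weights):
    """Check that no face has a larger weight than a simplex containing it."""
    for simplex, weight in weights.items():
        for size in range(1, len(simplex)):
            for face in combinations(simplex, size):
                if weights[face] > weight:
                    return False
    return True


# Toy alignment of three sequences (hypothetical data).
alignment = ["ACGT", "ACGA", "TCGA"]
weights = weighted_simplices(alignment, 3)

for simplex in sorted(weights, key=lambda s: (len(s), s)):
    print(simplex, weights[simplex])
print("monotone:", is_monotone(weights))
```

Under this assumed definition, the weight on a triple of sequences is computed directly from the alignment columns rather than derived from the three pairwise values, which is the kind of higher-order information a purely pairwise analysis does not record.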
Taxonomy
Topics: Fractal and DNA sequence analysis · Machine Learning in Bioinformatics · Bioinformatics and Genomic Networks
