A Phylogenetic Trees Analysis of SARS-CoV-2
Chen Shen, Vic Patrangenaru, Roland Moore

TL;DR
This paper explores the topology of phylogenetic tree spaces and applies these concepts to analyze SARS-CoV-2 RNA sequences, using statistical methods on tree data with three and four leaves.
Contribution
It provides an elementary proof of intrinsic mean properties on tree spaces and introduces a novel approach to analyze viral sequences through tree space topology.
Findings
Spaces of trees with a fixed number of leaves are contractible.
Intrinsic means on spiders are sticky.
SARS-CoV-2 RNA sequences are analyzed with nonparametric statistics for intrinsic means on tree spaces with three and four leaves.
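To make the stickiness finding concrete, here is a minimal sketch of the intrinsic (Fréchet) sample mean on a spider. It is an illustration under stated assumptions, not the authors' code: a point is encoded as a hypothetical pair `(leg, r)` with `r >= 0` the distance from the center, the distance between points on different legs is the sum of their radii, and the function name `spider_frechet_mean` is invented here. Along leg j the Fréchet function is minimized at max(m_j, 0), where m_j = (sum of radii on leg j minus sum of radii on all other legs) / n, and at most one m_j can be positive.

```python
from collections import defaultdict


def spider_frechet_mean(points):
    """Intrinsic (Frechet) sample mean on a spider.

    points: iterable of (leg, r) pairs, r >= 0 being the distance from
    the center along the given leg (hypothetical encoding for this sketch).
    Returns (leg, r) if the mean lies on an open leg, or (None, 0.0) if
    the mean is the center of the spider.
    """
    points = list(points)
    n = len(points)
    if n == 0:
        raise ValueError("need at least one sample point")

    leg_sums = defaultdict(float)
    total = 0.0
    for leg, r in points:
        if r < 0:
            raise ValueError("radii must be non-negative")
        leg_sums[leg] += r
        total += r

    # m_j = (S_j - (total - S_j)) / n; the mean sits on leg j iff m_j > 0.
    for leg, s in leg_sums.items():
        m = (2.0 * s - total) / n
        if m > 0:
            return (leg, m)
    return (None, 0.0)


# A balanced sample has its mean at the center ...
balanced = [(0, 1.0), (1, 1.0), (2, 1.0)]
# ... and a small perturbation of one point does not move it (stickiness).
perturbed = [(0, 1.2), (1, 1.0), (2, 1.0)]
# A sample dominated by one leg pulls the mean onto that leg.
dominated = [(0, 3.0), (1, 1.0), (2, 1.0)]
```

In this sketch both `balanced` and `perturbed` return the center, which is the sticky behavior: the mean stays at the center under small changes of the data, unlike a Euclidean mean.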
Abstract
One regards spaces of trees as stratified spaces in order to study distributions of phylogenetic trees. Stratified spaces may have cycles; however, spaces of trees with a fixed number of leaves are contractible. Spaces of trees with three leaves, in particular, are spiders with three legs. One gives an elementary proof of the stickiness of intrinsic sample means on spiders. One also represents four-leaf tree data in terms of an associated Petersen graph. One applies these ideas to analyze RNA sequences of SARS-CoV-2 from multiple sources, by building samples of trees and running nonparametric statistics for intrinsic means on tree spaces with three and four leaves. SARS-CoV-2 sequences are also used to build trees whose leaves additionally include other related coronaviruses.
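The Petersen graph mentioned in the abstract can be built explicitly. The sketch below assumes the standard Billera-Holmes-Vogtmann picture for rooted trees with four leaves, which the abstract does not spell out: label the root 0 and the leaves 1 to 4, identify each possible interior edge with a 2-element subset of the five labels (one side of the split it induces), and join two subsets when they are disjoint, that is, when the two interior edges can occur in the same tree. The function name `petersen_graph` and the labeling are hypothetical.

```python
from itertools import combinations


def petersen_graph(labels=(0, 1, 2, 3, 4)):
    """Petersen graph as the Kneser graph K(5, 2).

    Vertices: 2-element subsets of the five labels (assumed here to be
    the root 0 plus leaves 1..4), one per possible interior edge.
    Edges: pairs of disjoint subsets, i.e. compatible interior edges;
    each edge then corresponds to one binary tree topology.
    """
    vertices = [frozenset(pair) for pair in combinations(labels, 2)]
    edges = [(u, v) for u, v in combinations(vertices, 2) if not (u & v)]
    return vertices, edges


vertices, edges = petersen_graph()

# Degree of each vertex: number of interior edges compatible with it.
degree = {v: 0 for v in vertices}
for u, v in edges:
    degree[u] += 1
    degree[v] += 1
```

Under these assumptions the graph has 10 vertices and 15 edges and is 3-regular, matching the 15 binary topologies of rooted four-leaf trees. The tree space itself is the cone over this graph, with one quadrant per edge.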
Taxonomy
Topics: Zoonotic diseases and public health · Bioinformatics and Genomic Networks · COVID-19 epidemiological studies
